Are you still breaking your head to know what DB clustering is? Here is the article, which will help you to protect your head :-).
When I started working on database clustering, there were lots of questions in my mind like – “Do I need to have multiple databases?”, “How and who will take care of data synchronization between different databases?”, “How will the client communicate with the DB server in a cluster?”… etc. etc. I found scarcity of information and lots of ambiguous information on it. So here I’m making an attempt to present the abstract information, which will be useful to understand database clustering. I’ll be also presenting some of the critical issues, limitations and points to be noted to implement SQL Server DB clustering. This article will not provide any information regarding how to configure the cluster.
What is database clustering?
Not only performance and scalability are important for any application but also higher availability of the application. Especially applications like banking, trading, health care etc. where financial transactions and availability of information at right time are extremely important, the availability gets higher priority than performance and scalability. “Database clustering is the way of achieving higher availability of database”. Remember, DB clustering is a failover technology and not scale-out technology. It does not focus on performance or distributing the traffic to different servers. It merely focuses on moving the execution environment and services to another server in a cluster in case of server failure and thus ensures higher availability of database. DB clustering currently is supported only by SQL Server 2000 Enterprise edition.
- Cluster: Cluster is a group of computers configured to work together to serve clients in a similar fashion.
- Nodes: Each server participating in a cluster is called as node.
- Max nodes in a cluster: In Windows Server 2003, the maximum number of nodes can be 4 (with at the most 16 SQL instances) whereas in case of Windows NT, maximum number of nodes can be 2 in a cluster.
- Heartbeats: The nodes in a cluster remain in constant communication through the exchange of periodic messages, called heartbeats.
- Same OS on Servers: A cluster cannot be made of nodes running Windows Server 2003 Enterprise Edition and Windows Server 2003 Datacenter Edition. The OS installed on all the nodes must be same.
- Virtual IP (VIP): The client system communicates with the DB server using virtual IP. MSCS takes care of redirecting the client request to the active server and hence the client doesn’t have to worry about which server in a cluster is active.
Hardware / Software Requirements:
- At least two servers from Windows Server family. Servers can be Windows NT Server (with min. of SP-5), Windows 2000 Server, Windows 2000 Advanced Server or Windows Server 2003 (Enterprise or Datacenter edition).
- Two Network Adapters per server (one to communicate with the client system and another to communicate with the other servers in the cluster).
- Shared Disk Array/SAN/SCSI device (Small Computer System Interface, SCSI pronounced as "Scuzzy").
- MSCS (Microsoft Cluster Service) comes with the OS.
- SQL Server 2000 Enterprise Edition.
How clustering is achieved:
With the launch of MSCS, it is pretty easy to configure a DB cluster. Note that MSCS service is not specifically for DB clustering, but it is potentially used for DB clustering. MSCS provides three models:
- Single node server cluster (used to organize resources on a server for administrative conveyance).
- Single quorum (it is mostly used for DB clustering): This maintains the cluster configuration data on a single cluster storage device connected to all nodes. For n-node clusters, the cluster is active until the last node in a cluster is working. Data is stored on a single cluster storage device (SCSI etc.). So data synchronization is not required.
- Majority node set: Each node maintains its own copy of the cluster configuration data. More than half the nodes in a cluster must be running to keep the cluster working. This configuration is useful if you need to host applications that can failover, but where there is some other, application-specific way, to replicate or mirror data between nodes.
Note that in Single Quorum model, there will be only one copy of the database stored on a special kind of hardware and hence issues like data synchronization will never come. I prefer “Single Quorum” model for DB clustering.
The detailed description of each of the models is out of the scope for this article.
Diagrams are my best friends :-). Below is the diagrammatic representation of the two nodes cluster. The idea is to host the DB execution environment on different servers with the same hardware configuration and to keep the data on a separate special kind of hardware.
Note the below points regarding the diagram:
- Here, DB Server-1 and DB Server-2 host the DB execution environments after SQL Server installation.
- DB Server-1 and DB Server-2 should have same hardware configurations.
- Shared Device is a special kind of hardware (either SCSI device or Shared Array or SAN device) which is used to hold the data.
The Shared device is manufactured and tuned for faster service of data to the DB servers and provides almost 99.99 % availability.
- In the above diagram, the servers are configured in Active/Passive mode. This can also be configured in Active/Active mode. This point will be discussed further down the document.
- MSCS (Microsoft Cluster Service) was originally introduced in Windows® NT 4™.
- Client system communicates with the server using Virtual IP (VIP). VIP is the common IP configured for all the nodes in a cluster, so MSCS takes care of redirecting the client requests to the active server and clients don’t have to worry about which DB server is serving their request.
- There is a special DB installation procedure to be followed when you plan to go for SQL DB clustering. I recommend installing the OS on all the servers, configuring the adapters and Single Quorum cluster, and then installing SQL Server 2000. This way SQL Server automatically detects the clustering environment.
Let's see what are Active/Passive and Active/Active modes.
Clusters can be configured in two modes as below:
Active/Passive mode: In this mode, only the active server serves the client requests. One instance of SQL Server is installed on both the server systems. In this mode, the second server in the cluster is configured as passive node and becomes active node only in cases where the first server fails. Note that only one SQL Server license is required in this mode.
Active/Active mode: In this mode, both the servers serve the client requests. The MSCS arbitrarily chooses one of the servers to serve the client requests. If any of the active nodes fails, its resources are moved to another active node. To configure this mode, one SQL Server license per server is required.
SQL Server Instances:
You can run multiple instances of SQL Server 2000 on a single computer, which makes cluster configuration more interesting. If your organization has a large number of applications running on a server or your application has more than one database, you can install multiple instances of SQL Server 2000 on a single computer. As I said earlier, in active/passive mode, the active instance of SQL Server is owned by a single server node. A multi-instance server cluster can have at the most 16 instances of SQL Server and is useful in case where even if one instance fails, the other instances serve the application.
Other Useful Links: