About Quorum
In short, quorum is the minimum number of votes required for a majority. The
nodes participating in a Windows cluster are connected through a private
network and communicate over User Datagram Protocol (UDP) port 3343. The quorum
configuration in a failover cluster determines the number of failures (node
failures) that the cluster can sustain while still remaining online. If an
additional failure happens beyond this threshold, the cluster stops running.
Quorum is designed to handle the split-brain scenario. When nodes are unable to
communicate with each other, each node assumes that the resource groups owned
by the other nodes have to be brought online. When the same resource is brought
online on multiple nodes at the same time, data corruption can occur. This
scenario is called split brain.
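The majority rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how the cluster service is implemented; the function names votes_needed and has_quorum are made up for this example.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that is a strict majority of total_votes."""
    return total_votes // 2 + 1


def has_quorum(votes_reachable: int, total_votes: int) -> bool:
    """True if a group that can reach votes_reachable voters holds a majority."""
    return votes_reachable >= votes_needed(total_votes)


# 3/2 split of a 5-vote cluster: only the larger side may keep running.
print(has_quorum(3, 5), has_quorum(2, 5))

# 2/2 split of a 4-vote cluster: neither side reaches 3 votes, so neither
# side brings resources online and split brain is avoided.
print(has_quorum(2, 4), has_quorum(2, 4))
```

Because at most one group can ever hold a strict majority, two isolated groups can never both decide to bring the same resource online.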
Quorum role in cluster
When both nodes of a cluster are up and running and participating in their
respective roles (active and passive), they communicate with each other over
the network. For example, if you change a configuration setting on the active
node, the change is automatically sent to the passive node and applied there as
well. This generally happens very quickly and ensures that both nodes are
synchronized.
But, as you might imagine, it is possible to make a change on the active node
and have the active node fail before the change is sent over the network and
applied on the passive node (which becomes the active node after the failover),
so the change never reaches the passive node. Depending on the nature of the
change, this could cause problems, even causing both nodes of the cluster to
fail.
To prevent this from happening, a SQL Server 2005 cluster uses what is called a
quorum, which is stored on the quorum drive of the shared array. A quorum is
essentially a log file, similar in concept to database logs. Its purpose is to
record any change made on the active node. If a recorded change does not reach
the passive node because the active node has failed and cannot send it over the
network, the passive node can read the quorum file, find out what the change
was, and apply it before it becomes the new active node.
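The log-replay idea described above can be sketched roughly as follows. This is a simplified model, not the actual on-disk format or protocol: the Node class, the quorum_log list, and the setting names are all hypothetical, and normal replication and failover replay are collapsed into one catch_up function for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    config: dict = field(default_factory=dict)
    applied: int = 0  # number of quorum-log entries this node has applied


def change_setting(active: Node, quorum_log: list, key: str, value: str) -> None:
    """Record the change in the shared log, then apply it on the active node."""
    quorum_log.append((key, value))
    active.config[key] = value
    active.applied = len(quorum_log)


def catch_up(node: Node, quorum_log: list) -> None:
    """Apply every logged change this node has not applied yet."""
    for key, value in quorum_log[node.applied:]:
        node.config[key] = value
    node.applied = len(quorum_log)


quorum_log = []                      # stands in for the file on the quorum drive
n1, n2 = Node("N1"), Node("N2")

change_setting(n1, quorum_log, "max_memory", "4096")
catch_up(n2, quorum_log)             # normal case: the change reaches N2

change_setting(n1, quorum_log, "port", "1433")
# N1 fails here, before the second change is sent to N2.

catch_up(n2, quorum_log)             # on failover, N2 replays the missing entry
print(n2.config)
```

The point of the sketch is that the log lives on shared storage, so it survives the failure of the node that wrote it.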
In order for this to work, the quorum file must reside on what is called the
quorum drive. A quorum drive is a logical drive on the shared array devoted to
the function of storing the quorum.
Quorum models
A Windows Server 2008 cluster supports four quorum models:
1 Node Majority
2 Node and Disk Majority
3 Node and File Share Majority
4 No Majority (disk only)
Node Majority: The Node Majority option is recommended for clusters with an odd
number of nodes. With an odd number of nodes, this configuration can handle the
loss of half of the cluster nodes, rounded down. For example, a five-node
cluster can handle the failure of two nodes. Suppose three of the nodes (N1,
N2, N3) can communicate with each other, but the other two (N4 and N5) cannot
reach them. The group formed by the three nodes has quorum (a majority), so the
cluster remains active there, and the cluster service is stopped on the other
two nodes (N4 and N5). The resource groups (SQL Server instances) hosted on
those two nodes go offline and come online on one of the three remaining nodes,
based on the possible owner settings.
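The five-node example can be checked with a small sketch. It assumes one vote per node and a strict-majority rule; the helper names tolerated_node_failures and surviving_partition are invented for this illustration.

```python
from typing import List, Optional, Set


def tolerated_node_failures(node_count: int) -> int:
    """Nodes that can fail while the rest still hold a strict majority."""
    return (node_count - 1) // 2


def surviving_partition(partitions: List[Set[str]]) -> Optional[Set[str]]:
    """Return the partition holding a majority of all nodes, or None."""
    total = sum(len(p) for p in partitions)
    for p in partitions:
        if 2 * len(p) > total:
            return p
    return None


print(tolerated_node_failures(5))    # five nodes tolerate two failures

split = [{"N1", "N2", "N3"}, {"N4", "N5"}]
winner = surviving_partition(split)
print(sorted(winner))                # the three-node group keeps quorum
```

With an even node count the same rule gives a weaker result: a four-node cluster split 2/2 has no surviving partition at all, which is why Node Majority is suggested for odd node counts.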
Node and Disk Majority: This option is recommended for clusters with an even
number of nodes. In this configuration every node gets one vote and the witness
disk (quorum disk) gets one vote, which makes the total number of votes odd.
The witness disk is a small (approximately 1 GB) clustered disk. It is highly
available and can fail over between nodes. It is considered part of the cluster
core resource group. In a four-node cluster, if the nodes are partitioned into
two subsets of two, the subset that owns the witness disk holds three of the
five votes, so it has quorum and the cluster remains online. This means that a
four-node cluster can lose any two voters, whether they are two nodes or one
node and the witness disk.
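The effect of the extra witness vote can be sketched as below. The function partition_has_quorum and its parameters are assumptions made for this example; it simply counts one vote per node plus one for the witness disk.

```python
def partition_has_quorum(nodes_in_partition: int,
                         owns_witness: bool,
                         total_nodes: int) -> bool:
    """Strict majority check with one vote per node plus one witness vote."""
    total_votes = total_nodes + 1
    votes = nodes_in_partition + (1 if owns_witness else 0)
    return 2 * votes > total_votes


# Four-node cluster split 2/2: the side that owns the witness wins.
print(partition_has_quorum(2, True, 4))    # 3 of 5 votes
print(partition_has_quorum(2, False, 4))   # 2 of 5 votes

# Losing one node and the witness still leaves 3 of 5 votes.
print(partition_has_quorum(3, False, 4))
```

Without the witness, the 2/2 split would leave both sides with exactly half of the votes and neither could continue.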
Node and File Share Majority: This configuration is similar to Node and Disk
Majority, but the witness disk is replaced with a file share, also known as the
File Share Witness (FSW) resource. This quorum configuration is usually used in
multi-site clusters (nodes in different physical locations) or where there is
no common storage. The File Share Witness resource is a file share on any
server in the same Active Directory that all the cluster nodes can access. One
of the nodes in the cluster places a lock on the file share to mark itself as
the owner of the file share. When this node goes offline or loses connectivity,
another node grabs the lock and owns the file share. On a standalone server the
file share is not highly available; however, it can also be placed on a
clustered file share on an independent cluster, which makes the FSW clustered
and able to fail over between nodes. It is important that this file share is
not placed on a node of the same cluster, because losing that node would mean
losing two votes. Unlike a witness disk, an FSW does not store cluster
configuration data. It contains only information about which version of the
cluster configuration database is the most recent.
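Why the share should not live on a cluster node can be shown with a small vote count. In this sketch each voter is mapped to the server that hosts it; the server name FS1 and the helper names are hypothetical, and a two-node cluster is assumed.

```python
def votes_after_failure(voter_hosts: dict, failed_server: str) -> int:
    """Count the voters whose hosting server is still up."""
    return sum(1 for host in voter_hosts.values() if host != failed_server)


def survives(voter_hosts: dict, failed_server: str) -> bool:
    """True if the remaining voters still form a strict majority."""
    return 2 * votes_after_failure(voter_hosts, failed_server) > len(voter_hosts)


# Witness share hosted on a separate server (hypothetical name FS1).
external = {"N1": "N1", "N2": "N2", "FSW": "FS1"}
# Witness share hosted on cluster node N1 itself.
internal = {"N1": "N1", "N2": "N2", "FSW": "N1"}

print(survives(external, "N1"))   # 2 of 3 votes remain
print(survives(internal, "N1"))   # only 1 of 3 votes remains
```

In the second layout a single server failure removes both the node vote and the witness vote, so the surviving node cannot reach a majority.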
No Majority (Disk Only): This configuration was available in Windows Server
2003 and has been maintained for compatibility reasons; using it is strongly
discouraged. In this configuration only the witness disk has a vote, and there
are no other voters in the cluster. That means that even if all nodes are
online and able to communicate, the entire cluster goes offline when the
witness disk fails or becomes corrupted. The witness disk is therefore a single
point of failure.
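The single point of failure is easy to see in a sketch. The function disk_only_has_quorum is an invented name, and the model assumes that node votes are ignored entirely and that at least one node must be up to run the cluster at all.

```python
def disk_only_has_quorum(witness_online: bool, nodes_online: int) -> bool:
    """Only the witness disk votes; nodes contribute no votes in this model."""
    return witness_online and nodes_online >= 1


# All four nodes healthy, but the witness disk has failed: cluster goes down.
print(disk_only_has_quorum(False, 4))

# Only one node left, witness still available: cluster stays up.
print(disk_only_has_quorum(True, 1))
```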