Prevent SQL Cluster Failover For Patching/Reboots (Powershell)

At my current shop we have almost 200 windows servers and VMs that are rebooted every 90 days as part of a regular windows patching process.  Each time we execute this process, our DBA team must validate all sql services, databases and clusters after the patching and reboot process is complete.  Our strong preference is to avoid any manual intervention by DBAs while still making sure that the reboot process doesn’t result in a clustered SQL Server instance being hosted by different node in the cluster than prior to the reboot.

To simplify this process we put together a pair of powershell scripts:

1) Pre Process – sets the possible owner of SQL Server network name cluster resources to only have their current hosting node as a possible owner (this will prevent the instance from failing over to another node a waves of reboots happen)

2)Post Process – sets the possible owners of the SQL Server clustered network name resource to all nodes in the cluster (our standard configuration for clustered sql instances).

We have a control table (and stored procedure) that is used to store the various Clustered names in our environment.  The powershell scripts adhere to this basic high level process flow:

  1. Open a result set of all sql cluster names
  2. Loop through the cluster names one at a time
    1. for each cluster name, pull all Network Name resources
      1. Loop through each network name
        1.  For each network name resource either set the possible owners to only the current owner (pre) or all nodes in the cluster (post)

Please note: the code must be run as a domain account that has the permissions to manage the cluster resources (in our case we use dedicated administrative level domain accounts when we run this).

First let’s look at the outer loop that gets a list of clusters:

prescript

The code first imports the FailoverClusters powershell module so that we can call cluster specific commands later in the code.  The remaining lines in the code above set up a new SQL Server connection and fill dataset with a list of clusters.  In the example above, I have a hardcoded, single cluster name but you could have a table of all your clusters and have them returned as a single column resultset.

The code below is the most important part that connects to the cluster, finds the network name resources and sets their possible owners.

prescriptloop

Our approach to prevent failovers of sql servers to other nodes in the clusters is to temporarily set the SQL Server network name resource to only have it’s current cluster node host as it’s only possible owner.  With this configuration in place on the network name resource, rebooting the host will NOT cause the clustered sql instance to fail over to a different node in the cluster.  The code above prints the following information:

1) The name of the cluster

2) List the current owner node for the network name

3) List all possible owner nodes in the cluster for the network name resource

4) Sets the possible owners for the network name to ONLY THE CURRENT cluster node (so after this runs, the network name resource, and all other sql related resources in the cluster group, will NOT fail over in the event of an owner node reboot or failure)

The POST version of the script simply sets the possible owners of the network name resources to ALL nodes in the cluster (our standard configuration).  The code below shows similar logic as the PRE script, but includes a different call to Set-ClusterOwnerNode in the powershell (to set all nodes as possible owners);

postscriptloop

 

Comment or email me if you need copies of the PS1 files for this.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s