
vSphere HA Enforces DRS Anti-Affinity Rules…yay.

UPDATE 11/22/2013: Added a correction (the option is for vSphere HA, not DRS), some notes about the option, and a link to the KB. -alex

…but it did before, right? I mean, I've seen a bunch of documentation (from VMware) and blog posts (from people who work or worked at VMware) that show me how to "…ensure that affinity and anti-affinity rules are strictly applied…" All I have to do is set the vSphere DRS advanced option ForceAffinePoweron to 1.

Yeah, those posts are half correct. The ForceAffinePoweron advanced option only enforces strict affinity rules; it does nothing for anti-affinity rules.

The best (and worst) case scenario I can paint is a two-node guest cluster that uses shared disks; let's say it's a SQL cluster. The vSphere cluster consists of three nodes, with DRS enabled in fully automated mode. An anti-affinity rule is created to keep the SQL cluster nodes separated. vSphere cluster node 1 experiences a failure, causing SQL cluster node 1 to also fail. vSphere HA detects the failure and powers on SQL cluster node 1 on vSphere cluster node 2, where it just so happens that SQL cluster node 2 is also running. This is perfectly acceptable (pre-vSphere 5.5) because vSphere HA does not respect DRS anti-affinity rules, even if the ForceAffinePoweron value is set to 1.

In most cases this isn't a huge issue, since DRS reevaluates the state of the cluster every 5 minutes, and when it notices that a rule is being violated it will recommend a move. With the cluster in fully automated mode the virtual machine would normally migrate using vMotion; however, because this is a SQL shared-disk cluster where the virtual machines have SCSI bus sharing enabled, vMotion is not possible. Oops. Now the passive node must be shut down and powered on again. DRS would then power up the virtual machine on vSphere cluster node 3, obeying the DRS anti-affinity rule.
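If it helps to see that sequence as logic rather than prose, here is a minimal Python sketch of the pre-5.5 scenario. It is a toy model, not VMware code: the VM and host names are made up, and "HA picks the first surviving host" is a simplifying assumption standing in for whatever placement HA actually chooses.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class VM:
    name: str
    host: Optional[str]
    scsi_bus_sharing: bool = False


def ha_restart_ignoring_rules(vm: VM, surviving_hosts: List[str]) -> None:
    # Toy placement (assumption): take the first surviving host and
    # pay no attention to VM-VM anti-affinity rules.
    vm.host = surviving_hosts[0]


def rule_violated(vms: List[VM], anti_affinity: Set[str]) -> bool:
    # The rule is violated when two member VMs end up on the same host.
    hosts = [vm.host for vm in vms if vm.name in anti_affinity and vm.host]
    return len(hosts) != len(set(hosts))


def drs_remediation(vm: VM) -> str:
    # vMotion is not available to VMs with SCSI bus sharing enabled.
    if vm.scsi_bus_sharing:
        return "power off and power on again"
    return "vMotion"


# Hypothetical inventory: two SQL nodes kept apart by an anti-affinity rule.
sql1 = VM("sql-node-1", "esxi-1", scsi_bus_sharing=True)
sql2 = VM("sql-node-2", "esxi-2", scsi_bus_sharing=True)
rule = {"sql-node-1", "sql-node-2"}

# esxi-1 fails, so HA restarts sql-node-1 on a surviving host.
sql1.host = None
ha_restart_ignoring_rules(sql1, ["esxi-2", "esxi-3"])

violated = rule_violated([sql1, sql2], rule)
fix = drs_remediation(sql1)
print(sql1.host, violated, fix)
```

In this model both SQL nodes land on the same host, the rule is flagged as violated, and the only way out is a power cycle because vMotion is off the table.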

To solve this problem, prior to vSphere 5.5, you had to create a couple of VM-to-Host hard affinity rules to keep one of the clustered virtual machines only running on a set of ESXi hosts, and the other clustered virtual machine on its own set of ESXi hosts.
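Here is a rough Python sketch of why that workaround keeps the nodes apart and what it costs you. Again, this is a toy model under assumptions: the host groups, the names, and the four-host layout are hypothetical, chosen only so that each VM has more than one host it is allowed to run on.

```python
from typing import Dict, List, Optional, Set

# Hypothetical "must run on" rules: each clustered VM is pinned to its
# own host group, and the groups do not overlap.
MUST_RUN_ON: Dict[str, Set[str]] = {
    "sql-node-1": {"esxi-1", "esxi-2"},
    "sql-node-2": {"esxi-3", "esxi-4"},
}


def ha_restart_host(vm: str, surviving_hosts: List[str]) -> Optional[str]:
    # Only hosts inside the VM's group are candidates for the restart.
    allowed = MUST_RUN_ON[vm]
    for host in surviving_hosts:
        if host in allowed:
            return host
    # No allowed host survived, so the VM stays down.
    return None


# esxi-1 fails: sql-node-1 can only come back on esxi-2.
target = ha_restart_host("sql-node-1", ["esxi-2", "esxi-3", "esxi-4"])

# Both hosts in the first group fail: sql-node-1 has nowhere to go,
# even though esxi-3 and esxi-4 are healthy.
stranded = ha_restart_host("sql-node-1", ["esxi-3", "esxi-4"])
print(target, stranded)
```

Because the groups never overlap, the two nodes can never share a host. The price is that healthy hosts outside a VM's group are useless to it during a failover.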

Have you noticed all of those "pre-vSphere 5.5" and "prior to vSphere 5.5" qualifiers I sprinkled throughout?

vSphere 5.5 introduces a new vSphere HA option, das.respectVmVmAntiAffinityRules. All the things that I said were bad earlier in this post are now remedied by this option. If you are still using shared-disk clusters, like SQL Server AlwaysOn Failover Cluster Instances or any Windows Failover Cluster with a quorum disk, do yourself a favor and set this option on the vSphere cluster.


A couple of notes about this option:

  • It must be set in the Advanced Options of vSphere HA.
  • When set to true, HA will respect anti-affinity rules even if DRS is disabled for the cluster.
  • If recovering a VM means a rule will be violated, HA will issue an event reporting that insufficient resources are available to perform the failover.
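
To make those notes concrete, here is a small Python sketch of how the option changes the failover decision. It is a toy model, not the HA placement engine: the names are hypothetical, the "first eligible host wins" placement is a simplification, and treating the option as off unless it is set to "true" is an assumption of the model.

```python
from typing import Dict, List, Set

OPTION = "das.respectVmVmAntiAffinityRules"


def ha_failover(
    vm: str,
    placement: Dict[str, str],
    surviving_hosts: List[str],
    anti_affinity: Set[str],
    advanced_options: Dict[str, str],
) -> str:
    # Assumption: the option counts as off unless explicitly set to "true".
    respect = advanced_options.get(OPTION, "false").lower() == "true"
    # Hosts already running another member of the anti-affinity rule.
    peer_hosts = {
        host for name, host in placement.items()
        if name != vm and name in anti_affinity
    }
    for host in surviving_hosts:
        if respect and vm in anti_affinity and host in peer_hosts:
            continue  # restarting here would violate the rule
        placement[vm] = host
        return "restarted on " + host
    return "insufficient resources for failover"


rule = {"sql-node-1", "sql-node-2"}

# Option not set: HA takes the first surviving host, rule or no rule.
legacy = ha_failover(
    "sql-node-1", {"sql-node-2": "esxi-2"}, ["esxi-2", "esxi-3"], rule, {}
)

# Option set: HA skips esxi-2 and restarts the VM on esxi-3.
strict = ha_failover(
    "sql-node-1", {"sql-node-2": "esxi-2"}, ["esxi-2", "esxi-3"], rule,
    {OPTION: "true"},
)

# Option set, but the only surviving host already runs the other node.
blocked = ha_failover(
    "sql-node-1", {"sql-node-2": "esxi-2"}, ["esxi-2"], rule,
    {OPTION: "true"},
)
print(legacy, strict, blocked, sep="\n")
```

The last case is the one worth planning for: with the option on, HA would rather leave the VM powered off than break the rule, so the cluster needs enough hosts to keep the nodes apart after a failure.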

VMware KB Advanced configuration options for VMware High Availability in vSphere 5.x (2033250): http://kb.vmware.com/kb/2033250

-alex

