Blog Series: Metro – Breaking stuff 3: Failing DC, no witness

In the previous episode of breaking stuff we powered off all hosts in one of the datacenters… Not really eventful, as we were leaning mostly (if not entirely) on VMware vSphere’s HA capabilities. Now let’s take things up a notch… And fail an entire datacenter! In this episode we will break a datacenter while NOT having a witness service.

If you’re not the reading type you can of course skip straight to the VIDEO!

Failing a DC in a metro setup without witness

I will be failing the datacenter by using the #dell #iDrac out-of-band management modules to power off the hosts, and I will use a smart PDU to pull the power from both controllers of one of the #PowerStores. I tried to shut down the PowerStore “nicely” but it wouldn’t let me: as both controllers are aware of each other’s status, I did not find a way to tear it down from the GUI. So this is where the PDU comes in.

As we are performing this test without a witness, the storage layer takes the basic action of disabling all non-preferred storage. This is because without the witness service, the surviving array cannot distinguish between an interlink failure and a storage array failure. Looking at our four workloads (Datacenter A on the left, Datacenter B on the right, preferred volumes at the top and non-preferred volumes at the bottom), I would expect that only the DCA-preferred VM would survive, and that the non-preferred VM from DCB would get restarted on DCA.
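To make that expectation a bit more tangible, here is a minimal sketch of the reasoning in Python. It is a toy model, not how PowerStore or vSphere HA are implemented: the `Workload` class, the `outcome_without_witness` function and the workload names are all made up for illustration, and I am assuming that Datacenter B is the one that fails, which is what the expectation above implies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    runs_in: str         # datacenter where the VM runs before the failure
    preferred_site: str  # datacenter whose array is the preferred side of the volume


def other_site(site: str) -> str:
    """Return the opposite datacenter in this two-site toy model."""
    return "DCB" if site == "DCA" else "DCA"


def outcome_without_witness(workload: Workload, failed_site: str) -> str:
    """Predict what happens to a workload when a whole site fails (assumed model)."""
    surviving_site = other_site(failed_site)
    # Without a witness the surviving array cannot tell an interlink failure
    # from an array failure, so it only keeps its own preferred volumes online.
    if workload.preferred_site != surviving_site:
        return "down"
    # The volume is still online on the surviving array.
    if workload.runs_in == surviving_site:
        return "keeps running"
    # The VM lost its host but its storage survived, so HA can restart it.
    return "restarted on " + surviving_site


# The four test workloads; "non-preferred" means the volume is preferred on the other site.
WORKLOADS = [
    Workload("DCA-preferred", runs_in="DCA", preferred_site="DCA"),
    Workload("DCA-non-preferred", runs_in="DCA", preferred_site="DCB"),
    Workload("DCB-preferred", runs_in="DCB", preferred_site="DCB"),
    Workload("DCB-non-preferred", runs_in="DCB", preferred_site="DCA"),
]

for w in WORKLOADS:
    print(f"{w.name}: {outcome_without_witness(w, failed_site='DCB')}")
```

In this model, two of the four workloads end up down: the DCB-preferred VM, and also the DCA-non-preferred VM, which loses its storage even though its host never failed.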

But let’s not spoil too much. Suffice it to say that without a witness we would not get to a recovery scenario that is acceptable from an enterprise perspective. Good to know: in the next episode we will run this test once again, but this time WITH the support of a witness service. For now, enjoy the video below:
