Tuesday, February 7, 2023

vSphere 7: Performing a Reconfigure for VMware HA operation on a primary node causes an unexpected virtual machine failover

After enabling Skyline Health, you may see an error like this:



When you perform a Reconfigure for VMware HA operation on the primary node in an HA cluster, an unexpected virtual machine failover occurs for the virtual machines running on that primary node.

When the primary HA host is manually reconfigured for HA, the remaining secondary hosts enter an election to choose a new primary host.

The newly elected primary host places the virtual machines that were running on the old primary host in an unknown power state, and waits for up to 10 seconds for notification that the virtual machines on the old primary host are powered on and running.

If the old primary host does not become secondary within that 10-second interval, the new primary host assumes that the virtual machines are down, and attempts to restart them. This causes a false failover event to occur, and consequently the failover task fails because the virtual machines were never powered off. The virtual machines remain unaffected in this scenario.
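The timing described above can be illustrated with a toy model. This is only a sketch of the decision as the text describes it, not the HA agent's actual logic; the function name, its parameters, and the 18-second rejoin time are hypothetical and chosen for illustration.

```python
def false_failover_triggered(rejoin_seconds: float, monitor_period: float = 10.0) -> bool:
    """Toy model: return True if the new primary would attempt a restart.

    rejoin_seconds: assumed time the old primary needs to come back
                    as a secondary after the reconfigure.
    monitor_period: how long the new primary waits before treating the
                    virtual machines as down (10 seconds by default).
    """
    # The restart attempt happens only when the old primary misses the window.
    return rejoin_seconds > monitor_period


if __name__ == "__main__":
    # Hypothetical case: the old primary needs 18 seconds to rejoin.
    print(false_failover_triggered(18))                     # default 10-second window
    print(false_failover_triggered(18, monitor_period=30))  # longer window
```

With a longer waiting window, the same rejoin time no longer counts as a failure, which is why raising the monitor period avoids the false failover.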

To resolve this issue, increase the monitor period.

Notes:
  • Starting with vCenter Server 7.0 Update 1, the property name fdm.policy.unknownStateMonitorPeriod has changed to fdm.unknownStateMonitorPeriod.
  • Prefixing these properties with das.config. applies the setting to all the hosts in the cluster.

1. In vCenter, right-click the cluster and select Edit Settings.

2. Click vSphere HA and then Advanced Options.

3. Add a new option (if not already present). The default value is 10:
        For 7.0U1 or later:
            das.config.fdm.unknownStateMonitorPeriod = 10
        Pre-7.0U1:
            das.config.fdm.policy.unknownStateMonitorPeriod = 10

        For this issue, change the value from 10 to 30:

        For 7.0U1 or later:
            das.config.fdm.unknownStateMonitorPeriod = 30
        Pre-7.0U1:
            das.config.fdm.policy.unknownStateMonitorPeriod = 30

4. Disable and then re-enable vSphere HA on the cluster.
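
Because the option name depends on the vCenter version, it is easy to type the wrong key. The following is a minimal sketch of a helper that picks the key to enter in Advanced Options. The function name and the version label format ("6.7", "7.0", "7.0U1") are assumptions made for this example; it does not talk to vCenter and only returns the text to enter.

```python
import re
from typing import Tuple

NEW_KEY = "das.config.fdm.unknownStateMonitorPeriod"         # 7.0 Update 1 and later
OLD_KEY = "das.config.fdm.policy.unknownStateMonitorPeriod"  # before 7.0 Update 1


def monitor_period_option(version: str, seconds: int = 30) -> Tuple[str, str]:
    """Return the (key, value) pair to enter under Advanced Options.

    version is a hypothetical label such as "6.7", "7.0" or "7.0U1".
    """
    match = re.fullmatch(r"(\d+)\.(\d+)(?:\s*U(\d+))?", version.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized version label: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    update = int(match.group(3) or 0)  # no "U" suffix means no update level
    # Tuple comparison: anything at or above 7.0 Update 1 uses the new name.
    key = NEW_KEY if (major, minor, update) >= (7, 0, 1) else OLD_KEY
    return key, str(seconds)


if __name__ == "__main__":
    for label in ("6.7U3", "7.0", "7.0U1"):
        key, value = monitor_period_option(label)
        print(f"{label}: {key} = {value}")
```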



Sunday, February 5, 2023

VMware Tanzu Vanguard


The VMware Tanzu Vanguard is a select group of active customers, cloud users, and practitioners that are passionate about our products and services. They openly share their experiences and knowledge with the community and the industry.

MEMBER BENEFITS

Access to VMware Tanzu product and service groups

Exclusive invitations to our conferences and user group meetups

Networking with a small group of peers to get complex IT challenges solved

Recognition and rewards for contributions and achievements

Digital Tanzu Vanguard badges

Community-branded swag

Detailed information about the program can be found at: https://tanzu.vmware.com/vanguard


Want to join?

Sign up today! https://tanzu.vmware.com/vanguard#join

When you do, please mention that I referred you: @Fernando Perez