"vSphere HA virtual machine failed to failover" error in vCenter Server問題分析

迷倪小魏發表於2017-12-20

"vSphere HA virtual machine failed to failover" error in vCenter Server (2034571)

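      Today I ran into the following problem: in a VMware virtualization environment, the alarm "vSphere HA virtual machine failed to failover" was raised, yet my virtual machines were running normally. Below are the cause and the resolution given in VMware's official article: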


Document Id


2034571


Symptoms


·         In a cluster with Isolation response set to Leave powered on, a virtual machine may display this error when a host becomes isolated:

vSphere HA virtual machine failed to failover

·         The virtual machine continues to run without a problem.


Purpose


This article provides information to:

·         Clear the vSphere HA virtual machine failed to failover error from the virtual machine.

·         Deal with the vSphere HA virtual machine failed to failover error if it occurs.

·         Reduce the occurrence of the vSphere HA virtual machine failed to failover error.


Cause


This behavior can occur whenever a High Availability master agent declares a host dead while the virtual machines on that host continue to run without incident. This alarm does not mean HA has failed or stopped working. When this alarm is triggered, it means that a host in the HA-protected cluster tried to power on one or more virtual machines and the power-on attempt failed.


Possible reasons for this to happen:

 

·         The host is still running but has disconnected from the network. The cluster's host isolation response is set to Leave powered on:

·         When a host becomes network isolated, the remaining hosts in the cluster do not know if the host has crashed, or is just disconnected from the network. As a result, the remaining hosts attempt to power up the virtual machines that were last logged as running on the isolated host. With Leave powered on, the host that became network isolated will leave the virtual machines up and running and not attempt to power them down, thus keeping the locks on the files. With the isolated host locking the files, the remaining hosts will fail to perform the power on task on the virtual machines resulting in the alarm triggering.

·         The host is still running but has disconnected from the network. The cluster's host isolation response is set to Shut down or Power off:

·         With this host isolation response, a host will attempt to send shut down or power off commands to its running virtual machines when it recognizes it is isolated. Once a virtual machine is completely shut down, and the original isolated host no longer holds locks on the virtual machine's files, the remaining hosts in the cluster will be able to obtain the locks necessary to power up the virtual machine. If the virtual machine is not successfully shut down, or the locks are not released, then the alarm will be triggered.

·         The host has failed and the virtual machine storage is in a degraded state. The remaining hosts in the cluster cannot contact the storage device and fail to power up the virtual machines, resulting in the alarm triggering.
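The three cases above can be summarized as a small decision function. This is only an illustrative model of the reasoning in this article, not VMware code: the names `Scenario`, `IsolationResponse` and `failover_alarm_expected` are made up for the example, and it assumes that a host that really crashed with healthy storage releases its locks so that the restart succeeds.

```python
from dataclasses import dataclass
from enum import Enum


class IsolationResponse(Enum):
    LEAVE_POWERED_ON = "Leave powered on"
    SHUT_DOWN = "Shut down"
    POWER_OFF = "Power off"


@dataclass
class Scenario:
    """Hypothetical description of one declared-dead host and one of its VMs."""
    host_failed: bool              # True: host really crashed; False: only network isolated
    isolation_response: IsolationResponse
    vm_stopped_and_unlocked: bool  # isolated host stopped the VM and released its file locks
    storage_reachable: bool        # remaining hosts can reach the VM's datastore


def failover_alarm_expected(s: Scenario) -> bool:
    """Return True if the remaining hosts are expected to fail the power-on."""
    if not s.storage_reachable:
        # Degraded storage: nobody can power the VM on, whatever the host state.
        return True
    if s.host_failed:
        # Assumption: a crashed host no longer holds the locks, so restart works.
        return False
    if s.isolation_response is IsolationResponse.LEAVE_POWERED_ON:
        # The isolated host keeps the VM running and keeps the file locks.
        return True
    # Shut down / Power off: restart works only if the locks were released.
    return not s.vm_stopped_and_unlocked


if __name__ == "__main__":
    isolated = Scenario(False, IsolationResponse.LEAVE_POWERED_ON, False, True)
    print(failover_alarm_expected(isolated))
```

The case I hit matches the first branch for an isolated host: the VM keeps running on its original host, so the alarm is raised even though nothing is actually down.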


Resolution


This is expected behavior in VMware vCenter Server 5.0.x, 5.1.x and 5.5.x. Because the virtual machines continue to run without incident, you can safely ignore this issue.

To clear the alarm from the virtual machine:

1.     Select the virtual machine with the triggered alarm.

2.     Click on the Alarms tab and then the Triggered Alarms button.

3.     Right-click the vSphere HA virtual machine failover failed alarm and click Clear.


Note: If this alarm is on multiple virtual machines, you may select the host, cluster, data center, or vCenter Server object in the left pane and continue with step 2 to clear the alarms with fewer steps.
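When many virtual machines show the alarm, the note amounts to picking the lowest inventory object that contains all of them. A minimal sketch of that idea follows. The inventory paths and the function name `smallest_common_scope` are hypothetical, and the code does not talk to vCenter Server at all. It only works on tuples that you would fill in by hand.

```python
def smallest_common_scope(vm_paths):
    """Return the deepest inventory path shared by all affected VMs.

    Each path is a tuple such as
    (vcenter, datacenter, cluster, host, vm); the layout is an assumption.
    """
    if not vm_paths:
        raise ValueError("at least one VM path is required")
    common = []
    # Walk the paths level by level and stop at the first level that differs.
    for level in zip(*vm_paths):
        if len(set(level)) != 1:
            break
        common.append(level[0])
    return tuple(common)


if __name__ == "__main__":
    affected = [
        ("vc01", "dc1", "clusterA", "esx1", "vm-web"),
        ("vc01", "dc1", "clusterA", "esx2", "vm-db"),
    ]
    # Select this object in the left pane, then continue with step 2.
    print(smallest_common_scope(affected))
```

If the affected virtual machines sit on different hosts of the same cluster, the result is the cluster, which is the object to select before opening the Alarms tab.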


For more information on dealing with alerts, see:

·         vCenter Server 5.0 - the Acknowledge Triggered Alarms section in the vSphere 5.0 Documentation Center.

·         vCenter Server 5.1 - the Acknowledge Triggered Alarms section in the .

·         vCenter Server 5.5 - the Acknowledge Triggered Alarms in the vSphere Web Client section in the .

To reduce the likelihood of this issue occurring:

·         Use multiple management networks. For more information, see .

·         Ensure the datastore heartbeats within vCenter Server are communicating properly for HA to run efficiently when management network problems occur.

For example, if you use both SAN and IP-based storage, mount a couple of SAN-based datastores on the hosts in the cluster so that HA can use them for heartbeating instead of the IP-based storage. If only IP-based storage is used, consider isolating one or more of the storage networks from the management network so that a single network fault does not take out both.
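The advice above boils down to preferring heartbeat datastores whose path does not share a fault domain with the management network. The sketch below illustrates that selection rule on hand-written data. The `Datastore` class, the `transport` and `network` fields, and all datastore and network names are assumptions made for the example, not part of any VMware API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Datastore:
    name: str
    transport: str          # "san" (e.g. Fibre Channel) or "ip" (e.g. NFS, iSCSI)
    network: Optional[str]  # network carrying the IP storage traffic; None for SAN


def heartbeat_candidates(datastores: List[Datastore],
                         management_network: str) -> List[Datastore]:
    """Return datastores that stay reachable if the management network fails.

    SAN datastores are listed first, then IP datastores on a separate network.
    """
    independent = [
        d for d in datastores
        if d.transport == "san" or d.network != management_network
    ]
    # False sorts before True, so SAN entries come first; ties are ordered by name.
    return sorted(independent, key=lambda d: (d.transport != "san", d.name))


if __name__ == "__main__":
    stores = [
        Datastore("nfs-a", "ip", "mgmt"),
        Datastore("fc-1", "san", None),
        Datastore("iscsi-b", "ip", "storage-vlan"),
    ]
    for d in heartbeat_candidates(stores, "mgmt"):
        print(d.name)
```

In this made-up inventory, `nfs-a` is dropped because it rides on the management network, which is exactly the situation in which the datastore heartbeat is needed.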




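Author: SEian.G (Practice the seventy-two transformations, face the eighty-one tribulations with a smile)

From the "ITPUB blog", link: http://blog.itpub.net/31015730/viewspace-2148948/. If you repost this article, please credit the source; otherwise legal action will be pursued.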
