Is VMware Private Cloud Highly Available?

Posted on April 9, 2021 by Isaac Davis


Wondering whether to purchase VMware to achieve high availability (HA)?

Production environments need to stay up. How confident are you in the redundancy of your current traditional or cloud environment? In short, redundancy gives you confidence in your choice of platform, while efficiency and manageability take you the rest of the way.

VMware’s vSphere High Availability (HA), vSphere vMotion, and the Distributed Resource Scheduler™ (DRS) make up the backbone of a high availability system for VMware Private Cloud.

These three technologies work in unison with the ESXi servers to allow you to restart virtual machines (VMs) automatically or move live workloads when:

Loads spike due to traffic
Hardware failures or security issues occur
Routine server maintenance is needed

Looking for a scalable, reliable, and highly available cloud solution? Download our 451 Research: Business Impact Brief and find out how Private Cloud positions SMBs for efficiency and growth.

What is a Virtual Machine?

A virtual machine (VM) is a software emulation of a physical computer, with its own virtual CPU, memory, storage, and network interfaces. A VM has no direct connection to hardware resources. Instead, a hypervisor installed on the physical machine that hosts the VM mediates its access to the underlying hardware.

This kind of setup gives you adjustable resource allocation and isolation from the rest of your environment. It also gives you more choice, and a simpler setup, when selecting an operating system for your VM.

ESXi Hosts

VMware High Availability requires at least two physical servers that fulfill the role of ESXi (ESX Integrated) hosts. ESXi is the hypervisor operating system installed on each physical server, and the servers are joined together into a vSphere cluster. These servers act as nodes in a primary-primary relationship.

When multiple nodes come together, they create a cluster. Merely having nodes at your disposal is not enough; they have to be tied together in a defined relationship.

A primary-primary, or multiple-primary, relationship allows any member of that relationship to read and write data. In this configuration, each link in the chain is equally essential and, at the same time, easily replaceable, because any other node in the cluster can pick up where the failing one left off.

Within the hosted cluster, you can share storage, CPU, memory, and network resources in the environment. These resources are used to provide the groundwork for a virtual machine that will be highly available.

This is the opposite of a dedicated server solution, where resources are limited to just one physical server. With a dedicated server, you experience downtime during events such as resource upgrades, scheduled maintenance, or unexpected hardware failures. Without redundancy, the server must go offline while hardware is replaced.

So, what exactly is high availability and how does it work?

What is High Availability?

The underlying goal of high availability is to keep services running when individual components fail. It does this by pooling resources and capabilities into a combined environment, in this case using VMware. Every high availability environment needs more than one of each critical component so that a redundant one can take over.

One of the main components of VMware vSphere is the Distributed Resource Scheduler. This feature reduces resource contention and optimizes VM placement on our hosts. When a VM powers on, DRS evaluates the hosts in our cluster and selects the one best suited to run that virtual machine.
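The placement idea can be illustrated with a small sketch. This is a toy model, not VMware's DRS algorithm: the `Host` class, the `pick_host` function, the host names, and the "most free memory after placement" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    cpu_free_mhz: int
    mem_free_mb: int


def pick_host(hosts, cpu_mhz, mem_mb):
    """Return the host that fits the VM and keeps the most headroom, or None.

    Hypothetical rule: prefer the host with the most free memory left
    after placement, breaking ties on free CPU. Real DRS weighs far more.
    """
    candidates = [
        h for h in hosts
        if h.cpu_free_mhz >= cpu_mhz and h.mem_free_mb >= mem_mb
    ]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda h: (h.mem_free_mb - mem_mb, h.cpu_free_mhz - cpu_mhz),
    )


# Hypothetical three-host cluster.
hosts = [
    Host("esxi-01", 4000, 8192),
    Host("esxi-02", 12000, 32768),
    Host("esxi-03", 9000, 16384),
]

chosen = pick_host(hosts, cpu_mhz=2000, mem_mb=4096)
print(chosen.name)  # esxi-02
```

The point of the sketch is only that placement is a filter (which hosts can fit the VM?) followed by a ranking (which of those is the best fit?).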

Storage is handled differently from the other resources. Instead of relying on the host’s drives for your VM, Liquid Web uses a NetApp Storage Area Network (SAN). A SAN is a dedicated network of storage devices, which gives you the advantage of separate hardware for storage. This configuration separates storage from our hosts, providing you with more redundancy.

How vSphere Handles Planned Maintenance

With vMotion, you can migrate a running virtual machine from one host to another with no downtime. The move changes which host's CPU and memory resources the VM uses.

Live migration is performed without interrupting the VM. The VM sees no change in its environment, so hardware-related maintenance no longer has to cause downtime.

Maintenance is not the only use case for vMotion. Should one of the hosts become underutilized or overutilized, DRS can use vMotion to move VMs to another host and maintain balance in your cluster.
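As a rough illustration of load balancing, the sketch below suggests a single migration when the gap between the busiest and least busy host grows too large. It is a simplified model and not how DRS actually scores a cluster; the function name `suggest_migration`, the load percentages, and the 20-point threshold are hypothetical.

```python
def suggest_migration(host_vms, threshold=20):
    """Suggest one (vm, source, destination) move, or None if balanced.

    host_vms maps a host name to {vm_name: load_percent}.
    The threshold is a hypothetical tolerance for load imbalance.
    """
    totals = {host: sum(vms.values()) for host, vms in host_vms.items()}
    src = max(totals, key=totals.get)
    dst = min(totals, key=totals.get)
    gap = totals[src] - totals[dst]
    if gap <= threshold or not host_vms[src]:
        return None
    # Choose the VM whose move leaves the two hosts closest in load.
    vm = min(host_vms[src], key=lambda v: abs(gap - 2 * host_vms[src][v]))
    if host_vms[src][vm] >= gap:
        return None  # moving it would only shift the imbalance
    return vm, src, dst


cluster = {
    "esxi-01": {"web-1": 40, "db-1": 35},
    "esxi-02": {"web-2": 20},
    "esxi-03": {"cache-1": 30},
}

print(suggest_migration(cluster))  # ('db-1', 'esxi-01', 'esxi-02')
```

In this example the loads start at 75, 20, and 30; moving db-1 leaves the source and destination at 40 and 55, which is within the tolerance.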

Storage vMotion adds another layer of VM management. Beyond moving a VM's memory and CPU workload between hosts, it lets you move the VM's files from one storage device to another, so the original device can be serviced without VM downtime.

How vSphere Handles Unanticipated Issues

To understand how vSphere HA handles unexpected failures, keep in mind that a VM always runs on a chosen host: the host best suited to house it in terms of resource distribution and cluster balance.

Part of this selection depends on the Fault Domain Manager (FDM) agent, which is installed on each host when vSphere HA is enabled. The FDM agents communicate with one another about host state, resource allocation, VM placement, and more.

The FDM agent is responsible for protecting against:

Host failures
VM failures
Application-level failures

Although it can protect against all of these failures, its primary purpose is to protect against host failures, since hosts form the foundation of your HA environment.

In a scenario where a hardware failure affects the host running the VM, the VM is automatically restarted on an unaffected host and resumes service much as before.

During this event, the only downtime is the time the VM takes to boot, from powered off to powered on. This scenario gives you a better view of the relationship between your cluster nodes: rather than primary-replica, they relate as primary-primary, since any non-failing node can take on the role of a primary node.
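The restart behavior described above can be modeled in a few lines. This is a conceptual sketch only: the function `restart_after_failure`, the host and VM names, and the "fewest VMs first" rule are assumptions for illustration, not vSphere HA's actual restart logic.

```python
def restart_after_failure(placement, failed_host):
    """Return a new placement with the failed host's VMs restarted elsewhere.

    placement maps a host name to the list of VMs running on it.
    Hypothetical rule: each VM restarts on the surviving host that
    currently runs the fewest VMs (ties broken by host name).
    """
    survivors = {
        host: list(vms)
        for host, vms in placement.items()
        if host != failed_host
    }
    if not survivors:
        raise RuntimeError("no surviving hosts to restart VMs on")
    for vm in placement.get(failed_host, []):
        target = min(survivors, key=lambda h: (len(survivors[h]), h))
        survivors[target].append(vm)
    return survivors


placement = {
    "esxi-01": ["web-1", "db-1"],
    "esxi-02": ["web-2"],
    "esxi-03": [],
}

after = restart_after_failure(placement, "esxi-01")
print(after)  # {'esxi-02': ['web-2', 'db-1'], 'esxi-03': ['web-1']}
```

Note that the VMs are restarted, not live-migrated: a failed host cannot hand over running state, which is why the boot time shows up as downtime.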

How the System Recognizes Failures

vSphere HA relies on heartbeats to gather and exchange status information about the cluster. The FDM agent on each host sends regular network heartbeats, which act as the path for status reports across the cluster.

Heartbeat and Pacemaker, which are often mentioned in this context, are open-source Linux clustering tools rather than VMware components. In a Linux cluster, Heartbeat must be paired with a cluster resource manager such as Pacemaker, which contains the logic needed to ensure that services run in only one location and is where you define what should be running and how it should keep running in a changing environment. In vSphere, the FDM agents fill that role, using the HA settings you configure on the cluster.

In our failure scenario, as mentioned earlier, missed heartbeats are what signal the failure. When a host stops sending network heartbeats, vSphere HA checks whether that host is still updating its datastore heartbeats, which distinguishes a failed host from one that is only cut off from the network. It then takes the steps needed to bring the cluster back to its expected state. This enables you to isolate the parts of your infrastructure that require intervention while your VM stays available.
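The core of heartbeat-based detection is a timeout: a host that has not reported in for too long is treated as suspect. The sketch below shows that idea in isolation. The function name `find_silent_hosts`, the timestamps, and the 15-second timeout are hypothetical values chosen for the example, not vSphere settings.

```python
def find_silent_hosts(last_heartbeat, now, timeout_s=15.0):
    """Return hosts whose last heartbeat is older than timeout_s seconds.

    last_heartbeat maps a host name to the time (in seconds) at which
    its most recent heartbeat arrived. The timeout is a hypothetical value.
    """
    return sorted(
        host
        for host, seen_at in last_heartbeat.items()
        if now - seen_at > timeout_s
    )


# Hypothetical timestamps, in seconds.
last_heartbeat = {
    "esxi-01": 100.0,
    "esxi-02": 88.0,
    "esxi-03": 70.0,
}

print(find_silent_hosts(last_heartbeat, now=101.0))  # ['esxi-03']
```

A silent host is only a suspect at this stage. A real cluster performs further checks before restarting that host's VMs, because a network problem can look exactly like a host failure from the outside.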

How VMware vSphere Creates High Availability

vCenter Server lets you manage the vSphere cluster from a centralized location. It runs as a single management server, rather than on each host, and acts as the point of administration for both hosts and VMs.

If you could split vSphere into two parts, they would be ESXi (the OS running on the hosts) and vCenter Server.

ESXi serves as the hypervisor, the virtualization software that allows us to create our VMs. vCenter Server, in turn, provides the management layer from which we control our virtual environment.

To make access easier, VMware provides the vSphere Web Client, which lets us reach our infrastructure through a web browser. It is a web-based application that works on most common operating systems and web browsers.

What is the Difference Between Virtual Machines and Dedicated Servers?

In common use cases, a dedicated server is the equivalent of having one VM in a private cloud, with scalability limited to the hardware in that server. With vSphere, scaling is easier and quicker, and it avoids the downtime that something like a memory upgrade requires on a dedicated server.

Any unplanned event on a dedicated server, such as a hardware failure, directly affects your production environment and causes downtime. With an HA private cloud, the impact is limited to a brief restart, as your environment is automatically brought back up on another host.

With vSphere, Liquid Web can easily distribute the cluster's resources across several virtual machines, and you can select the operating system that best suits each VM.

Another way to scale is to expand your resources by simply adding more hosts to your cluster.

Invest in High Availability for Your Business With VMware Private Cloud

VMware Private Cloud is a highly adjustable and reliable solution that is cost-effective and allows you to scale with ease. Whether you are looking for a method to start up a cluster or a customized solution, a VMware Private Cloud is an investment made to last for your business.

