What are the basic pieces required for Nutanix Virtualization?

Before we dive too deep, it is important to understand a key thing about Nutanix: at its core, Nutanix is a storage product. Nutanix started as a storage company, and its storage is what put it on the map. Nutanix has built that storage into its own hyperconverged platform by adding its own hypervisor, called AHV, and its own network virtualization software, called Flow. Well, kind of: both AHV and Flow heavily leverage existing open-source software as their base. It is worth knowing that Nutanix can also use ESXi or Hyper-V as the hypervisor, but we will focus exclusively on the Nutanix AHV offering, as that is what you would be using with Flow Virtual Networking.

Compute – AOS + AHV + CVM

A Nutanix server, referred to as a node, runs the Acropolis Operating System (AOS) as its operating system. As of AOS 6.8, AOS is based on Rocky Linux 8; prior versions were based on CentOS. The Acropolis Hypervisor (AHV) is a KVM-based hypervisor that runs within AOS.

I can hear you now. “Wait, it’s just KVM?” Well, yes, but also no. AHV is KVM-based, but what makes it AHV is the storage implementation, called the Distributed Storage Fabric. We’ll talk about that next, but first we should understand the KVM part.

[Figure: KVM components within AHV. Image courtesy of The Nutanix Bible.]

There are three main components of KVM, shown in the figure above. The first is KVM-kmod, the KVM kernel module that enables and exposes hardware-assisted virtualization, turning AOS into a type-1 hypervisor. The second is qemu-kvm. QEMU is hardware emulation/virtualization software that runs in userspace. QEMU by itself is software-based virtualization (you can run it on your own workstation!), but the qemu-kvm implementation leverages the KVM kernel module to run virtual machines in a hardware-assisted manner. Each VM runs as a unique qemu-kvm process. The third is libvirtd, the management tool and service for managing KVM and QEMU. All communication between AOS and KVM/QEMU happens through libvirtd. QEMU, KVM, and libvirtd are all open-source projects with more documentation than you could ever want or need.
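If you want to see the libvirtd piece for yourself, the standard libvirt Python bindings can list the qemu-kvm domains a host is running. The sketch below is purely illustrative and assumes local access to a host with the bindings installed; in practice you manage AHV through Prism and acli, not libvirt directly.

    # A minimal sketch of talking to libvirtd with the libvirt Python bindings.
    # On an AHV host each running VM (including the CVM) is a qemu-kvm process
    # that libvirtd knows about. The connection URI and the idea of running
    # this directly on a host are illustrative assumptions.
    import libvirt

    conn = libvirt.open("qemu:///system")   # connect to the local libvirtd
    try:
        for dom in conn.listAllDomains():
            state, _ = dom.state()
            print(f"{dom.name()}  running={state == libvirt.VIR_DOMAIN_RUNNING}")
    finally:
        conn.close()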

The first and most important VM on a Nutanix node is the Controller Virtual Machine (CVM). The CVM is responsible for storage I/O (including data transformation such as deduplication and compression), presenting the UI and API, managing upgrades, and managing DR and replication. The node's physical disk controller is passed through to the CVM, and the CVM manages all the storage and processes all disk I/O for the VMs.

Nutanix nodes are grouped into clusters. There are three main types of clusters: single-node, two-node, and three-or-more-node. Single- and two-node clusters are meant for ROBO deployments, do not support all features, and cannot be expanded. Clusters of three or more nodes support all features and can be expanded. Lightedge will be deploying clusters of three or more nodes.

Management – Prism

There are two main components to the Prism architecture. The first is Prism Element (PE), the local cluster manager that runs directly on the CVMs and handles all local cluster management and operation. Each CVM runs Prism Element, and they share a VIP to make reaching PE easier. The second is Prism Central (PC). PC provides advanced functionality, such as IAM, software-defined networking, and multi-cluster management; that is, a client with multiple clusters could have them all managed by a single PC. Prism Central can be deployed as a stand-alone VM or as a cluster of three VMs for resiliency, and in different sizes based on the number of VMs being managed. Certain Prism Central features, such as Flow Virtual Networking, require additional resources (vCPU/RAM) to be added to the PC VMs.
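To make the PE/PC split concrete, here is a hedged sketch of listing the clusters a Prism Central instance manages via its v3 REST API. The hostname and credentials are placeholders, and the endpoint and payload should be verified in the API Explorer for your Prism Central version.

    # A hedged sketch of listing clusters through the Prism Central v3 REST API.
    # The address, credentials, and TLS handling are placeholders.
    import requests

    PC = "https://prism-central.example.com:9440"      # assumed PC address
    resp = requests.post(
        f"{PC}/api/nutanix/v3/clusters/list",
        json={"kind": "cluster"},                      # v3 "list" calls take a kind
        auth=("admin", "REPLACE_ME"),                  # basic auth for illustration
        verify=False,                                  # lab only; use real certs in production
        timeout=30,
    )
    resp.raise_for_status()
    for entity in resp.json().get("entities", []):
        print(entity["status"]["name"])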

Storage – Distributed Storage Fabric

When Nutanix nodes are clustered, all of the CVMs together form what is called the Distributed Storage Fabric, which provides the storage for the cluster. Storage alone is worth its own training class and will not be covered in depth here beyond the basics.

CVMs pool physical disks together into a storage pool, which can span multiple nodes. In most instances there only needs to be a single storage pool. A storage pool can be logically segmented into storage containers, which are very analogous to datastores in ESXi. Within a storage container are vDisks, which represent VM hard disks (similar to VMDKs). This is where this class will stop on the storage front. There are even more granular elements called vBlocks, Extents, and Extent Groups, and a whole gaggle of storage processes to handle replication, tiering, caching, and more. The Nutanix Bible's Book of Storage is an excellent starting place for learning more if you so choose.
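As a quick illustration of the pool/container hierarchy, here is a rough sketch of listing storage containers through the Prism Element v2.0 API on the cluster VIP. The address, credentials, and field names are assumptions to verify against the REST API Explorer for your AOS version.

    # A rough sketch of listing storage containers via the Prism Element v2.0
    # API (reachable on the CVMs/cluster VIP on port 9440).
    import requests

    PE = "https://cluster-vip.example.com:9440"        # assumed cluster VIP
    resp = requests.get(
        f"{PE}/PrismGateway/services/rest/v2.0/storage_containers",
        auth=("admin", "REPLACE_ME"),
        verify=False,                                  # lab only
        timeout=30,
    )
    resp.raise_for_status()
    for container in resp.json().get("entities", []):
        print(container.get("name"), container.get("storage_pool_uuid"))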

One key point to note: Nutanix does NOT leverage external storage. You cannot connect a SAN to a Nutanix infrastructure and put your VMs on it. Remember, Nutanix is a storage product at its base, and leveraging external storage would sort of defeat the purpose.

Networking

Out of the box, Nutanix uses traditional VLAN-backed networking, nearly identical to VMware. AHV uses Open vSwitch (OVS) as the platform for all networking. Per the Nutanix Bible, “OVS is an open-source network switch implemented in the Linux kernel and designed to work in a multi-server virtualization environment.” OVS supports many popular switching features such as VLAN tagging, LACP, and port mirroring. Each individual hypervisor maintains an OVS instance, with constructs called bridges managing the switch instances on the AHV hosts; bridges on multiple hosts can be aggregated into virtual switches, and bridges and virtual switches are divided into subnets.

Bridges

Bridges are virtual switches that manage network traffic between physical and virtual network interfaces on a given host. A default OVS bridge, br0, is created on each host to manage all of the traffic for the host and VMs. A default Linux bridge (not managed by OVS but instead handled natively by the Linux OS) called virbr0 is also created on each host to manage local communication between the AHV host and the CVM. The host, VMs, and physical interfaces connect to the bridge by way of ports. Bridges on AHV/OVS are analogous to Virtual Switches on VMware.
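Because bridges are ordinary Open vSwitch constructs, the standard ovs-vsctl CLI can show them. The sketch below wraps that CLI from Python and assumes it runs on an AHV host with shell access; this is for illustration only, since day to day you would inspect this through Prism or the CVM's manage_ovs tooling instead.

    # A small sketch that wraps the standard Open vSwitch CLI (ovs-vsctl) to
    # show the bridges and ports described above. Assumes ovs-vsctl is in PATH
    # on an AHV host; purely illustrative.
    import subprocess

    def ovs(*args):
        out = subprocess.run(["ovs-vsctl", *args], check=True,
                             capture_output=True, text=True).stdout
        return [line for line in out.splitlines() if line]

    for bridge in ovs("list-br"):                 # e.g. br0
        print(f"bridge {bridge}:")
        for port in ovs("list-ports", bridge):    # internal, tap, and bonded ports
            print(f"  port {port}")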

Virtual Switches

Virtual switches are aggregates of the bridges on all the nodes in a cluster, allowing them to be managed together. A default virtual switch, vs0, is created to aggregate all the individual br0 bridges on the hosts in the cluster. Virtual Switches on AHV/OVS are analogous to Distributed Virtual Switches on VMware.

Ports

Ports are logical constructs in the bridge that represent switchports. There are four main types of ports. An internal port provides access from the AHV host to the bridge. A tap port provides access for virtual NICs presented to VMs. Bonded ports provide NIC teaming for the physical interfaces on the AHV host, and VXLAN ports are used for the IP address management functionality built into AHV.
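To see the tap ports in particular, each vNIC on a VM appears in the libvirt domain XML with a target device name, which is the tap interface plugged into the bridge. The sketch below is illustrative and uses a placeholder VM name.

    # A hedged sketch showing how a VM's vNICs map to tap ports: each
    # <interface> in the libvirt domain XML has a <target dev="..."> that is
    # the tap device on the OVS bridge. The domain name is a placeholder.
    import xml.etree.ElementTree as ET
    import libvirt

    conn = libvirt.open("qemu:///system")
    try:
        dom = conn.lookupByName("example-vm")          # assumed VM name
        root = ET.fromstring(dom.XMLDesc(0))
        for iface in root.findall("./devices/interface"):
            target = iface.find("target")
            mac = iface.find("mac")
            if target is not None and mac is not None:
                print(mac.get("address"), "->", target.get("dev"))   # e.g. tap0
    finally:
        conn.close()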

Subnets

Layer-2 segments are called subnets. A subnet is defined on a given virtual switch by name and VLAN tag. IP address management (IPAM) can be enabled, allowing AHV to handle IP address assignment and tracking throughout a subnet. This IPAM functionality leverages VXLAN ports between the bridges to facilitate the required traffic.
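For reference, here is a hedged sketch of defining a VLAN-backed subnet (with IPAM) through the Prism Central v3 API. The field names follow the published v3 schema but should be double-checked in the API Explorer for your release; the cluster UUID, addresses, and credentials are placeholders.

    # A hedged sketch of creating a VLAN-backed subnet via the Prism Central
    # v3 API. Values are placeholders; verify the schema for your release.
    import requests

    PC = "https://prism-central.example.com:9440"
    payload = {
        "metadata": {"kind": "subnet"},
        "spec": {
            "name": "vlan-100-app",
            "cluster_reference": {"kind": "cluster", "uuid": "REPLACE-CLUSTER-UUID"},
            "resources": {
                "subnet_type": "VLAN",
                "vlan_id": 100,
                # Optional: enable Acropolis IPAM for this subnet
                "ip_config": {
                    "subnet_ip": "10.10.100.0",
                    "prefix_length": 24,
                    "default_gateway_ip": "10.10.100.1",
                },
            },
        },
    }
    resp = requests.post(f"{PC}/api/nutanix/v3/subnets",
                         json=payload, auth=("admin", "REPLACE_ME"),
                         verify=False, timeout=30)
    resp.raise_for_status()
    print(resp.json()["status"]["state"])          # task state, e.g. PENDING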

MAC Addresses

By default, Nutanix leverages the OUI 50:6B:8D when assigning MAC addresses to VMs. MACs are guaranteed to be unique within a cluster.
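As a toy illustration, a Nutanix-assigned MAC is simply the 50:6B:8D OUI followed by a 24-bit suffix. The random suffix below is only for show; AHV itself picks the suffix and guarantees uniqueness within the cluster.

    # A toy illustration of what an AHV-assigned MAC looks like: the Nutanix
    # OUI 50:6B:8D followed by a 24-bit suffix. The random suffix here is for
    # illustration only; AHV assigns the real one and keeps it unique per cluster.
    import random

    NUTANIX_OUI = (0x50, 0x6B, 0x8D)

    def example_mac():
        suffix = (random.randint(0x00, 0xFF) for _ in range(3))
        return ":".join(f"{octet:02X}" for octet in (*NUTANIX_OUI, *suffix))

    print(example_mac())    # e.g. 50:6B:8D:3A:7F:C2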

[Diagram: Networking for an AHV host]

Organization

Clusters

As mentioned, Nutanix nodes form clusters. Clusters are groups of nodes pooled together to leverage their aggregate resources.

Projects

Within a Prism Central-managed cluster, Projects act as tenants and provide a form of multitenancy. This multitenancy is aimed more at separating business units within a company than at allowing two separate organizations to share resources.

Permissions & Role-Based Access Control

Prism Element by itself can use Active Directory or OpenLDAP for authentication. Prism Central provides additional capabilities, including SAML and OpenID. Both can tie to multiple identity providers. Users can access Prism Element proxied through Prism Central without needing credentials on both, meaning Prism Element needs nothing beyond the local administrator account kept for use in case of issues with Prism Central.

Further reading

As I’ve already referenced a number of times, Nutanix has a very thorough resource called the Nutanix Bible, which contains a wealth of information. There is also the Nutanix AHV Networking Overview. I would advise you to keep both handy as you familiarize yourself with Nutanix; both were leveraged extensively in creating this document.