I see a lot of people at the moment looking at alternative ways to run Virtual Machines (VMs). When VMs dominated the landscape, VMware vSphere was almost a no-brainer. As more people start dipping their toes into containers, it starts to make sense to run VMs on that same platform… If only that were possible… And it is: Enter KubeVirt.
But with KubeVirt on the rise, I now see many people who lose track of what is really happening under the covers. Are KubeVirt VMs “slower” because they run on top of a container platform? (Spoiler alert: they’re not!)

KubeVirt described in a short and sweet way
The easiest way I’ve seen KubeVirt VMs described (thank you, Red Hat!) is actually pretty simple:
“A VM is a process running on Linux”
“A container manages a process running on Linux”
And so one plus one makes two. So to make this clear once and for all: there is no “layering” of different hypervisors of any kind, unless you run the k8s cluster itself virtualized (which would not make a lot of sense outside of a lab). KubeVirt has all the tools to configure, build and run a container that manages a VM process. The upshot of this is that managing a VM becomes pretty much the same as managing pods / containers in k8s:
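To make that concrete, here is a minimal sketch of a KubeVirt VirtualMachine manifest (the name and container disk image are just examples); you apply, list and delete it with kubectl exactly like any other Kubernetes object:

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: demo-vm          # example name
spec:
  running: true           # start the VM as soon as it is created
  template:
    spec:
      domain:
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio
        resources:
          requests:
            memory: 2Gi
            cpu: "1"
      volumes:
        - name: rootdisk
          containerDisk:               # example: an ephemeral container-image based disk
            image: quay.io/containerdisks/fedora:latest
```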

In this blog post I won’t be diving any deeper into the inner workings of KubeVirt (like describing libvirt and QEMU), but under the covers it is very much parallel to running a VM through KVM or OpenStack.
Adding storage and networking to a KubeVirt VM
As a KubeVirt VM is embedded in a container, it can (and actually must) use the resources provided to that container. So virtual networking extends to your VMs, and CSI storage extends to your VMs… Adding storage to a VM is nothing more than adding a PVC to a k8s pod.
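A quick sketch of what that looks like (storage class and names are just examples): a perfectly ordinary PVC, which the VirtualMachine then references as a disk.

```yaml
# A regular PVC, provisioned by whatever CSI driver backs your storage class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vm-data
spec:
  storageClassName: my-csi-storageclass   # example: your CSI-backed storage class
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
```

```yaml
# Inside the VirtualMachine's spec.template.spec, the claim is wired in like this:
domain:
  devices:
    disks:
      - name: datadisk
        disk:
          bus: virtio
volumes:
  - name: datadisk
    persistentVolumeClaim:
      claimName: vm-data
```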
So yes, you can still use your CSI-enabled external storage array to claim volumes that will back your VMs. Some “limits” apply though, as we will see next.
Live migration of VM workloads in KubeVirt
Can you do live migration of a VM in KubeVirt? The answer is a definitive YES. However, there is also a “but”: in order to perform live migrations (which is what you want if you are serious about running VMs in production) you need storage that supports RWX (multi-writer) access.
The reason for this isn’t that a migrating VM and its newly created twin want to write to their disk in parallel; it is merely that both CAN write to the same volume, so during the cut-over from the original to the new instance there is nothing specific to handle at the storage layer.
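As an aside, triggering a live migration is itself just another Kubernetes object (you can also use virtctl for this). A minimal sketch, assuming a running VM instance named demo-vm:

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstanceMigration
metadata:
  name: migrate-demo-vm
spec:
  vmiName: demo-vm   # the running VMI to move to another node
```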
Lots of people think of NFS-mounted storage once they read RWX. But there is another way to support RWX, and that is the almost forgotten RAWBLOCK mode. In RAWBLOCK mode you get a DISK device surfaced up into the container, with an option to enable it for multi-writer (RWX) access. This seemed pretty useless for most “modern” use cases: why would you have multiple containers write to the same raw disk (instead of a file system, which is the more common use case for RWX) at the same time? A quorum disk might be a use case here, but in reality I’ve not seen it at customers. But when we project the RAWBLOCK idea onto a VM running in KubeVirt, everything comes together!
“KubeVirt is the first really good use case I’ve seen for RAWBLOCK PVCs”
A VM would actually PREFER to have a direct block device: it needs a block device to boot, and an NFS volume simply adds a layer where blocks have to be translated into segments inside a file, which can (and probably will) hinder performance. Now add the ability for RAWBLOCK to support RWX and you have a direct block device into a KubeVirt-managed VM that is live-migration capable!
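In PVC terms that combination boils down to two fields: volumeMode: Block and accessModes: ReadWriteMany. A sketch (the storage class name is an example and its CSI driver must actually support RWX block volumes):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vm-rootdisk
spec:
  volumeMode: Block                 # raw block device, no filesystem layer in between
  accessModes:
    - ReadWriteMany                 # RWX: required for live migration
  storageClassName: my-rwx-block-storageclass   # example: a class backed by an RWX-block-capable CSI driver
  resources:
    requests:
      storage: 30Gi
```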
From a Portworx perspective, we support RAWBLOCK in two ways: the original “ShareV4” way, where we spawn a tiny NFS instance next to a block volume to enable multi-writer access to it, and a true RWX RAWBLOCK implementation that supports both Red Hat OpenShift Virtualization and SUSE Virtualization.
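As a rough sketch of what the ShareV4 flavour looks like (parameter names as I understand them from the Portworx documentation; verify against your Portworx version), a StorageClass could be defined like this:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: px-vm-rwx
provisioner: pxd.portworx.com   # Portworx CSI provisioner
parameters:
  repl: "3"          # keep three replicas of each volume
  sharedv4: "true"   # ShareV4: NFS-backed multi-writer (RWX) access to the volume
```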
The “need” to run KubeVirt on Bare Metal
Will KubeVirt run on a k8s cluster that runs on, for example, VMware vSphere? Yes it will, but… would that make sense? For a lab environment, yes; for production… not so much: you’d simply run the workload on the underlying hypervisor instead of stacking two hypervisors on top of each other. If you still want to travel this path, you’d have to enable “Expose hardware assisted virtualization to the guest OS” at the CPU level (assuming VMware vSphere) to get this working, as you’d be nesting virtualization:
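As a side note for pure lab scenarios where you cannot expose hardware-assisted virtualization at all, KubeVirt can fall back to software emulation. A sketch of that toggle in the KubeVirt custom resource (field name per the KubeVirt docs as I understand them; expect poor performance, this is for labs only):

```yaml
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  configuration:
    developerConfiguration:
      useEmulation: true   # QEMU software emulation instead of KVM; fine for a lab, not for real workloads
```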

The more “normal” way to run KubeVirt VMs is on bare metal. This lets the VMs run directly on the bare-metal-installed Linux OS, so performance (and architecture) is similar to a KVM hypervisor approach. Add RWX-capable k8s storage to the setup and you can solidly run your VMs!
Running k8s on bare metal does add complexity though… Many customers have been running platforms like Red Hat OpenShift on a hypervisor just to avoid the added complexity of managing bare-metal servers.