Overview

The Linux kernel contains a variety of code for running as a fully enlightened guest on Microsoft's Hyper-V hypervisor. Hyper-V consists primarily of a bare-metal hypervisor plus a virtual machine management service running in the parent partition (roughly equivalent to KVM and QEMU, for example). Guest VMs run in child partitions. In this documentation, references to Hyper-V usually encompass both the hypervisor and the VMM service without making a distinction about which functionality is provided by which component.

Hyper-V runs on x86/x64 and arm64 architectures, and Linux guests are supported on both. The functionality and behavior of Hyper-V are generally the same on both architectures unless noted otherwise.

Linux Guest Communication with Hyper-V

Linux guests communicate with Hyper-V in four different ways:

  • Implicit traps: As defined by the x86/x64 or arm64 architecture, some guest actions trap to Hyper-V. Hyper-V emulates the action and returns control to the guest. This behavior is generally invisible to the Linux kernel.

  • Explicit hypercalls: Linux makes an explicit function call to Hyper-V, passing parameters. Hyper-V performs the requested action and returns control to the caller. Parameters are passed in processor registers or in memory shared between the Linux guest and Hyper-V. On x86/x64, hypercalls use a Hyper-V specific calling sequence. On arm64, hypercalls use the ARM standard SMCCC calling sequence.

  • Synthetic register access: Hyper-V implements a variety of synthetic registers. On x86/x64 these registers appear as MSRs in the guest, and the Linux kernel can read or write these MSRs using the normal mechanisms defined by the x86/x64 architecture. On arm64, these synthetic registers must be accessed using explicit hypercalls.

  • VMbus: VMbus is a higher-level software construct that is built on the other three mechanisms. It is a message passing interface between the Hyper-V host and the Linux guest. It uses memory that is shared between Hyper-V and the guest, along with various signaling mechanisms.

The first three communication mechanisms are documented in the Hyper-V Top Level Functional Spec (TLFS). The TLFS describes general Hyper-V functionality and provides details on the hypercalls and synthetic registers. The TLFS is currently written for the x86/x64 architecture only.

VMbus is not documented. This documentation provides a high-level overview of VMbus and how it works, but the details can be discerned only from the code.

Sharing Memory

Many aspects of communication between Hyper-V and Linux are based on sharing memory. Such sharing is generally accomplished as follows:

  • Linux allocates memory from its physical address space using standard Linux mechanisms.

  • Linux tells Hyper-V the guest physical address (GPA) of the allocated memory. Many shared areas are kept to 1 page so that a single GPA is sufficient. Larger shared areas require a list of GPAs, which usually do not need to be contiguous in the guest physical address space. How Hyper-V is told about the GPA or list of GPAs varies. In some cases, a single GPA is written to a synthetic register. In other cases, a GPA or list of GPAs is sent in a VMbus message.

  • Hyper-V translates the GPAs into "real" physical memory addresses, and creates a virtual mapping that it can use to access the memory.

  • Linux can later revoke sharing it has previously established by telling Hyper-V to set the shared GPA to zero.

Hyper-V operates with a page size of 4 Kbytes. GPAs communicated to Hyper-V may be in the form of page numbers, and always describe a range of 4 Kbytes. Since the Linux guest page size on x86/x64 is also 4 Kbytes, the mapping from guest page to Hyper-V page is 1-to-1. On arm64, Hyper-V supports guests with 4/16/64 Kbyte pages as defined by the arm64 architecture. If Linux is using 16 or 64 Kbyte pages, Linux code must be careful to communicate with Hyper-V only in terms of 4 Kbyte pages. HV_HYP_PAGE_SIZE and related macros are used in code that communicates with Hyper-V so that it works correctly in all configurations.
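
The page-size arithmetic above can be sketched in plain C. This is an illustrative userspace sketch, not kernel code: the SKETCH_* macros and the function names are hypothetical and exist only for this example (the kernel's own macro is HV_HYP_PAGE_SIZE).

```c
#include <assert.h>
#include <stdint.h>

/* Hyper-V always works in 4 Kbyte pages. These names are local to the sketch. */
#define SKETCH_HV_PAGE_SHIFT 12
#define SKETCH_HV_PAGE_SIZE  (1ULL << SKETCH_HV_PAGE_SHIFT)

/* Number of 4 Kbyte Hyper-V pages covered by one guest page. */
static inline uint64_t sketch_hv_pages_per_guest_page(uint64_t guest_page_size)
{
    assert(guest_page_size >= SKETCH_HV_PAGE_SIZE);
    assert(guest_page_size % SKETCH_HV_PAGE_SIZE == 0);
    return guest_page_size / SKETCH_HV_PAGE_SIZE;
}

/* First Hyper-V page number contained in a given guest page number. */
static inline uint64_t sketch_guest_pfn_to_hv_pfn(uint64_t guest_pfn,
                                                  uint64_t guest_page_size)
{
    return guest_pfn * sketch_hv_pages_per_guest_page(guest_page_size);
}

/* Number of Hyper-V pages touched by the byte range [gpa, gpa + len). */
static inline uint64_t sketch_hv_pages_spanned(uint64_t gpa, uint64_t len)
{
    uint64_t first, last;

    assert(len > 0);
    first = gpa >> SKETCH_HV_PAGE_SHIFT;
    last = (gpa + len - 1) >> SKETCH_HV_PAGE_SHIFT;
    return last - first + 1;
}
```

With a 64 Kbyte guest page size, one guest page corresponds to 16 Hyper-V pages, so a single guest page number cannot be handed to Hyper-V unchanged.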

As described in the TLFS, a few memory pages shared between Hyper-V and the Linux guest are "overlay" pages. With overlay pages, Linux uses the usual approach of allocating guest memory and telling Hyper-V the GPA of the allocated memory. But Hyper-V then replaces that physical memory page with a page it has allocated, and the original physical memory page is no longer accessible in the guest VM. Linux may access the memory normally as if it were the memory that it originally allocated. The "overlay" behavior is visible only because the contents of the page (as seen by Linux) change at the time that Linux originally establishes the sharing and the overlay page is inserted. Similarly, the contents change if Linux revokes the sharing, in which case Hyper-V removes the overlay page, and the guest page originally allocated by Linux becomes visible again.

Before Linux does a kexec to a kdump kernel or any other kernel, memory shared with Hyper-V should be revoked. Otherwise, Hyper-V could modify a shared page or remove an overlay page after the new kernel is using the page for a different purpose, corrupting the new kernel. Hyper-V does not provide a single "revoke everything" operation to guest VMs, so Linux code must individually revoke all sharing before doing kexec. See hv_kexec_handler() and hv_crash_handler(). However, the crash/panic path still has holes in cleanup because some shared pages are set using per-CPU synthetic registers, and there is no mechanism to revoke the shared pages for CPUs other than the CPU running the panic path.
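
The need to revoke each shared area individually can be illustrated with a small userspace model. Everything here is hypothetical: struct sketch_shared_area stands in for any place where a GPA was handed to Hyper-V, and the functions only model the bookkeeping. They do not talk to a hypervisor and are not how hv_kexec_handler() is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HV_PAGE_MASK 0xfffULL

/* Hypothetical record of one shared area; gpa == 0 means "not shared". */
struct sketch_shared_area {
    uint64_t gpa;
};

/* Establish sharing: record a non-zero, 4 Kbyte aligned GPA. */
static inline void sketch_share(struct sketch_shared_area *area, uint64_t gpa)
{
    assert(area != NULL);
    assert(gpa != 0 && (gpa & SKETCH_HV_PAGE_MASK) == 0);
    area->gpa = gpa;
}

/* Revoke sharing by setting the shared GPA back to zero. */
static inline void sketch_revoke(struct sketch_shared_area *area)
{
    assert(area != NULL);
    area->gpa = 0;
}

/*
 * There is no single operation that revokes everything, so walk every
 * tracked area and revoke each one. Returns how many were revoked.
 */
static inline size_t sketch_revoke_all(struct sketch_shared_area *areas,
                                       size_t count)
{
    size_t i, revoked = 0;

    for (i = 0; i < count; i++) {
        if (areas[i].gpa != 0) {
            sketch_revoke(&areas[i]);
            revoked++;
        }
    }
    return revoked;
}
```

The model also hints at why the panic path is harder: if some entries could only be cleared from the CPU that owns them, a loop running on one CPU could not reach them all.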

CPU Management

Hyper-V does not have the ability to hot-add or hot-remove a CPU from a running VM. However, Windows Server 2019 Hyper-V and earlier versions may provide guests with ACPI tables that indicate more CPUs than are actually present in the VM. As is normal, Linux treats these additional CPUs as potential hot-add CPUs, and reports them as such even though Hyper-V will never actually hot-add them. Starting in Windows Server 2022 Hyper-V, the ACPI tables reflect only the CPUs actually present in the VM, so Linux does not report any hot-add CPUs.

A Linux guest CPU may be taken offline using the normal Linux mechanisms, provided no VMbus channel interrupts are assigned to the CPU. See the section on VMbus Interrupts for more details on how VMbus channel interrupts can be re-assigned to permit taking a CPU offline.

32-bit and 64-bit

On x86/x64, Hyper-V supports 32-bit and 64-bit guests, and Linux will build and run in either version. While the 32-bit version is expected to work, it is used rarely and may suffer from undetected regressions.

On arm64, Hyper-V supports only 64-bit guests.

Endian-ness

All communication between Hyper-V and guest VMs uses little-endian format on both x86/x64 and arm64. Big-endian format on arm64 is not supported by Hyper-V, and Linux code does not use endian-ness macros when accessing data shared with Hyper-V.

Versioning

Current Linux kernels operate correctly with older versions of Hyper-V back to Windows Server 2012 Hyper-V. Support for running on the original Hyper-V release in Windows Server 2008/2008 R2 has been removed.

A Linux guest on Hyper-V outputs in dmesg the version of Hyper-V it is running on. This version is in the form of a Windows build number and is for display purposes only. Linux code does not test this version number at runtime to determine available features and functionality. Hyper-V indicates feature/function availability via flags in synthetic MSRs that Hyper-V provides to the guest, and the guest code tests these flags.
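
The difference between testing a version number and testing feature flags can be shown with a minimal sketch. The flag names, the struct, and its fields are hypothetical placeholders invented for this example. They are not the kernel's definitions or the bit assignments in the TLFS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define SKETCH_FEAT_SYNTH_TIMER (1u << 0)
#define SKETCH_FEAT_REF_COUNTER (1u << 1)
#define SKETCH_FEAT_HYPERCALL   (1u << 2)

/* Hypothetical summary of what the guest learned from Hyper-V at boot. */
struct sketch_hv_info {
    uint32_t build_number; /* display only, never used for decisions */
    uint32_t features;     /* bitmask of available features */
};

/* Decide on a feature by testing its flag, not by comparing versions. */
static inline int sketch_has_feature(const struct sketch_hv_info *info,
                                     uint32_t flag)
{
    assert(info != NULL);
    return (info->features & flag) == flag;
}
```

A caller would gate a code path on sketch_has_feature(&info, SKETCH_FEAT_SYNTH_TIMER) and would never branch on build_number.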

VMbus has its own protocol version that is negotiated during the initial VMbus connection from the guest to Hyper-V. This version number is also output to dmesg during boot. This version number is checked in a few places in the code to determine if specific functionality is present.
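
One plausible shape for such a negotiation is sketched below as a userspace model. It rests on assumptions that this document does not state: the guest offers versions from newest to oldest, the host accepts or rejects each offer, and a version is encoded as a major/minor pair in one integer. All names (SKETCH_VERSION, sketch_negotiate, sketch_old_host, and so on) are hypothetical. The real behavior can be discerned only from the VMbus code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: major version in the upper 16 bits, minor in the lower. */
#define SKETCH_VERSION(major, minor) \
    (((uint32_t)(major) << 16) | (uint32_t)(minor))

/* Hypothetical list of versions the guest is willing to use, newest first. */
static const uint32_t sketch_offers[] = {
    SKETCH_VERSION(5, 0),
    SKETCH_VERSION(4, 0),
    SKETCH_VERSION(3, 0),
};

/* Stand-in for the host's accept/reject reply to one offered version. */
typedef int (*sketch_host_accepts_fn)(uint32_t version);

/* Example host model that understands nothing newer than version 4.0. */
static inline int sketch_old_host(uint32_t version)
{
    return version <= SKETCH_VERSION(4, 0);
}

/* Offer each version in order; return the first accepted one, or 0 if none. */
static inline uint32_t sketch_negotiate(const uint32_t *versions, size_t count,
                                        sketch_host_accepts_fn host_accepts)
{
    size_t i;

    assert(versions != NULL && host_accepts != NULL);
    for (i = 0; i < count; i++) {
        if (host_accepts(versions[i]))
            return versions[i];
    }
    return 0;
}

/* Later code can gate functionality on the negotiated version. */
static inline int sketch_version_at_least(uint32_t negotiated, uint32_t required)
{
    return negotiated >= required;
}
```

Against sketch_old_host, the 5.0 offer is rejected and 4.0 is selected, so a check requiring 5.0 fails while a check requiring 3.0 succeeds.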

Furthermore, each synthetic device on VMbus also has a protocol version that is separate from the VMbus protocol version. Device drivers for these synthetic devices typically negotiate the device protocol version, and may test that protocol version to determine if specific device functionality is present.

Code Packaging

Hyper-V related code appears in the Linux kernel code tree in three main areas:

  1. drivers/hv

  2. arch/x86/hyperv and arch/arm64/hyperv

  3. individual device driver areas such as drivers/scsi, drivers/net, drivers/clocksource, etc.

A few miscellaneous files appear elsewhere. See the full list under "Hyper-V/Azure CORE AND DRIVERS" and "DRM DRIVER FOR HYPERV SYNTHETIC VIDEO DEVICE" in the MAINTAINERS file.

The code in #1 and #2 is built only when CONFIG_HYPERV is set. Similarly, the code for most Hyper-V related drivers is built only when CONFIG_HYPERV is set.

Most Hyper-V related code in #1 and #3 can be built as a module. The architecture specific code in #2 must be built-in. Also, drivers/hv/hv_common.c is low-level code that is common across architectures and must be built-in.