This patch adds a _PXM method to ACPI CPU objects for the pc machine. The _PXM value is derived from the passed-in guest info, in the same way as the CPU SRAT entries.
The motivation for this patch is a CPU hot-unplug/hot-plug bug observed when using a 3.11 linux guest kernel on a multi-NUMA node qemu/kvm VM. The linux guest kernel parses the SRAT CPU entries at boot time and stores them in the array __apicid_to_node. When a CPU is hot-removed, the linux guest kernel resets the removed CPU's __apicid_to_node entry to NO_NUMA_NODE (kernel commit c4c60524). When the removed cpu is hot-added again, the linux kernel looks up the hot-added cpu object's _PXM method instead of somehow re-discovering the SRAT entry info. With current qemu/seabios, the _PXM method is not found, and the CPU is thus hot-plugged into the default NUMA node 0. (The problem does not show up on the initial hotplug of a cpu: the _PXM method is still not found in this case, but the kernel still has the correct proximity value from the CPU's SRAT entry stored in __apicid_to_node.)
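The sequence above can be sketched as a small toy model. This is not kernel code: the table name mirrors __apicid_to_node, but MAX_APICS, the function names, and the identity mapping from proximity domain to node are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_APICS    8
#define NO_NUMA_NODE (-1)   /* stand-in for the kernel's "unknown node" marker */

/* Toy model of the guest's apicid -> node table (hypothetical, not kernel code). */
static int apicid_to_node[MAX_APICS];

/* Boot: record the proximity domain of each CPU from the modelled SRAT entries. */
static void boot_parse_srat(const int *srat_node, int ncpus)
{
    assert(ncpus <= MAX_APICS);
    for (int i = 0; i < MAX_APICS; i++) {
        apicid_to_node[i] = i < ncpus ? srat_node[i] : NO_NUMA_NODE;
    }
}

/* Hot-remove: the entry is reset, so the SRAT-derived value is lost. */
static void cpu_hot_remove(int apicid)
{
    apicid_to_node[apicid] = NO_NUMA_NODE;
}

/*
 * Hot-add: use _PXM if the firmware provides it. Without _PXM, keep the
 * stored value if there is one, otherwise fall back to the default node 0.
 * Returns the node the CPU ends up in.
 */
static int cpu_hot_add(int apicid, bool has_pxm, int pxm)
{
    if (has_pxm) {
        apicid_to_node[apicid] = pxm;
    } else if (apicid_to_node[apicid] == NO_NUMA_NODE) {
        apicid_to_node[apicid] = 0;
    }
    return apicid_to_node[apicid];
}
```

In this model, a first hot-add without _PXM still lands on the SRAT node, a hot-add after a hot-remove without _PXM lands on node 0, and a hot-add after a hot-remove with _PXM lands on the node that _PXM reports.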
The ACPI spec mentions that the _PXM method is the correct way to determine proximity information at hot-add time. So far, qemu/seabios do not provide this method for CPUs. So regardless of kernel behaviour, it is a good idea to add this _PXM method. Since ACPI table generation has recently been moved from seabios to qemu, we do this in qemu.
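As a rough illustration of the approach taken in qemu, the per-CPU AML is stamped out from a compiled template whose placeholder byte is overwritten with the CPU's node. The sketch below is a simplified stand-in, not the actual build_ssdt() code: the 7-byte template holds only the encoding of Name(_PXM, 0xAA), and tmpl, TMPL_PXM_OFFSET, and build_cpu_pxm are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PLACEHOLDER 0xAA

/* AML encoding of Name(_PXM, 0xAA): NameOp, "_PXM", BytePrefix, value. */
static const uint8_t tmpl[] = { 0x08, '_', 'P', 'X', 'M', 0x0A, PLACEHOLDER };

/* Offset of the placeholder byte inside the template (hypothetical constant;
 * in the patch the offset comes from the ACPI_EXTRACT_NAME_BYTE_CONST marker). */
#define TMPL_PXM_OFFSET 6

/*
 * Emit one copy of the template per CPU into out, replacing the placeholder
 * with that CPU's NUMA node. out must hold ncpus * sizeof(tmpl) bytes.
 */
static void build_cpu_pxm(uint8_t *out, const uint8_t *node_cpu, size_t ncpus)
{
    for (size_t i = 0; i < ncpus; i++) {
        uint8_t *proc = out + i * sizeof(tmpl);

        memcpy(proc, tmpl, sizeof(tmpl));
        assert(proc[TMPL_PXM_OFFSET] == PLACEHOLDER);
        proc[TMPL_PXM_OFFSET] = node_cpu[i];
    }
}
```

Patching a single byte keeps the template fixed in size, which is why the placeholder is a byte constant; this sketch therefore only covers node numbers that fit in one byte.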
Note that the above hot-remove/hot-add scenario has been tested on an older qemu + non-upstreamed patches for cpu hot-removal support, and not on qemu master (since cpu-del support is still not on master). The only testing done with qemu/seabios master and this patch consists of successful boots of multi-node linux and windows8 guests.
For the initial discussion on seabios and linux-acpi lists see http://www.spinics.net/lists/linux-acpi/msg47058.html
Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Reviewed-by: Thilo Fromm <t-lo@thilo-fromm.de>
---
 hw/i386/acpi-build.c  | 2 ++
 hw/i386/ssdt-proc.dsl | 2 ++
 2 files changed, 4 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 6cfa044..9373f5e 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -603,6 +603,7 @@ static inline char acpi_get_hex(uint32_t val)
 #define ACPI_PROC_OFFSET_CPUHEX (*ssdt_proc_name - *ssdt_proc_start + 2)
 #define ACPI_PROC_OFFSET_CPUID1 (*ssdt_proc_name - *ssdt_proc_start + 4)
 #define ACPI_PROC_OFFSET_CPUID2 (*ssdt_proc_id - *ssdt_proc_start)
+#define ACPI_PROC_OFFSET_CPUPXM (*ssdt_proc_pxm - *ssdt_proc_start)
 #define ACPI_PROC_SIZEOF (*ssdt_proc_end - *ssdt_proc_start)
 #define ACPI_PROC_AML (ssdp_proc_aml + *ssdt_proc_start)
 
@@ -724,6 +725,7 @@ build_ssdt(GArray *table_data, GArray *linker,
             proc[ACPI_PROC_OFFSET_CPUHEX+1] = acpi_get_hex(i);
             proc[ACPI_PROC_OFFSET_CPUID1] = i;
             proc[ACPI_PROC_OFFSET_CPUID2] = i;
+            proc[ACPI_PROC_OFFSET_CPUPXM] = guest_info->node_cpu[i];
         }
 
         /* build this code:
diff --git a/hw/i386/ssdt-proc.dsl b/hw/i386/ssdt-proc.dsl
index 8229bfd..7eef8b2 100644
--- a/hw/i386/ssdt-proc.dsl
+++ b/hw/i386/ssdt-proc.dsl
@@ -47,6 +47,8 @@ DefinitionBlock ("ssdt-proc.aml", "SSDT", 0x01, "BXPC", "BXSSDT", 0x1)
  * also updating the C code.
  */
         Name(_HID, "ACPI0007")
+        ACPI_EXTRACT_NAME_BYTE_CONST ssdt_proc_pxm
+        Name(_PXM, 0xAA)
         External(CPMA, MethodObj)
         External(CPST, MethodObj)
         External(CPEJ, MethodObj)