On Wed, 27 Nov 2013 14:02:43 +0100 Vasilis Liaskovitis vasilis.liaskovitis@profitbricks.com wrote:
This patch adds a _PXM method to ACPI CPU objects for the pc machine. The _PXM value is derived from the passed in guest info, same way as CPU SRAT entries.
Currently, CPU SRAT entries are only enabled for cpus that are already present in the system. The SRAT entries for hotpluggable processors are disabled (flags bit 0 set to 0 in hw/i385/acpi-build.c:build_srat). Section 5.2.16.1 of ACPI spec mentions "If the Local APIC ID of a dynamically added processor is not present in the SRAT, a _PXM object must exist for the processor’s device or one of its ancestors in the ACPI Namespace." Since SRAT entries are not available for the hot-pluggable processors, a _PXM method must exist for them. Otherwise, the CPU is hot-added in the wrong NUMA node (default node 0).
Even if CPU SRAT entries are enabled, _PXM method is what the linux kernel consults on hot-add time. Section 17.2.1 of ACPI spec mentions " OSPM will consume the SRAT only at boot time. OSPM should use _PXM for any devices that are hot-added into the system after boot up." To be more precise if SRAT information is available to the guest kernel, it is used. However, parsed SRAT info is reset and lost after hot-remove operations, see kernel commit c4c60524. This means that on a hot-unplug / hot-replug scenario, and without a _PXM method, the kernel may put a CPU on different nodes because SRAT info has been reset by a previous hot-remove operation.
The above hot-remove/hot-add scenario has been tested on master, plus cpu-del patches from: https://lists.gnu.org/archive/html/qemu-devel/2013-10/msg01085.html With the curret _PXM patch, hot-added CPUs are always placed into the correct NUMA node, regardless of kernel behaviour.
v1->v2: Make method return a DWORD integer Tested on qemu master + cpu-del patches
Signed-off-by: Vasilis Liaskovitis vasilis.liaskovitis@profitbricks.com Reviewed-by: Thilo Fromm t-lo@thilo-fromm.de
Patch looks good, Please add patch to update hw/i386/ssdt-proc.hex.generated for hosts without iasl for completness
hw/i386/acpi-build.c | 5 +++++ hw/i386/ssdt-proc.dsl | 5 +++++ 2 files changed, 10 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index d089e1e..3c11ddc 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -603,6 +603,7 @@ static inline char acpi_get_hex(uint32_t val) #define ACPI_PROC_OFFSET_CPUHEX (*ssdt_proc_name - *ssdt_proc_start + 2) #define ACPI_PROC_OFFSET_CPUID1 (*ssdt_proc_name - *ssdt_proc_start + 4) #define ACPI_PROC_OFFSET_CPUID2 (*ssdt_proc_id - *ssdt_proc_start) +#define ACPI_PROC_OFFSET_CPUPXM (*ssdt_proc_pxm - *ssdt_proc_start) #define ACPI_PROC_SIZEOF (*ssdt_proc_end - *ssdt_proc_start) #define ACPI_PROC_AML (ssdp_proc_aml + *ssdt_proc_start)
@@ -724,6 +725,10 @@ build_ssdt(GArray *table_data, GArray *linker, proc[ACPI_PROC_OFFSET_CPUHEX+1] = acpi_get_hex(i); proc[ACPI_PROC_OFFSET_CPUID1] = i; proc[ACPI_PROC_OFFSET_CPUID2] = i;
proc[ACPI_PROC_OFFSET_CPUPXM] = guest_info->node_cpu[i];
proc[ACPI_PROC_OFFSET_CPUPXM + 1] = 0;
proc[ACPI_PROC_OFFSET_CPUPXM + 2] = 0;
proc[ACPI_PROC_OFFSET_CPUPXM + 3] = 0; } /* build this code:
diff --git a/hw/i386/ssdt-proc.dsl b/hw/i386/ssdt-proc.dsl index 8229bfd..8d4c5bf 100644 --- a/hw/i386/ssdt-proc.dsl +++ b/hw/i386/ssdt-proc.dsl @@ -47,6 +47,8 @@ DefinitionBlock ("ssdt-proc.aml", "SSDT", 0x01, "BXPC", "BXSSDT", 0x1)
- also updating the C code.
*/ Name(_HID, "ACPI0007")
ACPI_EXTRACT_NAME_DWORD_CONST ssdt_proc_pxm
Name(PXM, 0xAAAAAAAA) External(CPMA, MethodObj) External(CPST, MethodObj) External(CPEJ, MethodObj)
@@ -59,5 +61,8 @@ DefinitionBlock ("ssdt-proc.aml", "SSDT", 0x01, "BXPC", "BXSSDT", 0x1) Method(_EJ0, 1, NotSerialized) { CPEJ(ID, Arg0) }
Method(_PXM, 0) {
Return (PXM)
}}
}