This is v3 of the ACPI memory hotplug functionality. Only the x86_64 target is supported for now.
Overview:
The dimm device layout is modeled with a new qemu command line option:
"-dimm id=name,size=sz,node=pxm,populated=on|off"
The starting physical address for all dimms is calculated automatically from the top of memory, skipping the pci hole at [PCI_HOLE_START, 4G).
"node" defines the numa proximity domain for this dimm. When not specified, it defaults to zero.
For example, "-dimm id=dimm0,size=512M,node=0,populated=off" defines a 512M memory slot belonging to numa node 0.
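The option parsing and automatic address assignment described above can be sketched roughly as follows. This is an illustration only, not code from the series: the helper names are made up, the value used for PCI_HOLE_START is an assumption (the cover letter does not state it), the default of "populated=off" is assumed, and the rule that a dimm which would overlap the hole is moved whole to 4G is a guess at the behaviour.

```python
import re

GiB = 1 << 30
# Assumption: PCI_HOLE_START is not given in this cover letter; 0xe0000000
# is used here purely as an illustrative value.
PCI_HOLE_START = 0xE0000000
PCI_HOLE_END = 4 * GiB

SUFFIX = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '512M' into bytes."""
    m = re.fullmatch(r"(\d+)([KMG]?)", text)
    if m is None:
        raise ValueError("bad size: %r" % text)
    return int(m.group(1)) * SUFFIX[m.group(2)]


def parse_dimm(optarg):
    """Parse 'id=name,size=sz,node=pxm,populated=on|off' into a dict."""
    opts = dict(item.split("=", 1) for item in optarg.split(","))
    return {
        "id": opts["id"],
        "size": parse_size(opts["size"]),
        "node": int(opts.get("node", 0)),                   # defaults to zero
        "populated": opts.get("populated", "off") == "on",  # assumed default
    }


def layout_dimms(ram_size, dimms, hole_start=PCI_HOLE_START):
    """Assign [start, end) ranges from top of memory, skipping the pci hole."""
    if ram_size <= hole_start:
        next_addr = ram_size
    else:
        # Initial RAM that does not fit below the hole continues above 4G.
        next_addr = PCI_HOLE_END + (ram_size - hole_start)
    placed = []
    for d in dimms:
        # A dimm that would start inside or run into the hole moves to 4G.
        if next_addr < PCI_HOLE_END and next_addr + d["size"] > hole_start:
            next_addr = PCI_HOLE_END
        placed.append((d["id"], next_addr, next_addr + d["size"]))
        next_addr += d["size"]
    return placed
```

With these assumptions, a guest with 2G of initial memory and one 256M dimm gets the dimm at 0x80000000 - 0x90000000, and a 1G dimm on a 3G guest is placed at 4G because it would otherwise run into the hole.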
Dimms are added or removed with the normal device_add and device_del operations:
Hot-add syntax: "device_add dimm,id=mydimm0"
Hot-remove syntax: "device_del dimm,id=mydimm0"
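For management tools that drive the monitor over QMP, the same hot operations might be issued roughly as in the sketch below. The message shapes follow the generic QMP device_add and device_del commands (device_del there takes only the device id). The "dimm" driver name comes from this series, and the helper function names are hypothetical.

```python
import json


def qmp_device_add(driver, dev_id):
    """Build the QMP message for hot-adding a device."""
    return json.dumps({
        "execute": "device_add",
        "arguments": {"driver": driver, "id": dev_id},
    })


def qmp_device_del(dev_id):
    """Build the QMP message for hot-removing a device by id."""
    return json.dumps({
        "execute": "device_del",
        "arguments": {"id": dev_id},
    })


# Hot-add and then hot-remove the dimm named in the examples above.
add_msg = qmp_device_add("dimm", "mydimm0")
del_msg = qmp_device_del("mydimm0")
```

The resulting strings would be written to an already negotiated QMP socket; connection handling is left out to keep the sketch short.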
Changes v2->v3:
- qdev integration. Dimms are attached to a dimmbus. The dimmbus is a child of the i440fx device in the pc machine. Hot-add and hot-remove are done with normal device_add / device_del operations on the dimmbus. The "dimm_add" and "dimm_del" commands introduced in v2 are obsolete. (In previous versions, dimms were always present in the qdev tree, and dimm_add/del simply meant allocating or deallocating memory for the devices. This version actually does hot-operations on the qdev tree.)
- Add _PS3 method to allow OSPM-induced hot operations.
- pci-window calculation in Seabios takes dimms into account (for both 32-bit and 64-bit windows)
- rename new qmp commands: query-memory-total and query-memory-hotplug
- balloon driver can see the hotplugged memory
Changes v1->v2:
- memory map is automatically calculated for hotplug dimms. Dimms are added from top-of-memory skipping the pci hole at [PCI_HOLE_START, 4G).
- Renamed from "-memslot" to "-dimm". Commands changed to "dimm_add", "dimm_del".
- Seabios ejection array reduced to a byte. Use extraction macros for dimm ssdt.
- additional SRAT paravirt info does not break previous SRAT fw_cfg layout.
- Documentation of new acpi_piix4 registers and paravirt data.
- add ACPI _OST support for _OST enabled guests. This allows qemu to receive notification for success / failure of memory hot-add and hot-remove operations. Guest needs to support _OST (https://lkml.org/lkml/2012/6/25/321)
- add monitor info command to report total guest memory (initial + hot-added)
- add command line options and monitor commands for batch dimm creation/population (obsolete from v3 onwards)
Issues:
- A major blocker is windows guest functionality. The patchset currently does not work for windows guests. My guess is that the windows pnpmem driver does not like the seabios dimm device implementation (or the seabios dimm implementation is not fully ACPI-compliant). If someone can review the seabios patches or has any ideas on how to debug this, let me know.
Testing on win2012 server RC or windows2008 consumer prerelease: when adding a DIMM, the device shows up in DeviceManager but does not work. Relevant messages:
"
This device cannot start. (Code 10)
Device configured(memory.inf) (UserPnP eventID 400)
Device installed (memory.inf)
ACPI/PNP0C80\2&daba3ff&1 was configured
Device not started(PNPMEM) (Kernel-PnP eventID 411, kernelID)
Device ACPI\PNP0C80\2&daba3ff&1 had a problem starting
Driver Name: memory.inf (c:\Windows\system32\DRIVERS\pnpmem.sys 6.2.8400 winmain_win8rc))
Memory range:0x80000000 - 0x90000000 (Initial memory of VM is 2GB. The hotplugged DIMM was 256MB with physical address range starting at 2GB)
Conflicting device list: No conflicts.
"
Adding a second (or any further) dimm causes a crash (PNP_DETECTED_FATAL_ERROR with blue screen of death) and makes windows reboot. After this, the VM keeps rebooting with ACPI_BIOS_ERROR. The VM no longer boots once a second (or further) extra dimm has been plugged in.
- Is the dimmbus the correct way to go about integrating into qdev/qom? In a v1 comment, Anthony mentioned attaching dimms directly to an i440fx device as children. Is this possible without a bus?
- Live migration works as long as the dimm layout (-dimm command line args) is identical on the source and destination qemu command lines. Patch 10/19 creates the DimmDevice that corresponds to the unknown incoming ramblock. Ramblocks are migrated before qdev VMStates are migrated (the DimmDevice structure currently does not define a VMStateDescription), so the DimmDevice is handled differently from other devices. If this is not acceptable, any suggestions on how it should be reworked?
- Hot-operation notification lists need to be added to migration state.
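Regarding the live migration requirement above (identical -dimm arguments on both sides), a management script could do a strict pre-check along the lines of the sketch below. This is not part of the series. The helper names are hypothetical, and the comparison is deliberately conservative: it assumes the order of the -dimm options matters (addresses are assigned automatically) and treats any textual difference as a mismatch.

```python
import shlex


def dimm_args(cmdline):
    """Return the values of all -dimm options, in command line order."""
    argv = shlex.split(cmdline)
    return [argv[i + 1] for i, arg in enumerate(argv[:-1]) if arg == "-dimm"]


def same_dimm_layout(src_cmdline, dst_cmdline):
    """True if both command lines carry identical -dimm options."""
    return dimm_args(src_cmdline) == dimm_args(dst_cmdline)


# Example command lines; everything except the -dimm syntax is illustrative.
src = "qemu-system-x86_64 -m 2G -dimm id=dimm0,size=512M,node=0,populated=off"
dst = src + " -incoming tcp:0:4444"
layout_ok = same_dimm_layout(src, dst)
```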
Please review. Could people state which other issues they consider blockers for including this upstream?
Does this patchset need to wait for 1.4, or could it be considered for 1.3 (assuming blockers are resolved)? So far the patchset has been revised every few months, but I will provide quicker version updates from now on. I can also bring this up on a weekly meeting agenda if needed.
The series is based on uq/master for qemu-kvm, and master for seabios. It can also be found at:
http://github.com/vliaskov/qemu-kvm/commits/memhp-v3
http://github.com/vliaskov/seabios/commits/memhp-v3
Vasilis Liaskovitis (12):
  Implement dimm device abstraction
  Implement "-dimm" command line option
  acpi_piix4: Implement memory device hotplug registers
  pc: calculate dimm physical addresses and adjust memory map
  pc: Add dimm paravirt SRAT info
  fix live-migration when "populated=on" is missing
  Implement qmp and hmp commands for notification lists
  Implement "info memory-total" and "query-memory-total"
  balloon: update with hotplugged memory
  Add _OST dimm support
  Update dimm state on reset
  Implement _PS3 for dimm

 arch_init.c                 |   24 ++-
 docs/specs/acpi_hotplug.txt |   54 ++++++
 docs/specs/fwcfg.txt        |   28 +++
 hmp-commands.hx             |    4 +
 hmp.c                       |   24 +++
 hmp.h                       |    2 +
 hw/Makefile.objs            |    2 +-
 hw/acpi_piix4.c             |  114 +++++++++++-
 hw/dimm.c                   |  435 +++++++++++++++++++++++++++++++++++++++++++
 hw/dimm.h                   |  101 ++++++++++
 hw/pc.c                     |   55 ++++++-
 hw/pc.h                     |    6 +
 hw/pc_piix.c                |   20 ++-
 hw/virtio-balloon.c         |   13 +-
 monitor.c                   |   14 ++
 qapi-schema.json            |   37 ++++
 qemu-config.c               |   25 +++
 qemu-options.hx             |    5 +
 qmp-commands.hx             |   57 ++++++
 sysemu.h                    |    1 +
 vl.c                        |   51 +++++
 21 files changed, 1051 insertions(+), 21 deletions(-)
 create mode 100644 docs/specs/acpi_hotplug.txt
 create mode 100644 docs/specs/fwcfg.txt
 create mode 100644 hw/dimm.c
 create mode 100644 hw/dimm.h
Vasilis Liaskovitis (7):
  Add ACPI_EXTRACT_DEVICE* macros
  Add SSDT memory device support
  acpi-dsdt: Implement functions for memory hotplug
  acpi: generate hotplug memory devices
  Add _OST dimm method
  Implement _PS3 method for memory device
  Calculate pcimem_start and pcimem64_start from SRAT entries

 Makefile              |    2 +-
 src/acpi-dsdt.dsl     |  135 ++++++++++++++++++++++++++++++-
 src/acpi.c            |  216 ++++++++++++++++++++++++++++++++++++++++++++----
 src/acpi.h            |    3 +
 src/pciinit.c         |    6 +-
 src/post.c            |    3 +
 src/smp.c             |    4 +
 src/ssdt-mem.dsl      |   73 +++++++++++++++++
 tools/acpi_extract.py |   28 +++++++
 9 files changed, 447 insertions(+), 23 deletions(-)
 create mode 100644 src/ssdt-mem.dsl