This is v4 of the ACPI memory hotplug functionality. Only the x86_64 target is supported (both i440fx and q35). There are still several issues, but it's been a while since v3 and I wanted to get some more feedback on the current state of the patch series.
Overview:
Dimm device layout is modeled with a normal qemu device:
"-device dimm,id=name,size=sz,node=pxm,populated=on|off,bus=membus.0"
The starting physical address for all dimms is calculated from top of memory, during memory controller init, skipping the pci hole at [PCI_HOLE_START, 4G). e.g.
"-device dimm,id=dimm0,size=512M,node=0,populated=off,bus=membus.0"
will define a 512M memory dimm belonging to numa node 0, on bus membus.0.
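As a rough illustration of the address calculation described above, here is a minimal Python sketch. It is not the code from hw/dimm.c: the function name layout_dimms, the example value used for PCI_HOLE_START, and the policy of moving a dimm that would overlap the hole entirely above 4G are all assumptions made for the example.

```python
# Hypothetical sketch of the dimm layout calculation; not taken from the patches.
MiB = 1 << 20
GiB = 1 << 30
FOUR_GIB = 4 * GiB


def layout_dimms(top_of_memory, pci_hole_start, dimm_sizes):
    """Return a list of (start_address, size), one entry per dimm.

    Dimms are placed one after another starting at top_of_memory.
    Assumption: a dimm that would overlap [pci_hole_start, 4G) is
    moved as a whole to 4G, so no dimm ever straddles the pci hole.
    """
    placements = []
    addr = top_of_memory
    for size in dimm_sizes:
        # Still below 4G but the dimm would reach into the hole: skip the hole.
        if addr < FOUR_GIB and addr + size > pci_hole_start:
            addr = FOUR_GIB
        placements.append((addr, size))
        addr += size
    return placements


if __name__ == "__main__":
    # Assumed values: 2G of initial RAM, pci hole starting at 0xe0000000.
    for start, size in layout_dimms(2 * GiB, 0xE0000000,
                                    [512 * MiB, 1 * GiB, 2 * GiB]):
        print(f"start={start:#x} size={size // MiB}M")
```

With these assumed inputs the first two dimms fit below the hole and the third one starts at 4G.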
Because the dimm layout needs to be configured at machine boot, all dimm devices need to be specified on the startup command line (either with populated=on or with populated=off). The dimm information is stored in dimm configuration structures.
After machine startup, dimms are hot-added or removed with normal device_add and device_del operations e.g.:
Hot-add syntax: "device_add dimm,id=mydimm0,bus=membus.0"
Hot-remove syntax: "device_del dimm,id=mydimm0"
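For management tools that talk QMP rather than the human monitor, the same hot operations could be expressed roughly as in the Python sketch below. The helper names are made up for the example. The message shapes follow the generic QMP device_add (driver, id, plus device properties) and device_del (id) commands; the "dimm" driver and the "membus.0" bus only exist with this series applied.

```python
import json


def qmp_device_add(dimm_id, bus="membus.0"):
    """Build a QMP device_add message for a dimm (hypothetical helper)."""
    return {
        "execute": "device_add",
        "arguments": {"driver": "dimm", "id": dimm_id, "bus": bus},
    }


def qmp_device_del(dimm_id):
    """Build a QMP device_del message; the device is identified by id only."""
    return {"execute": "device_del", "arguments": {"id": dimm_id}}


if __name__ == "__main__":
    # Each message would be sent as one JSON object over the QMP socket.
    print(json.dumps(qmp_device_add("mydimm0")))
    print(json.dumps(qmp_device_del("mydimm0")))
```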
Changes v3->v4
- Dimms added with normal -device argument (extra -dimm arg dropped).
- multiple memory buses can be registered. Memory buses of the real hw/chipset or a paravirtual memory bus can be added.
- acpi implementation uses memory API instead of old ioports.
- Support for q35/ich9 added (still buggy, see patch 12/31).
- piix4/i440fx initialization code has been refactored to resemble q35. This will allow memory map initialization at chipset qdev init time for both machines, as well as more similar code.
- Hot-remove functionality has been moved to separate patches. Hot-remove no longer frees memory but unmaps the dimm/qdev device from the guest's view. Freeing the memory should happen when the last user unrefs/unmaps the memory, see also (work in progress):
  https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg00728.html
  https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02697.html
- new qmp/hmp command for the state of each dimm (on/off)
Changes v2->v3
- qdev integration. Dimms are attached to a dimmbus. The dimmbus is a child of i440fx device in the pc machine. Hot-add and remove are done with normal device_add / device_del operations on the dimmbus. New commands "dimm_add" and "dimm_del" are obsolete.
- Add _PS3 method to allow OSPM-induced hot operations.
- pci-window calculation in Seabios takes dimms into account (for both 32-bit and 64-bit windows)
- rename new qmp commands: query-memory-total and query-memory-hotplug
- balloon driver can see the hotplugged memory
Changes v1->v2
- memory map is automatically calculated for hotplug dimms. Dimms are added from top-of-memory skipping the pci hole at [PCI_HOLE_START, 4G).
- Renamed from "-memslot" to "-dimm". Commands changed to "dimm_add", "dimm_del"
- Seabios ejection array reduced to a byte. Use extraction macros for dimm ssdt.
- additional SRAT paravirt info does not break previous SRAT fw_cfg layout.
- Documentation of new acpi_piix4 registers and paravirt data.
- add ACPI _OST support for _OST enabled guests. This allows qemu to receive notification for success / failure of memory hot-add and hot-remove operations. Guest needs to support _OST (https://lkml.org/lkml/2012/6/25/321)
- add monitor info command to report total guest memory (initial + hot-added)
Issues:
- hot-remove needs to only unmap the dimm device from guest's view. Freeing the memory should happen when the last user of the device (e.g. virtio-blk) unrefs the device. A testcase is needed for this.
- Live Migration: Ramblocks are migrated before qdev VMStates are migrated. So the DimmDevice is handled differently than other devices. Should this be reworked? (The DimmDevice structure currently does not define a VMStateDescription.) Live migration works as long as the dimm layout (command line args) is identical on the source and destination qemu command lines, and the destination takes into account hot-operations that have occurred on the source. (v3 patch 10/19 created the DimmDevice that corresponds to an unknown incoming ramblock, e.g. for a dimm that was hot-added on the source, but it has been dropped for the moment.)
- A major blocker is windows guest functionality. The patchset currently does not work for windows. Testing on win2012 server RC or windows2008 consumer prerelease, when adding a DIMM, there is a BSOD with an ACPI_BIOS_ERROR message. After this, the VM keeps rebooting with ACPI_BIOS_ERROR. The windows pnpmem driver obviously has a problem with the seabios dimm implementation (or the seabios dimm implementation is not fully ACPI-compliant). If someone can review the seabios patches or has any ideas on how to debug this, let me know.
- hot-operation notification lists need to be added to migration state.
series is based on:
- qemu master (commit a8a826a3) + patch: https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02699.html
- seabios master (commit a810e4e7)
Can also be found at:
http://github.com/vliaskov/qemu-kvm/commits/memhp-v4
http://github.com/vliaskov/seabios/commits/memhp-v4
Vasilis Liaskovitis (21):
  qapi: make visit_type_size fallback to type_int
  Add SIZE type to qdev properties
  qemu-option: export parse_option_number
  Implement dimm device abstraction
  vl: handle "-device dimm"
  acpi_piix4 : Implement memory device hotplug registers
  acpi_ich9 : Implement memory device hotplug registers
  piix_pci and pc_piix: refactor
  piix_pci: Add i440fx dram controller initialization
  q35: Add i440fx dram controller initialization
  pc: Add dimm paravirt SRAT info
  Introduce paravirt interface QEMU_CFG_PCI_WINDOW
  Implement "info memory-total" and "query-memory-total"
  balloon: update with hotplugged memory
  Implement dimm-info
  dimm: add hot-remove capability
  acpi_piix4: add hot-remove capability
  acpi_ich9: add hot-remove capability
  Implement qmp and hmp commands for notification lists
  Add _OST dimm support
  Implement _PS3 for dimm
 docs/specs/acpi_hotplug.txt |  54 ++++++
 docs/specs/fwcfg.txt        |  28 +++
 hmp-commands.hx             |   6 +
 hmp.c                       |  41 ++++
 hmp.h                       |   3 +
 hw/Makefile.objs            |   2 +-
 hw/acpi.h                   |   5 +
 hw/acpi_ich9.c              | 115 +++++++++++-
 hw/acpi_ich9.h              |  12 +-
 hw/acpi_piix4.c             | 126 ++++++++++++-
 hw/dimm.c                   | 444 +++++++++++++++++++++++++++++++++++++++++++
 hw/dimm.h                   | 102 ++++++++++
 hw/fw_cfg.h                 |   1 +
 hw/lpc_ich9.c               |   2 +-
 hw/pc.c                     |  28 +++-
 hw/pc.h                     |   1 +
 hw/pc_piix.c                |  74 ++++++--
 hw/pc_q35.c                 |  18 ++-
 hw/piix_pci.c               | 249 ++++++++-----------------
 hw/q35.c                    |  27 +++
 hw/q35.h                    |   5 +
 hw/qdev-properties.c        |  60 ++++++
 hw/qdev-properties.h        |   3 +
 hw/virtio-balloon.c         |  13 +-
 monitor.c                   |  21 ++
 qapi-schema.json            |  63 ++++++
 qapi/qapi-visit-core.c      |  11 +-
 qemu-option.c               |   4 +-
 qemu-option.h               |   4 +
 qmp-commands.hx             |  57 ++++++
 sysemu.h                    |   1 +
 vl.c                        |  60 ++++++
 32 files changed, 1432 insertions(+), 208 deletions(-)
 create mode 100644 docs/specs/acpi_hotplug.txt
 create mode 100644 docs/specs/fwcfg.txt
 create mode 100644 hw/dimm.c
 create mode 100644 hw/dimm.h
Vasilis Liaskovitis (9):
  Add ACPI_EXTRACT_DEVICE* macros
  Add SSDT memory device support
  acpi-dsdt: Implement functions for memory hotplug
  acpi: generate hotplug memory devices
  q35: Add memory hotplug handler
  pci: Use paravirt interface for pcimem_start and pcimem64_start
  acpi: add _EJ0 operation and eject port for memory devices
  Add _OST dimm method
  Implement _PS3 method for memory device
 Makefile                      |   2 +-
 src/acpi-dsdt-mem-hotplug.dsl | 136 +++++++++++++++++++++++++++++++++++
 src/acpi-dsdt.dsl             |   5 +-
 src/acpi.c                    | 158 +++++++++++++++++++++++++++++++++++++++--
 src/paravirt.c                |   6 ++
 src/paravirt.h                |   2 +
 src/pciinit.c                 |   9 +++
 src/q35-acpi-dsdt.dsl         |   6 +-
 src/ssdt-mem.dsl              |  73 +++++++++++++++++++
 tools/acpi_extract.py         |  28 +++++++
 10 files changed, 415 insertions(+), 10 deletions(-)
 create mode 100644 src/acpi-dsdt-mem-hotplug.dsl
 create mode 100644 src/ssdt-mem.dsl