On Tue, Dec 18, 2012 at 10:41 AM, Vasilis Liaskovitis
<vasilis.liaskovitis@profitbricks.com> wrote:
This is v4 of the ACPI memory hotplug functionality. Only x86_64 target is
supported (both i440fx and q35). There are still several issues, but it's
been a while since v3 and I wanted to get some more feedback on the current
state of the patchseries.
We are working on memory hotplug functionality for the pSeries machine. I'm wondering whether and how we can better integrate things. Do you think the DIMM abstraction is generic enough to be used in other machine types?
Overview:
Dimm device layout is modeled with a normal qemu device:
"-device dimm,id=name,size=sz,node=pxm,populated=on|off,bus=membus.0"
How will this handle non-hotpluggable memory, for example the memory passed in the '-m' parameter?
The starting physical address for all dimms is calculated from top of memory,
during memory controller init, skipping the pci hole at [PCI_HOLE_START, 4G).
e.g.
"-device dimm,id=dimm0,size=512M,node=0,populated=off,bus=membus.0"
will define a 512M memory dimm belonging to numa node 0, on bus membus.0.
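The address calculation described above can be sketched roughly as follows. This is only an illustration, not the code from the patches: the value of PCI_HOLE_START, the function name layout_dimms, and the rule that a dimm which would overlap the hole is moved to 4G are all assumptions.

```python
# Illustrative sketch only; not taken from the patch series.
PCI_HOLE_START = 0xE0000000  # assumed value, the real one is chipset-specific
FOUR_GB = 1 << 32


def layout_dimms(initial_ram, dimm_sizes, pci_hole_start=PCI_HOLE_START):
    """Assign a start address to each dimm, walking up from top of memory.

    Assumption: a dimm that would overlap [pci_hole_start, 4G) is placed
    at 4G instead, so no dimm ever straddles the hole.
    """
    # Top of initial memory: RAM that does not fit below the hole is
    # assumed to be relocated above 4G.
    if initial_ram <= pci_hole_start:
        next_addr = initial_ram
    else:
        next_addr = FOUR_GB + (initial_ram - pci_hole_start)

    starts = []
    for size in dimm_sizes:
        # Skip the hole if this dimm would run into it.
        if next_addr < FOUR_GB and next_addr + size > pci_hole_start:
            next_addr = FOUR_GB
        starts.append(next_addr)
        next_addr += size
    return starts
```

Under these assumptions, with 3G of initial RAM, a 256M dimm followed by a 512M dimm would be placed at 3G and 4G respectively, since the second dimm would otherwise run into the hole.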
Because dimm layout needs to be configured on machine-boot, all dimm devices
need to be specified on startup command line (either with populated=on or with
populated=off). The dimm information is stored in dimm configuration structures.
After machine startup, dimms are hot-added or removed with normal device_add
and device_del operations e.g.:
Hot-add syntax: "device_add dimm,id=mydimm0,bus=membus.0"
Hot-remove syntax: "device_del dimm,id=mydimm0"
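For management tools that talk QMP instead of the human monitor, the same operations could be expressed roughly as below. This is a sketch: device_add and device_del are existing QMP commands, but the "dimm" driver name and the "membus.0" bus only exist with this series applied, and the helper names are made up for illustration.

```python
import json


def qmp_dimm_add(dimm_id, bus="membus.0"):
    """Build a QMP device_add message for a dimm (helper name is hypothetical)."""
    return json.dumps({
        "execute": "device_add",
        "arguments": {"driver": "dimm", "id": dimm_id, "bus": bus},
    })


def qmp_dimm_del(dimm_id):
    """Build a QMP device_del message for a previously added dimm."""
    return json.dumps({
        "execute": "device_del",
        "arguments": {"id": dimm_id},
    })
```

The resulting strings would be written to the QMP socket after the usual qmp_capabilities handshake.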
Changes v3->v4
- Dimms added with normal -device argument (extra -dimm arg dropped).
- multiple memory buses can be registered. Memory buses of the real hw/chipset
or a paravirtual memory bus can be added.
- acpi implementation uses memory API instead of old ioports.
- Support for q35/ich9 added (still buggy, see patch 12/31).
- piix4/i440fx initialization code has been refactored to resemble q35. This
will allow memory map initialization at chipset qdev init time for both
machines, as well as more similar code.
- Hot-remove functionality has been moved to separate patches. Hot-remove no
longer frees memory but unmaps the dimm/qdev device from the guest's view.
Freeing the memory should happen when the last user unrefs/unmaps the memory,
see also (work in progress):
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg00728.html
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02697.html
- new qmp/hmp command for the state of each dimm (on/off)
Changes v2->v3
- qdev integration. Dimms are attached to a dimmbus. The dimmbus is a child
of i440fx device in the pc machine. Hot-add and remove are done with normal
device_add / device_del operations on the dimmbus. New commands "dimm_add" and
"dimm_del" are obsolete.
- Add _PS3 method to allow OSPM-induced hot operations.
- pci-window calculation in Seabios takes dimms into account (for both 32-bit
and 64-bit windows)
- rename new qmp commands: query-memory-total and query-memory-hotplug
- balloon driver can see the hotplugged memory
Changes v1->v2
- memory map is automatically calculated for hotplug dimms. Dimms are added from
top-of-memory skipping the pci hole at [PCI_HOLE_START, 4G).
- Renamed from "-memslot" to "-dimm". Commands changed to "dimm_add", "dimm_del"
- Seabios ejection array reduced to a byte. Use extraction macros for dimm ssdt.
- additional SRAT paravirt info does not break previous SRAT fw_cfg layout.
- Documentation of new acpi_piix4 registers and paravirt data.
- add ACPI _OST support for _OST enabled guests. This allows qemu to receive
notification for success / failure of memory hot-add and hot-remove operations.
Guest needs to support _OST (https://lkml.org/lkml/2012/6/25/321)
- add monitor info command to report total guest memory (initial + hot-added)
Issues:
- hot-remove needs to only unmap the dimm device from guest's view. Freeing the
memory should happen when the last user of the device (e.g. virtio-blk) unrefs
the device. A testcase is needed for this.
- Live Migration: Ramblocks are migrated before qdev VMStates are migrated, so
the DimmDevice is handled differently than other devices. Should this be
reworked? (The DimmDevice structure currently does not define a
VMStateDescription.) Live migration works as long as the dimm layout (command
line args) is identical on the source and destination qemu command lines, and
the destination takes into account hot-operations that have occurred on the
source. (v3 patch 10/19 created the DimmDevice that corresponds to an unknown
incoming ramblock, e.g. for a dimm that was hot-added on source, but it has
been dropped for the moment.)
- A main blocker issue is windows guest functionality. The patchset does not
work for windows currently. Testing on win2012 server RC or windows2008
consumer prerelease, when adding a DIMM, there is a BSOD with ACPI_BIOS_ERROR
message. After this, the VM keeps rebooting with ACPI_BIOS_ERROR. The windows
pnpmem driver obviously has a problem with the seabios dimm implementation
(or the seabios dimm implementation is not fully ACPI-compliant). If someone
can review the seabios patches or has any ideas to debug this, let me know.
- hot-operation notification lists need to be added to migration state.
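Regarding the live migration item above, a management layer could sanity-check that source and destination were started with the same dimm layout before migrating. The following is a hypothetical sketch: the helper names are invented, and ignoring the "populated" option in the comparison is an assumption (it may legitimately differ on the destination after hot operations on the source).

```python
def dimm_layout(argv):
    """Return {id: options} for every '-device dimm,...' in a qemu argv list."""
    layout = {}
    for flag, value in zip(argv, argv[1:]):
        if flag != "-device":
            continue
        driver, _, rest = value.partition(",")
        if driver != "dimm":
            continue
        opts = dict(item.split("=", 1) for item in rest.split(",") if "=" in item)
        # Assumption: the populated state is allowed to differ between the
        # two sides, so it is not part of the layout comparison.
        opts.pop("populated", None)
        layout[opts.get("id")] = opts
    return layout


def same_dimm_layout(src_argv, dst_argv):
    """True if both command lines describe the same set of dimms."""
    return dimm_layout(src_argv) == dimm_layout(dst_argv)
```

This only compares command lines; it does not address the ordering problem between ramblocks and qdev VMStates described above.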
series is based on:
- qemu master (commit a8a826a3) + patch:
https://lists.gnu.org/archive/html/qemu-devel/2012-11/msg02699.html
- seabios master (commit a810e4e7)
Can also be found at:
http://github.com/vliaskov/qemu-kvm/commits/memhp-v4
http://github.com/vliaskov/seabios/commits/memhp-v4
Vasilis Liaskovitis (21):
qapi: make visit_type_size fallback to type_int
Add SIZE type to qdev properties
qemu-option: export parse_option_number
Implement dimm device abstraction
vl: handle "-device dimm"
acpi_piix4 : Implement memory device hotplug registers
acpi_ich9 : Implement memory device hotplug registers
piix_pci and pc_piix: refactor
piix_pci: Add i440fx dram controller initialization
q35: Add i440fx dram controller initialization
pc: Add dimm paravirt SRAT info
Introduce paravirt interface QEMU_CFG_PCI_WINDOW
Implement "info memory-total" and "query-memory-total"
balloon: update with hotplugged memory
Implement dimm-info
dimm: add hot-remove capability
acpi_piix4: add hot-remove capability
acpi_ich9: add hot-remove capability
Implement qmp and hmp commands for notification lists
Add _OST dimm support
Implement _PS3 for dimm
docs/specs/acpi_hotplug.txt | 54 ++++++
docs/specs/fwcfg.txt | 28 +++
hmp-commands.hx | 6 +
hmp.c | 41 ++++
hmp.h | 3 +
hw/Makefile.objs | 2 +-
hw/acpi.h | 5 +
hw/acpi_ich9.c | 115 +++++++++++-
hw/acpi_ich9.h | 12 +-
hw/acpi_piix4.c | 126 ++++++++++++-
hw/dimm.c | 444 +++++++++++++++++++++++++++++++++++++++++++
hw/dimm.h | 102 ++++++++++
hw/fw_cfg.h | 1 +
hw/lpc_ich9.c | 2 +-
hw/pc.c | 28 +++-
hw/pc.h | 1 +
hw/pc_piix.c | 74 ++++++--
hw/pc_q35.c | 18 ++-
hw/piix_pci.c | 249 ++++++++-----------------
hw/q35.c | 27 +++
hw/q35.h | 5 +
hw/qdev-properties.c | 60 ++++++
hw/qdev-properties.h | 3 +
hw/virtio-balloon.c | 13 +-
monitor.c | 21 ++
qapi-schema.json | 63 ++++++
qapi/qapi-visit-core.c | 11 +-
qemu-option.c | 4 +-
qemu-option.h | 4 +
qmp-commands.hx | 57 ++++++
sysemu.h | 1 +
vl.c | 60 ++++++
32 files changed, 1432 insertions(+), 208 deletions(-)
create mode 100644 docs/specs/acpi_hotplug.txt
create mode 100644 docs/specs/fwcfg.txt
create mode 100644 hw/dimm.c
create mode 100644 hw/dimm.h
Vasilis Liaskovitis (9):
Add ACPI_EXTRACT_DEVICE* macros
Add SSDT memory device support
acpi-dsdt: Implement functions for memory hotplug
acpi: generate hotplug memory devices
q35: Add memory hotplug handler
pci: Use paravirt interface for pcimem_start and pcimem64_start
acpi: add _EJ0 operation and eject port for memory devices
Add _OST dimm method
Implement _PS3 method for memory device
Makefile | 2 +-
src/acpi-dsdt-mem-hotplug.dsl | 136 +++++++++++++++++++++++++++++++++++
src/acpi-dsdt.dsl | 5 +-
src/acpi.c | 158 +++++++++++++++++++++++++++++++++++++++--
src/paravirt.c | 6 ++
src/paravirt.h | 2 +
src/pciinit.c | 9 +++
src/q35-acpi-dsdt.dsl | 6 +-
src/ssdt-mem.dsl | 73 +++++++++++++++++++
tools/acpi_extract.py | 28 +++++++
10 files changed, 415 insertions(+), 10 deletions(-)
create mode 100644 src/acpi-dsdt-mem-hotplug.dsl
create mode 100644 src/ssdt-mem.dsl
--
1.7.9