Hi,
Recently we have been looking into some hotplug issues. The case is: when we hot-plug a device (e.g. an iGPU) onto a pci-bridge after the guest has booted, the guest reports "BAR 2: no space for [mem size 0x40000000 64bit pref]" etc.
SeaBIOS checks all the devices under the pci-bridge when QEMU launches the guest, and only allocates "size=ALIGN(sum, align)" of memory space for the pci-bridge mem and pref-mem windows. So if we hot-plug a big PCI device such as an Intel GPU, which needs 256M of mem/pref-mem or more, it will fail.
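The sizing rule described above can be sketched roughly as follows. This is a simplified model, not SeaBIOS's actual code: the helper names, the 2 MiB minimum window alignment and the example BAR sizes are assumptions made for illustration, and offsets are taken relative to the window base.

```python
MiB = 1 << 20
GiB = 1 << 30


def align_up(value, alignment):
    """Round value up to a multiple of alignment (which must be a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


def bridge_window_size(bar_sizes, min_align=2 * MiB):
    """Window size computed only from the BARs present behind the bridge at boot.

    min_align is an assumed minimum window granularity, chosen for illustration.
    """
    align = max([min_align, *bar_sizes])
    return align_up(sum(bar_sizes), align)


def fits_after_hotplug(boot_bars, hotplug_bar, min_align=2 * MiB):
    """True if the window sized at boot still has room for the hot-plugged BAR."""
    window = bridge_window_size(boot_bars, min_align)
    # The new BAR must start at a multiple of its own size, after the
    # space that the boot-time BARs already occupy.
    start = align_up(sum(boot_bars), hotplug_bar)
    return start + hotplug_bar <= window


# One cold-plugged device with a 128 KiB BAR: the window is only 2 MiB.
window = bridge_window_size([128 * 1024])
# A 1 GiB prefetchable BAR (size 0x40000000) cannot fit into that window.
ok = fits_after_hotplug([128 * 1024], 1 * GiB)
```

In this model the window is sized for the boot-time devices only, so nothing is left over for a large BAR that appears later.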
If my understanding is right, we may need some different logic for the memory allocation in SeaBIOS?
Looking forward to your advice.
Thanks, Jing
On 07/11/18 05:12, Liu, Jing2 wrote:
> [...] So if we hot-plug a big PCI device such as an Intel GPU, which needs 256M of mem/pref-mem or more, it will fail.
> If my understanding is right, we may need some different logic for the memory allocation in SeaBIOS?
I suggest using the Q35 machine type, and hot-plugging the device into a PCI Express Root Port ("pcie-root-port"). The latter has properties dedicated to reserving various PCI resources specifically for hotplug purposes:
pcie-root-port.mem-reserve=size
pcie-root-port.pref32-reserve=size
pcie-root-port.bus-reserve=uint32
pcie-root-port.pref64-reserve=size
pcie-root-port.io-reserve=size
In order to address the issue you report at the top, you would use
-device pcie-root-port,bus=pcie.0,id=root-port-XXX,pref64-reserve=1G
then hot-plug the device into "root-port-XXX".
Please see the following two files in the QEMU tree:
- docs/pcie.txt
- docs/pcie_pci_bridge.txt
The first provides guidelines on the PCI Express hierarchy in general, and also on hotplug in particular (see section 5).
The second is relevant here because it describes the vendor capability that QEMU and SeaBIOS (and OVMF) use, for passing the resource reservation hints from QEMU to guest firmware.
The 2nd document mainly focuses on hot-plugging a PCIE-PCI bridge into a PCIE root port, and on reserving a bus number range for the sub-hierarchy behind said hot-plugged PCIE-PCI bridge. However, the reservation mechanism is the same for other types of PCI resources, and for other types of child devices that are hot-plugged into root ports.
HTH, Laszlo
Yep, thanks for the advice. But hot-plugging onto a pci-bridge is the actual use case that was requested, so we had better solve and fix this.
On 7/11/2018 6:38 PM, Laszlo Ersek wrote:
> I suggest using the Q35 machine type, and hot-plugging the device into a PCI Express Root Port ("pcie-root-port"). [...]
On 7/12/2018 1:43 PM, Liu, Jing2 wrote:
> Yep, thanks for the advice. But hot-plugging onto a pci-bridge is the actual use case that was requested, so we had better solve and fix this.
The details are -machine pc and -device pci-bridge,bus=pci.0,id=pci-bridge-0
Thanks, Jing
SeaBIOS mailing list SeaBIOS@seabios.org https://mail.coreboot.org/mailman/listinfo/seabios
On 07/12/18 07:43, Liu, Jing2 wrote:
> Yep, thanks for the advice. But hot-plugging onto a pci-bridge is the actual use case that was requested, so we had better solve and fix this.
It doesn't take a bugfix, but a feature. The firmware needs to be told to reserve PCI resources in advance, in preparation for the hotplug. The firmware might reserve some amount of resources by default, but those defaults cannot be arbitrarily large. The reservation sizes need to come from QEMU.
The most robust method for this was deemed to be placing a vendor capability in the hotplug controller's PCI config space. In QEMU this is implemented with the pci_bridge_qemu_reserve_cap_init() function. Right now, the function is only called from the PCI Express Root Port device model, in "hw/pci-bridge/gen_pcie_root_port.c".
You can cold-plug a PCI Express Root Port in the Q35 root complex (pcie.0), reserving the MMIO resources you want, cold-plug a PCIE-PCI Bridge in that root port, and then hot-plug the desired endpoint into that PCIE-PCI Bridge. This is one of the exact examples that "docs/pcie_pci_bridge.txt" provides. (The other example is when the PCIE-PCI bridge itself is hot-plugged into the root port, for which bus number reservation is necessary too, at the root port level.)
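As a rough sketch of the topology described above, the following Python snippet assembles the QEMU arguments and the monitor command for the hotplug step. The ids (rp1, br1, hp0), the helper names and the choice of e1000 as the hot-plugged endpoint are placeholders picked for illustration; "docs/pcie_pci_bridge.txt" remains the authoritative example.

```python
def device_arg(driver, **props):
    """Build one '-device driver,key=value,...' argument pair.

    Underscores in keyword names become dashes, so pref64_reserve="1G"
    is emitted as pref64-reserve=1G.
    """
    opts = [driver] + [f"{k.replace('_', '-')}={v}" for k, v in props.items()]
    return ["-device", ",".join(opts)]


def cold_plug_args(pref64_reserve="1G"):
    """Q35 machine with a root port and a PCIE-PCI bridge cold-plugged into it."""
    return (
        ["-machine", "q35"]
        # Root port on the root complex, reserving 64-bit prefetchable MMIO.
        + device_arg("pcie-root-port", bus="pcie.0", id="rp1",
                     pref64_reserve=pref64_reserve)
        # PCIE-PCI bridge behind that root port.
        + device_arg("pcie-pci-bridge", bus="rp1", id="br1")
    )


def hotplug_command(driver, dev_id, bridge_id="br1"):
    """Monitor command that hot-plugs an endpoint into the bridge."""
    return f"device_add {driver},bus={bridge_id},id={dev_id}"


args = cold_plug_args()
cmd = hotplug_command("e1000", "hp0")
```

Here `args` would be appended to the rest of the QEMU command line, and `cmd` would be typed into the QEMU monitor once the guest has booted.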
If you want more than that, e.g. do something similar on i440fx, that will take QEMU work as well, not just SeaBIOS.
Laszlo
Hi Laszlo,
On 7/12/2018 3:29 PM, Laszlo Ersek wrote:
> You can cold-plug a PCI Express Root Port in the Q35 root complex (pcie.0), reserving the MMIO resources you want, cold-plug a PCIE-PCI Bridge in that root port, and then hot-plug the desired endpoint into that PCIE-PCI Bridge.
I'm trying this. But the actual results show that, when the pcie-pci-bridge has no cold-plugged device, all of its windows show None:
01:00.0 PCI bridge: Red Hat, Inc. Device 000e (prog-if 00 [Normal decode])
    I/O behind bridge: None
    Memory behind bridge: None
    Prefetchable memory behind bridge: None
Only if I cold-plug some device (e.g. e1000) under it might a subsequent hotplug of another device succeed.
BTW, I enabled the guest kernel config option CONFIG_PCI_REALLOC_ENABLE_AUTO=y, but it doesn't help. I'm not sure whether there are some other issues I have missed?
Jing
On 07/16/18 11:45, Liu, Jing2 wrote:
> I'm trying this. But the actual results show that, when the pcie-pci-bridge has no cold-plugged device, all of its windows show None:
> 01:00.0 PCI bridge: Red Hat, Inc. Device 000e (prog-if 00 [Normal decode])
>     I/O behind bridge: None
>     Memory behind bridge: None
>     Prefetchable memory behind bridge: None
Can you check /proc/iomem, and dmesg?
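When going through /proc/iomem for this purpose, the interesting entries are the "PCI Bus" ranges, whose sizes show how large each bridge window ended up. A minimal sketch of extracting them follows; the sample text is made up for illustration and is not output from the setup discussed in this thread.

```python
import re

# Matches lines such as "  fe800000-fe9fffff : PCI Bus 0000:01".
IOMEM_LINE = re.compile(r"^\s*([0-9a-f]+)-([0-9a-f]+) : (.+)$")


def pci_bus_windows(iomem_text):
    """Return {name: total size in bytes} for the 'PCI Bus' entries."""
    sizes = {}
    for line in iomem_text.splitlines():
        match = IOMEM_LINE.match(line)
        if not match:
            continue
        start, end, name = match.groups()
        if name.startswith("PCI Bus"):
            # Ranges are inclusive, hence the +1.
            size = int(end, 16) - int(start, 16) + 1
            sizes[name] = sizes.get(name, 0) + size
    return sizes


# Hypothetical sample, for illustration only.
SAMPLE = """\
c0000000-febfffff : PCI Bus 0000:00
  fe800000-fe9fffff : PCI Bus 0000:01
    fe800000-fe81ffff : 0000:01:00.0
"""

windows = pci_bus_windows(SAMPLE)
```

On a real guest the text would come from reading /proc/iomem as root, since unprivileged reads typically show zeroed addresses.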
What is your exact QEMU command line?
BTW, there's also https://bugzilla.redhat.com/show_bug.cgi?id=1536147. It might be relevant here.
(I haven't personally tested SeaBIOS in the hotplug scenario at hand, but it's been my understanding that, as long as you cold-plug the PCIe-PCI bridge itself, RHBZ#1536147 shouldn't apply, and the hotplug into the bridge should just work. Personally I've tested the same with OVMF only.)
Adding Marcel and Alexander to the thread (likely belatedly; sorry about that).
Thanks Laszlo
On 7/17/2018 9:42 PM, Laszlo Ersek wrote:
> Can you check /proc/iomem, and dmesg?
> What is your exact QEMU command line?
I tried again and realized it was my mistake: I had forgotten to include the device driver in the guest kernel. After adding that driver, the hotplug succeeds.
In conclusion, legacy PCI hotplug succeeds only if conditions 1-3 hold:
1. pcie-root-port reserves enough space.
2. The device drivers exist in the guest.
3. CONFIG_PCI_REALLOC_ENABLE_AUTO is not set. It is not related to this issue.
4. When there is no device under the pcie-pci-bridge, all of its windows show None. But __pci_bridge_assign_resources (guest Linux kernel code) will dynamically assign the resource sizes, which are computed by __pci_bus_size_bridges. SHPC is used.
Thanks very much for the help, Laszlo!
Jing