On 03/08/2017 5:41, Laine Stump wrote:
On 08/02/2017 01:58 PM, Marcel Apfelbaum wrote:
On 02/08/2017 19:26, Michael S. Tsirkin wrote:
On Wed, Aug 02, 2017 at 06:36:29PM +0300, Marcel Apfelbaum wrote:
>>> Can dmi-pci support shpc? why doesn't it? For compatibility?
>>
>> I don't know why, but the fact that it doesn't is the reason libvirt
>> settled on auto-creating a dmi-pci bridge and a pci-pci bridge under
>> that for Q35. The reasoning was (IIRC Laine's words correctly) that the
>> dmi-pci bridge cannot receive hotplugged devices, while the pci-pci
>> bridge cannot be connected to the root complex. So both were needed.
Hi Laine,
At least that's what I was told :-) (Seriously, 50% of the convoluted rules encoded into libvirt's PCI bus topology construction and connection logic come from trial and error, and the other 50% come from advice and recommendations from others who (unlike me) actually know something about PCI.)
Of course the whole setup of plugging a pci-bridge into a dmi-to-pci-bridge was (at the time at least) an exercise in futility, since hotplug didn't work properly on pci-bridge+Q35 anyway (that initially wasn't explained to me; it was only after I had constructed the odd bus topology and it was in released code that someone told me "Oh, by the way, hotplug to pci-bridge doesn't work on Q35". At first it was described as a bug, then later reclassified as a future feature.)
(I guess the upside is that all of the horrible complex/confusing code needed to automatically add two controllers just to plug in a single endpoint is now already in the code, and will "just work" if/when needed).
Now that I go back to look at this thread (qemu-devel is just too much for me to try and read unless something has been Cc'ed to me - I really don't know how you guys manage it!), I see that pcie-pci-bridge has been implemented, and we (libvirt) will want to use that instead of dmi-to-pci-bridge when available. And pcie-pci-bridge itself can have endpoints hotplugged into it, correct?
Yes.
This means there will need to be patches for libvirt that check for the presence of pcie-pci-bridge, and if it's found they will replace any auto-added dmi-to-pci-bridge+pci-bridge with a lone pcie-pci-bridge.
The PCIe-PCI bridge is to be plugged into a PCIe Root Port, and then you can add PCI devices to it. The devices can be hot-plugged into it (see the limitations below), and even the bridge itself can be hot-plugged (though old OSes might not support that).
So the device will replace the dmi-pci-bridge + pci-pci bridge completely.
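For illustration, a cold-plugged topology could look something like this on the command line (the ids and the e1000 endpoint are only placeholders for the example, not a recommendation):

  -device pcie-root-port,id=rp1,chassis=1,slot=1 \
  -device pcie-pci-bridge,id=pcie_pci_br,bus=rp1 \
  -device e1000,bus=pcie_pci_br,addr=0x1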
libvirt will have 2 options:

1. Start with a pcie-pci bridge attached to a PCIe Root Port and have all legacy PCI devices land there (or on bus 0). (You can use the "auto" device addressing: add PCI devices to this bridge until it is full, then use the last slot to add a pci-bridge or another pcie-pci bridge.)

2. Leave a PCIe Root Port empty and configure it with hints for the firmware that we might want to hotplug a pcie-pci bridge into it. If a PCI device is needed, hotplug the pcie-pci bridge first, then the device.
The above model gives you enough elasticity, so if you:

1. don't need PCI devices -> create the machine with no PCI controllers.
2. need PCI devices -> add a pcie-pci bridge and you get a legacy PCI bus supporting hotplug.
3. might need PCI devices -> leave a PCIe Root Port empty (+ hints) and hotplug the bridge later (see the monitor sketch below).
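For the "might need" case, the hotplug sequence could look roughly like this from the HMP monitor (again, the ids and the e1000 endpoint are only placeholders; the empty root port rp1 must already exist in the machine):

  (qemu) device_add pcie-pci-bridge,id=pcie_pci_br,bus=rp1
  (qemu) device_add e1000,id=net1,bus=pcie_pci_br,addr=0x1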
>> Thanks
>> Laszlo
>
> OK. Is it true that dmi-pci + pci-pci under it will allow hotplug
> on Q35 if we just flip the bit in _OSC?
Marcel, what say you?... :)
Good news, works with:

  -device i82801b11-bridge,id=b1 \
  -device pci-bridge,id=b2,bus=b1,chassis_nr=1,msi=off
And presumably it works for modern Windows? OK, so it looks like patch 1 is merely a bugfix, I'll merge it for 2.10.
Tested with Win10; I think it is OK to merge it for 2.10.
Notice the bridge's msi=off; it is needed until the fix for the following kernel bug is merged: https://www.spinics.net/lists/linux-pci/msg63052.html
Does libvirt support msi=off as a work-around?
We have no explicit setting for msi on pci controllers. The only place we explicitly set that is on the ivshmem device.
We need msi=off because of a bug in the Linux kernel. Even though the bug is fixed (there is already a patch upstream), we don't know when it will get in (apparently 4.14) and which versions will include it.
That doesn't mean that we couldn't add it. However, if we were going to do it manually, that would mean adding another knob that we have to support forever. And even if we wanted to do it automatically, we would not only need to find something in qemu to key off of when deciding whether or not to set it, but we would *still* have to explicitly store the setting in the config so that migrations between hosts using differing versions of qemu would preserve guest ABI.
It is not even something QEMU can be queried about. It depends on the guest OS.
Are there really enough people demanding (with actual concrete plans of *using*) hotplug of legacy PCI devices on Q35 guests *immediately* that we want to permanently pollute libvirt's code in this manner just for an interim workaround?
If/when Q35 becomes the default machine, we want feature parity, so that users can keep (almost) exactly the same setup on Q35. PCI hotplug is part of that.
I didn't have enough time/energy to fully parse all the rest of this thread - is msi=off currently required for pcie-pci-bridge hotplug as well?
Yes.
(not that it changes my opinion - just as we can tell people "upgrade to a new qemu and libvirt if you want to hotplug legacy PCI devices on Q35 guests", we can also tell them "Oh, and wait X weeks and upgrade to a new kernel too".)
I agree it will be hard for libvirt to manage such a flag automatically, but would it be OK to expose an msi property on the pcie-pci-bridge and add a comment: "switch to off if PCI hotplug doesn't work"?
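Something like the following, just as a sketch of the knob being discussed (the ids and the root port are placeholders):

  -device pcie-pci-bridge,id=pcie_pci_br,bus=rp1,msi=off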
An alternative is to not expose "msi" to libvirt and default it to off. In the future, if the feature proves valuable, we can ask libvirt to help with the transition to "on".
Thanks, Marcel