On 19/07/2017 21:56, Konrad Rzeszutek Wilk wrote:
On Wed, Jul 19, 2017 at 09:38:50PM +0300, Alexander Bezzubikov wrote:
2017-07-19 21:18 GMT+03:00 Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>:
On Wed, Jul 19, 2017 at 05:14:41PM +0000, Alexander Bezzubikov wrote:
Wed, 19 Jul 2017 at 16:57, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>:
On Wed, Jul 19, 2017 at 04:20:12PM +0300, Aleksandr Bezzubikov wrote:
Currently PCI bridges (and PCIe root ports too) get their bus number range at system init, based on the devices plugged in at that time. That's why, when one wants to hotplug another bridge, the new bridge needs a child bus of its own, which the parent is unable to provide.
Could you explain how you trigger this?
I'm trying to hot-plug a pcie-pci bridge into a PCIe root port, and Linux says 'cannot allocate bus number for device bla-bla'. This obviously does not allow me to use the bridge at all.
The suggested workaround is to have a vendor-specific capability in the Red Hat generic pcie-root-port that contains the number of additional buses to reserve during BIOS PCI init.
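To illustrate how firmware could read such a hint, here is a minimal sketch that walks the standard PCI capability list in a 256-byte config-space snapshot and looks for a vendor-specific capability (ID 0x09). The register offsets are the standard ones; the position of the bus count inside the capability (HYP_CAP_EXTRA_BUSES) and the helper names are assumptions made for illustration only, since the layout is not described in this thread. This is not the code from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_CFG_SIZE         256
#define PCI_STATUS           0x06  /* low byte of the 16-bit status register */
#define PCI_STATUS_CAP_LIST  0x10  /* set when a capability list is present */
#define PCI_CAPABILITY_LIST  0x34  /* pointer to the first capability */
#define PCI_CAP_ID_VNDR      0x09  /* vendor-specific capability ID */

/* ASSUMPTION: offset of the "extra buses" byte inside the capability.
 * Hypothetical, chosen only for this sketch. */
#define HYP_CAP_EXTRA_BUSES  3

/* Return the config-space offset of the first capability with the
 * given ID, or 0 if the device has no such capability. */
static uint8_t find_cap(const uint8_t cfg[PCI_CFG_SIZE], uint8_t cap_id)
{
    assert(cfg != NULL);
    if (!(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
        return 0;

    uint8_t pos = cfg[PCI_CAPABILITY_LIST] & 0xFC;
    /* Capabilities live above the 64-byte header; the iteration cap
     * guards against a malformed, circular list. */
    for (int i = 0; pos >= 0x40 && i < 48; i++) {
        if (cfg[pos] == cap_id)
            return pos;
        pos = cfg[pos + 1] & 0xFC;   /* byte 1 is the "next" pointer */
    }
    return 0;
}

/* Number of buses the bridge asks firmware to reserve, or 0 if the
 * vendor-specific capability is absent. */
static unsigned extra_buses(const uint8_t cfg[PCI_CFG_SIZE])
{
    uint8_t pos = find_cap(cfg, PCI_CAP_ID_VNDR);
    return pos ? cfg[pos + HYP_CAP_EXTRA_BUSES] : 0;
}
```

A real implementation would read the registers through the firmware's config-space accessors rather than a snapshot buffer, and would also check the vendor ID before trusting the capability contents.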
But wouldn't the proper fix be for the PCI bridge to have its subordinate bus value extended to fit more bus ranges?
What do you mean? This is what I'm trying to do. Do you propose to get rid of the vendor-specific cap and use the original register value instead?
I would suggest a simple fix - each bridge has a number of bus numbers it can use. Bus numbers go up to 255 - so you split the bus numbers under the northbridge by the number of NUMA nodes (if that is used) - so for example if you have 4 NUMA nodes, each bridge would cover 64 bus numbers.
Meaning the root bridges would cover buses 0->63, 64->127, and so on. That gives you enough space to plug in your devices (up to 63).
And if you need sub-bridges then carve out a specific range.
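The even split described here can be sketched as follows. The helper name and struct are hypothetical; the only facts assumed are that PCI bus numbers run from 0 to 255 and that each root bridge gets an equal, contiguous slice.

```c
#include <assert.h>

#define PCI_NUM_BUSES 256  /* bus numbers 0..255 */

struct bus_range {
    unsigned first;  /* lowest bus number in the slice */
    unsigned last;   /* highest bus number in the slice */
};

/* Return the slice of bus numbers for root bridge `idx` when the bus
 * number space is divided evenly among `count` root bridges (for
 * example, one per NUMA node). Leftover numbers from an uneven
 * division stay unassigned in this sketch. */
static struct bus_range split_bus_range(unsigned idx, unsigned count)
{
    assert(count > 0 && count <= PCI_NUM_BUSES && idx < count);
    unsigned per_bridge = PCI_NUM_BUSES / count;
    struct bus_range r = { idx * per_bridge, (idx + 1) * per_bridge - 1 };
    return r;
}
```

With four root bridges this gives the slices 0-63, 64-127, 128-191 and 192-255.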
Hi Konrad,
The problem is that we don't know at the init moment how many subbridges we may need,
It is possible the explanation was not clear and led to some miscommunication.
And the explanation above does not either. It just sets up at init time a range where you can plug in your new devices. But in a more uniform way, such that you can also utilize this with NUMA and _PXM topology in the future.
I fully agree with you, and actually QEMU has already implemented the exact idea you are describing here. It's called a pxb/pxb-pcie device, which can be "bound" to a specific NUMA node and has a subrange of bus numbers dedicated to it.
However this problem is different. In a PCI Express machine you can hotplug PCIe devices only into PCIe Root Ports (or switch downstream ports, but not in current scope).
We want to be able to hotplug a PCIe-PCI bridge into a PCIe Root Port so we can then hot-plug legacy PCI devices.
Since the PCIe Root Port is a type of PCI bridge, at boot time it only gets the bus sub-range (primary bus,subordinate bus] which is computed by firmware and leaves no bus number that can be used by a hot-plugged pci-bridge. And this obviously does not depend on how we arrange NUMA/proximities.
We are also not looking for a fix for a specific guest OS, and reserving some extra bus numbers has minimal impact on the system. I do agree the problem may be solved differently; however, we can't reach all guest OS vendors and ask them to support an alternative solution in a reasonable time frame.
Thanks, Marcel
and how deep the whole device tree will be. The key point: PCI bridge hotplugging needs either a rescan of all buses on each bridge device addition, or space reserved in advance during BIOS init.
It is more complex than that - you may need to move devices that are below you. And neither the Linux kernel nor any other OS can handle that. (They can during bootup.)
In this series the second way was chosen.
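As a rough illustration of what "reserve space in advance" means for bus numbering, the sketch below numbers bridges depth-first and widens each bridge's subordinate bus number by a per-bridge reservation. The struct, field names and function are hypothetical and greatly simplified; this is not the code from the series or from src/fw/pciinit.c.

```c
#include <assert.h>
#include <stddef.h>

#define PCI_MAX_BUS 255

/* Simplified model of a bridge: only what bus numbering needs. */
struct bridge {
    unsigned secondary;      /* bus directly behind the bridge */
    unsigned subordinate;    /* highest bus number behind the bridge */
    unsigned extra_buses;    /* buses to keep free for hot-plug */
    struct bridge *child;    /* first bridge behind this one */
    struct bridge *sibling;  /* next bridge on the same bus */
};

/* Number all bridges on one bus depth-first. `last` is the highest
 * bus number handed out so far; the new highest is returned. */
static unsigned assign_buses(struct bridge *b, unsigned last)
{
    for (; b; b = b->sibling) {
        assert(last < PCI_MAX_BUS);          /* out of bus numbers */
        b->secondary = ++last;
        last = assign_buses(b->child, last); /* bridges already present */
        last += b->extra_buses;              /* room for hot-plugged ones */
        if (last > PCI_MAX_BUS)
            last = PCI_MAX_BUS;
        b->subordinate = last;
    }
    return last;
}
```

With two root ports on bus 0 that each reserve one extra bus, the first port covers buses 1-2 and the second covers 3-4, so each has a free bus number for a bridge hot-plugged later. Without the reservation they would cover only bus 1 and bus 2 respectively.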
Aleksandr Bezzubikov (2):
  pci: add support for direct usage of bdf for capability lookup
  pci: enable RedHat pci bridges to reserve more buses

 src/fw/pciinit.c   | 12 ++++++++++--
 src/hw/pcidevice.c | 24 ++++++++++++++++++++++++
 src/hw/pcidevice.h |  1 +
 3 files changed, 35 insertions(+), 2 deletions(-)

--
2.7.4
-- Alexander Bezzubikov