[SeaBIOS] [Qemu-devel] Fwd: [RFC PATCH 0/2] Allow RedHat PCI bridges reserve more buses than necessary during init
Marcel Apfelbaum
marcel at redhat.com
Wed Jul 19 21:37:23 CEST 2017
On 19/07/2017 21:56, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 19, 2017 at 09:38:50PM +0300, Alexander Bezzubikov wrote:
>> 2017-07-19 21:18 GMT+03:00 Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>:
>>
>>> On Wed, Jul 19, 2017 at 05:14:41PM +0000, Alexander Bezzubikov wrote:
>>>> Wed, 19 Jul 2017 at 16:57, Konrad Rzeszutek Wilk <
>>> konrad.wilk at oracle.com>:
>>>>
>>>>> On Wed, Jul 19, 2017 at 04:20:12PM +0300, Aleksandr Bezzubikov wrote:
>>>>>> Currently PCI bridges (and PCIe root ports too) get a bus number range at
>>>>> system init,
>>>>>> based on the currently plugged devices. That's why when one wants to
>>>>> hotplug another bridge,
>>>>>> it needs its own child bus, which the parent is unable to provide.
>>>>>
>>>>> Could you explain how you trigger this?
>>>>
>>>>
>>>> I'm trying to hot plug pcie-pci bridge into pcie root port, and Linux
>>> says
>>>> 'cannot allocate bus number for device bla-bla'. This obviously does not
>>>> allow me to use the bridge at all.
>>>>
>>>>>
>>>>>
>>>>>> The suggested workaround is to have vendor-specific capability in
>>> RedHat
>>>>> generic pcie-root-port
>>>>>> that contains number of additional bus to reserve on BIOS PCI init.
>>>>>
>>>>> But wouldn't the proper fix be for the PCI bridge to have the
>>> subordinate
>>>>> value be extended to fit more bus ranges?
>>>>
>>>>
>>>> What do you mean? This is what I'm trying to do. Do you propose to get
>>> rid
>>>> of vendor-specific cap and use original register value instead of it?
>>>
>>> I would suggest a simple fix - each bridge has a number of bus devices
>>> it can use. You have up to 255 - so you split the number of northbridge
>>> numbers by the amount of NUMA nodes (if that is used) - so for example
>>> if you have 4 NUMA nodes, each bridge would cover 63 bus numbers.
>>>
>>> Meaning the root bridge would cover 0->63 bus, 64->128, and so on.
>>> That gives you enough space to plug in your plugged in devices
>>> (up to 63).
>>>
>>> And if you need sub-bridges then carve out a specific range.
>>>
>>
Hi Konrad,
>> The problem is that we don't know at the init moment how many subbridges we
>> may need,
>
It is possible the explanation was not clear and led to
some miscommunication.
>
> And the explanation above does not either. It just sets up at init time
> a range where you can plug your new devices in. But in a more uniform
> way such that you can also utilize this with NUMA and _PXM topology
> in the future.
>
I fully agree with you and actually QEMU has already implemented the
exact idea you are describing here; it's called a pxb/pxb-pcie device,
that can be bound to a specific NUMA node and has a subrange of bus
numbers dedicated to it.
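
For illustration only, the even split described above can be sketched in C
as follows. The names bus_range and node_bus_range are made up for this
example and are not QEMU or SeaBIOS identifiers, and the automatic even
split is an assumption of the sketch, not a description of how pxb devices
are configured.

```c
#include <assert.h>

#define PCI_BUS_COUNT 256  /* bus numbers 0..255 */

/* Hypothetical helper type: an inclusive range of bus numbers. */
struct bus_range {
    int first;
    int last;
};

/*
 * Evenly split bus numbers 0..255 among `nodes` host bridges.
 * The last node absorbs any remainder when 256 is not divisible by `nodes`.
 */
static struct bus_range node_bus_range(int node, int nodes)
{
    struct bus_range r;
    int per_node;

    assert(nodes > 0 && nodes <= PCI_BUS_COUNT);
    assert(node >= 0 && node < nodes);

    per_node = PCI_BUS_COUNT / nodes;
    r.first = node * per_node;
    r.last = (node == nodes - 1) ? PCI_BUS_COUNT - 1
                                 : r.first + per_node - 1;
    return r;
}
```

With 4 nodes, node_bus_range(1, 4) covers buses 64 through 127, so each
host bridge gets 64 bus numbers for everything below it.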
However this problem is different. In a PCI Express machine you
can hotplug PCIe devices only into PCIe Root Ports (or switch
downstream ports, but not in current scope).
We want to be able to hotplug a PCIe-PCI bridge into a PCIe Root Port
so we can then hot-plug legacy PCI devices.
Since the PCIe Root Port is a type of PCI bridge, at boot time
it only gets the bus sub-range (primary bus,subordinate bus]
which is computed by firmware and leaves no bus number that
can be used by a hot-plugged pci-bridge. And this obviously
does not depend on how we arrange NUMA/proximities.
We are also not looking for a fix for a specific guest OS,
and reserving some extra bus numbers has minimal impact
on the system. I do agree the problem may be solved differently,
however we can't reach all guest OS vendors and ask them to
support an alternative solution in a reasonable time frame.
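
To make the bus-number problem concrete, here is a toy model in C of
depth-first bus numbering at boot. It is only a sketch under my own
assumptions and not the SeaBIOS code: struct toy_bridge, extra_buses,
toy_assign_buses and toy_hotplug_headroom are hypothetical names invented
for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4
#define MAX_BUS 255

/* Toy model of a PCI bridge; not a SeaBIOS structure. */
struct toy_bridge {
    int secondary;     /* bus directly behind the bridge */
    int subordinate;   /* highest bus number reachable through the bridge */
    int extra_buses;   /* hypothetical: bus numbers to reserve for hotplug */
    struct toy_bridge *child[MAX_CHILDREN];
    size_t nchildren;
};

/*
 * Depth-first bus numbering. *next_bus holds the highest bus number
 * handed out so far (0 when only the root bus exists).
 */
static void toy_assign_buses(struct toy_bridge *br, int *next_bus)
{
    int sub;
    size_t i;

    assert(*next_bus < MAX_BUS);
    br->secondary = ++(*next_bus);

    /* Bridges present at boot consume bus numbers first. */
    for (i = 0; i < br->nchildren; i++)
        toy_assign_buses(br->child[i], next_bus);

    /* Optionally leave unused bus numbers behind this bridge. */
    sub = *next_bus + br->extra_buses;
    if (sub > MAX_BUS)
        sub = MAX_BUS;
    br->subordinate = sub;
    *next_bus = sub;
}

/*
 * For a bridge with no children at boot: how many bus numbers remain
 * for bridges hot-plugged behind it later.
 */
static int toy_hotplug_headroom(const struct toy_bridge *br)
{
    return br->subordinate - br->secondary;
}
```

With two empty root ports and extra_buses left at 0, the ports end up with
secondary == subordinate (buses 1 and 2), so the headroom is 0 and a
hot-plugged bridge cannot get a bus number. Setting extra_buses to 1 on the
first port gives it buses 1..2 and moves the second port to bus 3.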
Thanks,
Marcel
>> and how deep the whole device tree will be. The key point: PCI bridge
>> hotplugging
>> requires either rescanning all buses on each bridge device addition, or reserving
>> space in advance during BIOS init.
>
> It is more complex than that - you may need to move devices that are
> below you. And neither the Linux kernel nor any other OS can handle that.
> (They can during bootup)
>
>> In this series the second way was chosen.
>>
>>
>>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> Aleksandr Bezzubikov (2):
>>>>>> pci: add support for direct usage of bdf for capability lookup
>>>>>> pci: enable RedHat pci bridges to reserve more buses
>>>>>>
>>>>>> src/fw/pciinit.c | 12 ++++++++++--
>>>>>> src/hw/pcidevice.c | 24 ++++++++++++++++++++++++
>>>>>> src/hw/pcidevice.h | 1 +
>>>>>> 3 files changed, 35 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> --
>>>>>> 2.7.4
>>>>>>
>>>>>>
>>>>>
>>>> --
>>>> Alexander Bezzubikov
>>>
>>
>>
>>
>> --
>> Alexander Bezzubikov