From: Arbel Moshe arbel.moshe@oracle.com
This reverts commit 3c3a3fa6522f ("mptable: Don't describe pci-to-pci bridges.”)
The reverted commit removed the description of non-root PCI busses from the MPTable in claim they are not necessary.
However, it seems that some guests rely on this information in order to correclty configure IOAPIC redirection-table entries for interrupts originated from PCI devices on non-root PCI busses.
One such guest is "Extreme Networks OS". We observed that if this guest is setup with an E1000 NIC behind a PCI bridge, the OS wouldn't configure the IOAPIC redirection-table entry for the IRQ that the E1000 PIN is connected to. Therefore, interrupts from the NIC were not delivered to OS which caused guest to not have network connectivity.
Fixes: 3c3a3fa6522f ("mptable: Don't describe pci-to-pci bridges.”)
Reviewed-by: Liran Alon liran.alon@oracle.com Reviewed-by: Mark Kanda mark.kanda@oracle.com Signed-off-by: Arbel Moshe arbel.moshe@oracle.com --- src/fw/mptable.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/src/fw/mptable.c b/src/fw/mptable.c index 47385cc5d32d..27686369e98c 100644 --- a/src/fw/mptable.c +++ b/src/fw/mptable.c @@ -72,25 +72,30 @@ mptable_setup(void) } int entrycount = cpu - cpus;
- // PCI bus + // PCI buses struct mpt_bus *buses = (void*)cpu, *bus = buses; - if (!hlist_empty(&PCIDevices)) { + int lastbus = -1; + struct pci_device *pci; + foreachpci(pci) { + int curbus = pci_bdf_to_bus(pci->bdf); + if (curbus == lastbus) + continue; + lastbus = curbus; memset(bus, 0, sizeof(*bus)); bus->type = MPT_TYPE_BUS; - bus->busid = 0; + bus->busid = curbus; memcpy(bus->bustype, "PCI ", sizeof(bus->bustype)); bus++; - entrycount++; }
/* isa bus */ - int isabusid = bus - buses; + int isabusid; memset(bus, 0, sizeof(*bus)); bus->type = MPT_TYPE_BUS; - bus->busid = isabusid; + isabusid = bus->busid = lastbus + 1; memcpy(bus->bustype, "ISA ", sizeof(bus->bustype)); bus++; - entrycount++; + entrycount += bus - buses;
/* ioapic */ u8 ioapic_id = BUILD_IOAPIC_ID; @@ -108,11 +113,8 @@ mptable_setup(void) int dev = -1; unsigned short pinmask = 0;
- struct pci_device *pci; foreachpci(pci) { u16 bdf = pci->bdf; - if (pci_bdf_to_bus(bdf) != 0) - break; int pin = pci_config_readb(bdf, PCI_INTERRUPT_PIN); int irq = pci_config_readb(bdf, PCI_INTERRUPT_LINE); if (pin == 0)
On Wed, Nov 28, 2018 at 06:49:51PM +0200, Liran Alon wrote:
From: Arbel Moshe arbel.moshe@oracle.com
This reverts commit 3c3a3fa6522f ("mptable: Don't describe pci-to-pci bridges.”)
The reverted commit removed the description of non-root PCI busses from the MPTable in claim they are not necessary.
However, it seems that some guests rely on this information in order to correclty configure IOAPIC redirection-table entries for interrupts originated from PCI devices on non-root PCI busses.
One such guest is "Extreme Networks OS". We observed that if this guest is setup with an E1000 NIC behind a PCI bridge, the OS wouldn't configure the IOAPIC redirection-table entry for the IRQ that the E1000 PIN is connected to. Therefore, interrupts from the NIC were not delivered to OS which caused guest to not have network connectivity.
At the top of mptable.c is:
// MPTable generation (on emulators) // DO NOT ADD NEW FEATURES HERE. (See paravirt.c / biostables.c instead.)
These tables are considered static in SeaBIOS. We've found that any change to these legacy tables tends to break something. We moved the generation of SMBIOS and ACPI to QEMU for this reason. I don't think MPTable was moved into QEMU, but I think that should be the way forward if a particular guest needs different content in this table.
-Kevin
On 30 Nov 2018, at 21:25, Kevin O'Connor kevin@koconnor.net wrote:
On Wed, Nov 28, 2018 at 06:49:51PM +0200, Liran Alon wrote:
From: Arbel Moshe arbel.moshe@oracle.com
This reverts commit 3c3a3fa6522f ("mptable: Don't describe pci-to-pci bridges.”)
The reverted commit removed the description of non-root PCI busses from the MPTable in claim they are not necessary.
However, it seems that some guests rely on this information in order to correclty configure IOAPIC redirection-table entries for interrupts originated from PCI devices on non-root PCI busses.
One such guest is "Extreme Networks OS". We observed that if this guest is setup with an E1000 NIC behind a PCI bridge, the OS wouldn't configure the IOAPIC redirection-table entry for the IRQ that the E1000 PIN is connected to. Therefore, interrupts from the NIC were not delivered to OS which caused guest to not have network connectivity.
At the top of mptable.c is:
// MPTable generation (on emulators) // DO NOT ADD NEW FEATURES HERE. (See paravirt.c / biostables.c instead.)
These tables are considered static in SeaBIOS. We've found that any change to these legacy tables tends to break something. We moved the generation of SMBIOS and ACPI to QEMU for this reason. I don't think MPTable was moved into QEMU, but I think that should be the way forward if a particular guest needs different content in this table.
-Kevin
Makes sense.
Until that MPTable will be propagated to SeaBIOS via fw_cfg from QEMU, don’t you think it makes sense to revert this commit for now? As it caused a real guest workload to break (with non-trivial issue to diagnose) and doesn’t provide a real value of it’s own (or am I missing something?).
-Liran
On Fri, Nov 30, 2018 at 09:53:05PM +0200, Liran Alon wrote:
On 30 Nov 2018, at 21:25, Kevin O'Connor kevin@koconnor.net wrote: These tables are considered static in SeaBIOS. We've found that any change to these legacy tables tends to break something. We moved the generation of SMBIOS and ACPI to QEMU for this reason. I don't think MPTable was moved into QEMU, but I think that should be the way forward if a particular guest needs different content in this table.
Makes sense.
Until that MPTable will be propagated to SeaBIOS via fw_cfg from QEMU, don’t you think it makes sense to revert this commit for now? As it caused a real guest workload to break (with non-trivial issue to diagnose) and doesn’t provide a real value of it’s own (or am I missing something?).
Hi Liran,
The original commit is from almost six years ago. I know it's very difficult to track down these types of regressions. I don't doubt it took a large effort to find the root cause. However, there's a very real risk that a change (any change) would introduce a further regression.
Even when the change is correct, we know from experience that different operating systems sometimes take unexpected actions when parsing these tables. In particular, we've seen in the past that innocuous changes to the bios tables can cause some operating systems to redo "hardware scans". For these (and other) reasons we've moved bios table generation out of SeaBIOS.
So, I don't think it would be a good idea to make this change to SeaBIOS.
Thanks, -Kevin
On 2 Dec 2018, at 18:31, Kevin O'Connor kevin@koconnor.net wrote:
On Fri, Nov 30, 2018 at 09:53:05PM +0200, Liran Alon wrote:
On 30 Nov 2018, at 21:25, Kevin O'Connor kevin@koconnor.net wrote: These tables are considered static in SeaBIOS. We've found that any change to these legacy tables tends to break something. We moved the generation of SMBIOS and ACPI to QEMU for this reason. I don't think MPTable was moved into QEMU, but I think that should be the way forward if a particular guest needs different content in this table.
Makes sense.
Until that MPTable will be propagated to SeaBIOS via fw_cfg from QEMU, don’t you think it makes sense to revert this commit for now? As it caused a real guest workload to break (with non-trivial issue to diagnose) and doesn’t provide a real value of it’s own (or am I missing something?).
Hi Liran,
The original commit is from almost six years ago. I know it's very difficult to track down these types of regressions. I don't doubt it took a large effort to find the root cause. However, there's a very real risk that a change (any change) would introduce a further regression.
Even when the change is correct, we know from experience that different operating systems sometimes take unexpected actions when parsing these tables. In particular, we've seen in the past that innocuous changes to the bios tables can cause some operating systems to redo "hardware scans". For these (and other) reasons we've moved bios table generation out of SeaBIOS.
So, I don't think it would be a good idea to make this change to SeaBIOS.
Thanks, -Kevin
Hmm OK. I understand your point. BTW, we have run a big variety of OSes with this change and haven’t seen a regression.
So you recommend that even if we implement the mechanism to pass mptable via fw_cfg from QEMU that this specific behaviour will be controlled by a flag? That is doable. Just wondering on your opinion.
-Liran
On Sun, Dec 02, 2018 at 07:09:31PM +0200, Liran Alon wrote:
So you recommend that even if we implement the mechanism to pass mptable via fw_cfg from QEMU that this specific behaviour will be controlled by a flag? That is doable. Just wondering on your opinion.
If I understand it correctly, the general idea is to enable new tables only on the most recent qemu machine type. (So, old qemu machine types get the existing tables.)
-Kevin