Currently seabios assumes there is only one PCI domain (domain 0), and almost everything operates on PCI domain 0 by default. This patch series adds multiple PCI domain support to pci_device while preserving the original API for compatibility.
The reason to get seabios involved is that the pxb-pcie host bus created in QEMU is now in a different PCI domain, and its bus number starts from 0 instead of bus_nr. In fact, bus_nr should not be used at all in a non-zero domain. However, QEMU only binds ports 0xcf8 and 0xcfc to bus pcie.0. To avoid bus conflicts, we should use other port pairs for buses under new domains.
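As a rough illustration of the port-pair idea, the sketch below picks the command/data ports for a domain and builds the config address. The port values are the ones this series defines in src/hw/pci.h; the helper names (cfg_cmd_port, cfg_data_port, cfg_address) are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Port numbers as proposed in this series (src/hw/pci.h). */
#define PORT_PCI_CMD        0x0cf8
#define PORT_PCI_DATA       0x0cfc
#define PORT_PXB_CMD_BASE   0x1000
#define PORT_PXB_DATA_BASE  0x1004

/* Hypothetical helper: port that receives the config address for a domain.
 * Domain 0 keeps the legacy pair; domain N >= 1 gets its own 8-byte slot. */
static uint16_t cfg_cmd_port(int domain_nr)
{
    return domain_nr ? PORT_PXB_CMD_BASE + ((domain_nr - 1) << 3)
                     : PORT_PCI_CMD;
}

/* Hypothetical helper: port used to transfer the config data for a domain. */
static uint16_t cfg_data_port(int domain_nr)
{
    return domain_nr ? PORT_PXB_DATA_BASE + ((domain_nr - 1) << 3)
                     : PORT_PCI_DATA;
}

/* Value written to the command port: enable bit, bdf, dword-aligned offset. */
static uint32_t cfg_address(uint16_t bdf, uint32_t addr)
{
    return 0x80000000u | ((uint32_t)bdf << 8) | (addr & 0xfc);
}
```

With this layout, domain 1 would use 0x1000/0x1004 and domain 2 would use 0x1008/0x100c, so accesses to different domains never share a command port.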
Current issues:
  * When trying to read the config space of pcie_pci_bridge, it actually reads out the result of mch. I'm working on this weird behavior.
Changelog:
v2 <- v1:
  - Fix bugs in filtering domains when traversing pci devices
  - Reformat some hardcoded codes, such as probing the pci device in pci_setup
Zihan Yang (3):
  fw/pciinit: Recognize pxb-pcie-dev device
  pci_device: Add pci domain support
  pci: filter undesired domain when traversing pci
 src/fw/coreboot.c    |   2 +-
 src/fw/csm.c         |   2 +-
 src/fw/mptable.c     |   1 +
 src/fw/paravirt.c    |   3 +-
 src/fw/pciinit.c     | 276 ++++++++++++++++++++++++++++++--------------------
 src/hw/ahci.c        |   1 +
 src/hw/ata.c         |   1 +
 src/hw/esp-scsi.c    |   1 +
 src/hw/lsi-scsi.c    |   1 +
 src/hw/megasas.c     |   1 +
 src/hw/mpt-scsi.c    |   1 +
 src/hw/nvme.c        |   1 +
 src/hw/pci.c         |  69 +++++++------
 src/hw/pci.h         |  42 +++++---
 src/hw/pci_ids.h     |   6 +-
 src/hw/pcidevice.c   |  11 +-
 src/hw/pcidevice.h   |   8 +-
 src/hw/pvscsi.c      |   1 +
 src/hw/sdcard.c      |   1 +
 src/hw/usb-ehci.c    |   1 +
 src/hw/usb-ohci.c    |   1 +
 src/hw/usb-uhci.c    |   1 +
 src/hw/usb-xhci.c    |   1 +
 src/hw/virtio-blk.c  |   1 +
 src/hw/virtio-scsi.c |   1 +
 src/optionroms.c     |   3 +
 26 files changed, 268 insertions(+), 170 deletions(-)
QEMU q35 uses pxb-pcie-dev to enable multiple host bridges. This patch recognizes such devices in seabios and adds the corresponding e820 entry.
The MCFG base and size are already set up in QEMU; we just need to read them.
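The read-only approach can be sketched as follows. This is a minimal, standalone illustration and not the seabios code: the register layout (base low dword, base high dword, then size) mirrors the three reads in the patch, while struct mcfg_window and read_mcfg_window are hypothetical names, and the regs array stands in for values that would come from pci_config_readl().

```c
#include <stdint.h>

/* Combine the two 32-bit halves of a 64-bit register value. */
static uint64_t combine_u64(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical view of what is read back from the pxb host bridge. */
struct mcfg_window {
    uint64_t base;  /* MMCONFIG base address chosen by QEMU */
    uint32_t size;  /* MMCONFIG window size chosen by QEMU */
};

/* regs[0] = base low dword, regs[1] = base high dword, regs[2] = size.
 * In firmware these three values would be config space reads. */
static struct mcfg_window read_mcfg_window(const uint32_t regs[3])
{
    struct mcfg_window w;
    w.base = combine_u64(regs[0], regs[1]);
    w.size = regs[2];
    return w;
}
```

The firmware would then only reserve the range (base, size) in e820 without writing anything back to the device.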
Signed-off-by: Zihan Yang <whois.zihan.yang@gmail.com>
---
 src/fw/paravirt.c |  1 -
 src/fw/pciinit.c  | 17 +++++++++++++++++
 src/hw/pci_ids.h  |  1 +
 3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/src/fw/paravirt.c b/src/fw/paravirt.c
index 0770c47..6b14542 100644
--- a/src/fw/paravirt.c
+++ b/src/fw/paravirt.c
@@ -197,7 +197,6 @@ qemu_platform_setup(void)
         if (!loader_err)
             warn_internalerror();
     }
-
     acpi_setup();
 }
diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c
index 3a2f747..6e6a434 100644
--- a/src/fw/pciinit.c
+++ b/src/fw/pciinit.c
@@ -507,11 +507,28 @@ static void mch_mem_addr_setup(struct pci_device *dev, void *arg)
     pci_io_low_end = acpi_pm_base;
 }

+static void pxb_mem_addr_setup(struct pci_device *dev, void *arg)
+{
+    union u64_u32_u mcfg_base;
+    mcfg_base.lo = pci_config_readl(dev->bdf, Q35_HOST_BRIDGE_PCIEXBAR);
+    mcfg_base.hi = pci_config_readl(dev->bdf, Q35_HOST_BRIDGE_PCIEXBAR + 4);
+
+    // Fix me! Use another meaningful macro
+    u32 mcfg_size = pci_config_readl(dev->bdf, Q35_HOST_BRIDGE_PCIEXBAR + 8);
+
+    /* Skip config write here as the qemu-level objects are already setup, we
+     * read mcfg_base and mcfg_size from it just now. Instead, we directly add
+     * this item to e820 */
+    e820_add(mcfg_base.val, mcfg_size, E820_RESERVED);
+}
+
 static const struct pci_device_id pci_platform_tbl[] = {
     PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82441,
                i440fx_mem_addr_setup),
     PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_Q35_MCH,
                mch_mem_addr_setup),
+    PCI_DEVICE(PCI_VENDOR_ID_REDHAT, PCI_DEVICE_ID_REDHAT_PXB_HOST,
+               pxb_mem_addr_setup),
     PCI_DEVICE_END
 };
diff --git a/src/hw/pci_ids.h b/src/hw/pci_ids.h
index 38fa2ca..35096ea 100644
--- a/src/hw/pci_ids.h
+++ b/src/hw/pci_ids.h
@@ -2265,6 +2265,7 @@

 #define PCI_VENDOR_ID_REDHAT            0x1b36
 #define PCI_DEVICE_ID_REDHAT_ROOT_PORT  0x000C
+#define PCI_DEVICE_ID_REDHAT_PXB_HOST   0x000B

 #define PCI_VENDOR_ID_TEKRAM            0x1de1
 #define PCI_DEVICE_ID_TEKRAM_DC290      0xdc29
On 08/09/2018 08:43 AM, Zihan Yang wrote:
No need for this chunk.
That is indeed strange. If I got it right, this is the address of Q35 host bridge, we want other addresses here.
So did QEMU configure the address or only the size?
Will this recognize only the pxb-pcie bridge, or also the legacy pxb one? Because we plan to support only the pxb-pcie version.
Thanks, Marcel
On Sat, Aug 25, 2018 at 4:11 PM, Marcel Apfelbaum <marcel.apfelbaum@gmail.com> wrote:
I will try to find another address in the next version.
QEMU configures both the address and the size. Since the base address is right behind ram_over_4g and the size depends on the number of buses we want to use, they are not fixed anymore. So I just let QEMU configure them both, and seabios only reads them out. Maybe after we resolve the design issue, we can truly let seabios decide the base address and size itself.
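To make the dependence on the bus count concrete: in the PCIe enhanced configuration access mechanism (ECAM), each function has 4 KiB of config space, so one bus (32 devices with 8 functions each) needs 1 MiB. The sketch below computes the window size and the offset of a function inside it; the helper names are hypothetical and this is not how QEMU or seabios necessarily structure the calculation.

```c
#include <stdint.h>

#define ECAM_BYTES_PER_FUNCTION 4096u  /* 4 KiB of config space per function */
#define FUNCTIONS_PER_DEVICE    8u
#define DEVICES_PER_BUS         32u

/* Size in bytes of an ECAM (MMCONFIG) window that covers nr_buses buses. */
static uint64_t ecam_window_size(uint32_t nr_buses)
{
    return (uint64_t)nr_buses * DEVICES_PER_BUS * FUNCTIONS_PER_DEVICE
           * ECAM_BYTES_PER_FUNCTION;
}

/* Offset of a register inside the window: bus[27:20] dev[19:15] fn[14:12]. */
static uint64_t ecam_offset(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t reg)
{
    return ((uint64_t)bus << 20) | ((uint64_t)(dev & 0x1f) << 15)
           | ((uint64_t)(fn & 0x7) << 12) | (reg & 0xfff);
}
```

A host bridge that exposes all 256 buses therefore needs a 256 MiB window, while one limited to a few buses needs only a few MiB, which is why the size is no longer a fixed constant.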
The value is 0x000b, which is pxb-pcie in qemu. The corresponding macro is PCI_DEVICE_ID_REDHAT_PXB_PCIE. The value for PXB is 0x0009, and is defined with another macro PCI_DEVICE_ID_REDHAT_PXB.
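A small sketch of the distinction, using the vendor ID from the patch and the two device IDs quoted above. The function name is_pxb_pcie_host is hypothetical; only the numeric values come from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_VENDOR_ID_REDHAT           0x1b36
#define PCI_DEVICE_ID_REDHAT_PXB       0x0009  /* legacy pxb */
#define PCI_DEVICE_ID_REDHAT_PXB_PCIE  0x000b  /* pxb-pcie */

/* Return true only for the pxb-pcie host bridge, not the legacy pxb. */
static bool is_pxb_pcie_host(uint16_t vendor, uint16_t device)
{
    return vendor == PCI_VENDOR_ID_REDHAT
        && device == PCI_DEVICE_ID_REDHAT_PXB_PCIE;
}
```

Because the table entry matches on 0x000b only, a legacy pxb device (0x0009) would not trigger the MCFG setup path.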
Most of seabios assumes only PCI domain 0. This patch adds support for multiple domains in pci devices, which involves some API changes.
For compatibility, interfaces such as pci_config_read[b|w|l] still exist, so existing domain 0 devices need no modification. Whenever a device resides in a different domain, callers should use the *_dom variants of the above functions, e.g. pci_config_readl_dom(..., domain_nr), to read through a specific host bridge other than the q35 host bridge. Also, users of the foreachpci() macro should check each device's domain to filter out undesired devices that reside in a different domain.
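The filtering rule can be illustrated with a standalone sketch. struct pci_dev below is a hypothetical, trimmed-down stand-in for seabios' struct pci_device, and a plain linked-list walk replaces the foreachpci() macro; only the domain_nr check mirrors what the patch adds to each traversal.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for seabios' struct pci_device. */
struct pci_dev {
    uint16_t bdf;
    int domain_nr;
    struct pci_dev *next;
};

/* Count the devices that belong to one domain, skipping all others. */
static int count_devices_in_domain(const struct pci_dev *head, int domain_nr)
{
    int n = 0;
    for (const struct pci_dev *pci = head; pci; pci = pci->next) {
        if (pci->domain_nr != domain_nr)
            continue;   /* the same bdf may also exist in another domain */
        n++;
    }
    return n;
}
```

The check matters because a bdf is only unique within a domain: without it, a loop meant for domain 0 would also touch a device with the same bdf behind a pxb host.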
Signed-off-by: Zihan Yang <whois.zihan.yang@gmail.com>
---
 src/fw/coreboot.c  |   2 +-
 src/fw/csm.c       |   2 +-
 src/fw/paravirt.c  |   2 +-
 src/fw/pciinit.c   | 261 ++++++++++++++++++++++++++++++-----------------------
 src/hw/pci.c       |  69 +++++++-------
 src/hw/pci.h       |  42 ++++++---
 src/hw/pci_ids.h   |   7 +-
 src/hw/pcidevice.c |   8 +-
 src/hw/pcidevice.h |   4 +-
 9 files changed, 227 insertions(+), 170 deletions(-)
diff --git a/src/fw/coreboot.c b/src/fw/coreboot.c index 7c0954b..c955dfd 100644 --- a/src/fw/coreboot.c +++ b/src/fw/coreboot.c @@ -254,7 +254,7 @@ coreboot_platform_setup(void) { if (!CONFIG_COREBOOT) return; - pci_probe_devices(); + pci_probe_devices(0);
struct cb_memory *cbm = CBMemTable; if (!cbm) diff --git a/src/fw/csm.c b/src/fw/csm.c index 03b4bb8..e94f614 100644 --- a/src/fw/csm.c +++ b/src/fw/csm.c @@ -63,7 +63,7 @@ static void csm_maininit(struct bregs *regs) { interface_init(); - pci_probe_devices(); + pci_probe_devices(0);
csm_compat_table.PnPInstallationCheckSegment = SEG_BIOS; csm_compat_table.PnPInstallationCheckOffset = get_pnp_offset(); diff --git a/src/fw/paravirt.c b/src/fw/paravirt.c index 6b14542..ef4d487 100644 --- a/src/fw/paravirt.c +++ b/src/fw/paravirt.c @@ -155,7 +155,7 @@ qemu_platform_setup(void) return;
if (runningOnXen()) { - pci_probe_devices(); + pci_probe_devices(0); xen_hypercall_setup(); xen_biostable_setup(); return; diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index 6e6a434..fcdcd38 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -51,6 +51,7 @@ u64 pcimem_end = BUILD_PCIMEM_END; u64 pcimem64_start = BUILD_PCIMEM64_START; u64 pcimem64_end = BUILD_PCIMEM64_END; u64 pci_io_low_end = 0xa000; +u64 pxb_mcfg_size = 0;
struct pci_region_entry { struct pci_device *dev; @@ -88,9 +89,9 @@ static void pci_set_io_region_addr(struct pci_device *pci, int bar, u64 addr, int is64) { u32 ofs = pci_bar(pci, bar); - pci_config_writel(pci->bdf, ofs, addr); + pci_config_writel_dom(pci->bdf, ofs, addr, pci->domain_nr); if (is64) - pci_config_writel(pci->bdf, ofs + 4, addr >> 32); + pci_config_writel_dom(pci->bdf, ofs + 4, addr >> 32, pci->domain_nr); }
@@ -405,25 +406,29 @@ static void pci_bios_init_device(struct pci_device *pci)
/* map the interrupt */ u16 bdf = pci->bdf; - int pin = pci_config_readb(bdf, PCI_INTERRUPT_PIN); + int pin = pci_config_readb_dom(bdf, PCI_INTERRUPT_PIN, pci->domain_nr); if (pin != 0) - pci_config_writeb(bdf, PCI_INTERRUPT_LINE, pci_slot_get_irq(pci, pin)); + pci_config_writeb_dom(bdf, PCI_INTERRUPT_LINE, pci_slot_get_irq(pci, pin), + pci->domain_nr);
pci_init_device(pci_device_tbl, pci, NULL);
/* enable memory mappings */ - pci_config_maskw(bdf, PCI_COMMAND, 0, - PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_SERR); + pci_config_maskw_dom(bdf, PCI_COMMAND, 0, + PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_SERR, + pci->domain_nr); /* enable SERR# for forwarding */ if (pci->header_type & PCI_HEADER_TYPE_BRIDGE) - pci_config_maskw(bdf, PCI_BRIDGE_CONTROL, 0, - PCI_BRIDGE_CTL_SERR); + pci_config_maskw_dom(bdf, PCI_BRIDGE_CONTROL, 0, + PCI_BRIDGE_CTL_SERR, pci->domain_nr); }
-static void pci_bios_init_devices(void) +static void pci_bios_init_devices(int domain_nr) { struct pci_device *pci; foreachpci(pci) { + if (pci->domain_nr != domain_nr) + continue; pci_bios_init_device(pci); } } @@ -520,6 +525,10 @@ static void pxb_mem_addr_setup(struct pci_device *dev, void *arg) * read mcfg_base and mcfg_size from it just now. Instead, we directly add * this item to e820 */ e820_add(mcfg_base.val, mcfg_size, E820_RESERVED); + + /* Add PXBHosts so that we can initialize them later */ + ++PXBHosts; + pxb_mcfg_size += mcfg_size; }
static const struct pci_device_id pci_platform_tbl[] = { @@ -532,27 +541,31 @@ static const struct pci_device_id pci_platform_tbl[] = { PCI_DEVICE_END };
-static void pci_bios_init_platform(void) +static void pci_bios_init_platform(int domain_nr) { struct pci_device *pci; foreachpci(pci) { + if (pci->domain_nr != domain_nr) + continue; pci_init_device(pci_platform_tbl, pci, NULL); } }
-static u8 pci_find_resource_reserve_capability(u16 bdf) +static u8 pci_find_resource_reserve_capability(u16 bdf, int domain_nr) { - if (pci_config_readw(bdf, PCI_VENDOR_ID) == PCI_VENDOR_ID_REDHAT && - pci_config_readw(bdf, PCI_DEVICE_ID) == - PCI_DEVICE_ID_REDHAT_ROOT_PORT) { + if (pci_config_readw_dom(bdf, PCI_VENDOR_ID, domain_nr) == PCI_VENDOR_ID_REDHAT && + (pci_config_readw_dom(bdf, PCI_DEVICE_ID, domain_nr) == + PCI_DEVICE_ID_REDHAT_ROOT_PORT || + pci_config_readw_dom(bdf, PCI_DEVICE_ID, domain_nr) == + PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE)) { u8 cap = 0; do { - cap = pci_find_capability(bdf, PCI_CAP_ID_VNDR, cap); + cap = pci_find_capability_dom(bdf, PCI_CAP_ID_VNDR, cap, domain_nr); } while (cap && - pci_config_readb(bdf, cap + PCI_CAP_REDHAT_TYPE_OFFSET) != + pci_config_readb_dom(bdf, cap + PCI_CAP_REDHAT_TYPE_OFFSET, domain_nr) != REDHAT_CAP_RESOURCE_RESERVE); if (cap) { - u8 cap_len = pci_config_readb(bdf, cap + PCI_CAP_FLAGS); + u8 cap_len = pci_config_readb_dom(bdf, cap + PCI_CAP_FLAGS, domain_nr); if (cap_len < RES_RESERVE_CAP_SIZE) { dprintf(1, "PCI: QEMU resource reserve cap length %d is invalid\n", cap_len); @@ -570,7 +583,7 @@ static u8 pci_find_resource_reserve_capability(u16 bdf) ****************************************************************/
static void -pci_bios_init_bus_rec(int bus, u8 *pci_bus) +pci_bios_init_bus_rec(int bus, u8 *pci_bus, int domain_nr) { int bdf; u16 class; @@ -578,54 +591,54 @@ pci_bios_init_bus_rec(int bus, u8 *pci_bus) dprintf(1, "PCI: %s bus = 0x%x\n", __func__, bus);
/* prevent accidental access to unintended devices */ - foreachbdf(bdf, bus) { - class = pci_config_readw(bdf, PCI_CLASS_DEVICE); + foreachbdf_dom(bdf, bus, domain_nr) { + class = pci_config_readw_dom(bdf, PCI_CLASS_DEVICE, domain_nr); if (class == PCI_CLASS_BRIDGE_PCI) { - pci_config_writeb(bdf, PCI_SECONDARY_BUS, 255); - pci_config_writeb(bdf, PCI_SUBORDINATE_BUS, 0); + pci_config_writeb_dom(bdf, PCI_SECONDARY_BUS, 255, domain_nr); + pci_config_writeb_dom(bdf, PCI_SUBORDINATE_BUS, 0, domain_nr); } }
- foreachbdf(bdf, bus) { - class = pci_config_readw(bdf, PCI_CLASS_DEVICE); + foreachbdf_dom(bdf, bus, domain_nr) { + class = pci_config_readw_dom(bdf, PCI_CLASS_DEVICE, domain_nr); if (class != PCI_CLASS_BRIDGE_PCI) { continue; } dprintf(1, "PCI: %s bdf = 0x%x\n", __func__, bdf);
- u8 pribus = pci_config_readb(bdf, PCI_PRIMARY_BUS); + u8 pribus = pci_config_readb_dom(bdf, PCI_PRIMARY_BUS, domain_nr); if (pribus != bus) { dprintf(1, "PCI: primary bus = 0x%x -> 0x%x\n", pribus, bus); - pci_config_writeb(bdf, PCI_PRIMARY_BUS, bus); + pci_config_writeb_dom(bdf, PCI_PRIMARY_BUS, bus, domain_nr); } else { dprintf(1, "PCI: primary bus = 0x%x\n", pribus); }
- u8 secbus = pci_config_readb(bdf, PCI_SECONDARY_BUS); + u8 secbus = pci_config_readb_dom(bdf, PCI_SECONDARY_BUS, domain_nr); (*pci_bus)++; if (*pci_bus != secbus) { dprintf(1, "PCI: secondary bus = 0x%x -> 0x%x\n", secbus, *pci_bus); secbus = *pci_bus; - pci_config_writeb(bdf, PCI_SECONDARY_BUS, secbus); + pci_config_writeb_dom(bdf, PCI_SECONDARY_BUS, secbus, domain_nr); } else { dprintf(1, "PCI: secondary bus = 0x%x\n", secbus); }
/* set to max for access to all subordinate buses. later set it to accurate value */ - u8 subbus = pci_config_readb(bdf, PCI_SUBORDINATE_BUS); - pci_config_writeb(bdf, PCI_SUBORDINATE_BUS, 255); + u8 subbus = pci_config_readb_dom(bdf, PCI_SUBORDINATE_BUS, domain_nr); + pci_config_writeb_dom(bdf, PCI_SUBORDINATE_BUS, 255, domain_nr);
- pci_bios_init_bus_rec(secbus, pci_bus); + pci_bios_init_bus_rec(secbus, pci_bus, domain_nr);
if (subbus != *pci_bus) { u8 res_bus = *pci_bus; - u8 cap = pci_find_resource_reserve_capability(bdf); + u8 cap = pci_find_resource_reserve_capability(bdf, domain_nr);
if (cap) { - u32 tmp_res_bus = pci_config_readl(bdf, - cap + RES_RESERVE_BUS_RES); + u32 tmp_res_bus = pci_config_readl_dom(bdf, + cap + RES_RESERVE_BUS_RES, domain_nr); if (tmp_res_bus != (u32)-1) { res_bus = tmp_res_bus & 0xFF; if ((u8)(res_bus + secbus) < secbus || @@ -648,44 +661,43 @@ pci_bios_init_bus_rec(int bus, u8 *pci_bus) } else { dprintf(1, "PCI: subordinate bus = 0x%x\n", subbus); } - pci_config_writeb(bdf, PCI_SUBORDINATE_BUS, subbus); + pci_config_writeb_dom(bdf, PCI_SUBORDINATE_BUS, subbus, domain_nr); } }
static void -pci_bios_init_bus(void) +pci_bios_init_bus(int domain_nr) { - u8 extraroots = romfile_loadint("etc/extra-pci-roots", 0); u8 pci_bus = 0;
- pci_bios_init_bus_rec(0 /* host bus */, &pci_bus); + pci_bios_init_bus_rec(0 /* host bus */, &pci_bus, domain_nr);
- if (extraroots) { + if (domain_nr) { while (pci_bus < 0xff) { pci_bus++; - pci_bios_init_bus_rec(pci_bus, &pci_bus); + pci_bios_init_bus_rec(pci_bus, &pci_bus, domain_nr); } } }
- /**************************************************************** * Bus sizing ****************************************************************/
static void pci_bios_get_bar(struct pci_device *pci, int bar, - int *ptype, u64 *psize, int *pis64) + int *ptype, u64 *psize, int *pis64, + int domain_nr) { u32 ofs = pci_bar(pci, bar); u16 bdf = pci->bdf; - u32 old = pci_config_readl(bdf, ofs); + u32 old = pci_config_readl_dom(bdf, ofs, domain_nr); int is64 = 0, type = PCI_REGION_TYPE_MEM; u64 mask;
if (bar == PCI_ROM_SLOT) { mask = PCI_ROM_ADDRESS_MASK; - pci_config_writel(bdf, ofs, mask); + pci_config_writel_dom(bdf, ofs, mask, domain_nr); } else { if (old & PCI_BASE_ADDRESS_SPACE_IO) { mask = PCI_BASE_ADDRESS_IO_MASK; @@ -697,15 +709,15 @@ pci_bios_get_bar(struct pci_device *pci, int bar, is64 = ((old & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64); } - pci_config_writel(bdf, ofs, ~0); + pci_config_writel_dom(bdf, ofs, ~0, domain_nr); } - u64 val = pci_config_readl(bdf, ofs); - pci_config_writel(bdf, ofs, old); + u64 val = pci_config_readl_dom(bdf, ofs, domain_nr); + pci_config_writel_dom(bdf, ofs, old, domain_nr); if (is64) { - u32 hold = pci_config_readl(bdf, ofs + 4); - pci_config_writel(bdf, ofs + 4, ~0); - u32 high = pci_config_readl(bdf, ofs + 4); - pci_config_writel(bdf, ofs + 4, hold); + u32 hold = pci_config_readl_dom(bdf, ofs + 4, domain_nr); + pci_config_writel_dom(bdf, ofs + 4, ~0, domain_nr); + u32 high = pci_config_readl_dom(bdf, ofs + 4, domain_nr); + pci_config_writel_dom(bdf, ofs + 4, hold, domain_nr); val |= ((u64)high << 32); mask |= ((u64)0xffffffff << 32); *psize = (~(val & mask)) + 1; @@ -717,15 +729,20 @@ pci_bios_get_bar(struct pci_device *pci, int bar, }
static int pci_bios_bridge_region_is64(struct pci_region *r, - struct pci_device *pci, int type) + struct pci_device *pci, int type, + int domain_nr) { if (type != PCI_REGION_TYPE_PREFMEM) return 0; - u32 pmem = pci_config_readl(pci->bdf, PCI_PREF_MEMORY_BASE); + u32 pmem = pci_config_readl_dom(pci->bdf, PCI_PREF_MEMORY_BASE, + domain_nr); if (!pmem) { - pci_config_writel(pci->bdf, PCI_PREF_MEMORY_BASE, 0xfff0fff0); - pmem = pci_config_readl(pci->bdf, PCI_PREF_MEMORY_BASE); - pci_config_writel(pci->bdf, PCI_PREF_MEMORY_BASE, 0x0); + pci_config_writel_dom(pci->bdf, PCI_PREF_MEMORY_BASE, 0xfff0fff0, + domain_nr); + pmem = pci_config_readl_dom(pci->bdf, PCI_PREF_MEMORY_BASE, + domain_nr); + pci_config_writel_dom(pci->bdf, PCI_PREF_MEMORY_BASE, 0x0, + domain_nr); } if ((pmem & PCI_PREF_RANGE_TYPE_MASK) != PCI_PREF_RANGE_TYPE_64) return 0; @@ -801,13 +818,15 @@ pci_region_create_entry(struct pci_bus *bus, struct pci_device *dev, return entry; }
-static int pci_bus_hotplug_support(struct pci_bus *bus, u8 pcie_cap) +static int pci_bus_hotplug_support(struct pci_bus *bus, u8 pcie_cap, + int domain_nr) { u8 shpc_cap;
if (pcie_cap) { - u16 pcie_flags = pci_config_readw(bus->bus_dev->bdf, - pcie_cap + PCI_EXP_FLAGS); + u16 pcie_flags = pci_config_readw_dom(bus->bus_dev->bdf, + pcie_cap + PCI_EXP_FLAGS, + domain_nr); u8 port_type = ((pcie_flags & PCI_EXP_FLAGS_TYPE) >> (__builtin_ffs(PCI_EXP_FLAGS_TYPE) - 1)); u8 downstream_port = (port_type == PCI_EXP_TYPE_DOWNSTREAM) || @@ -826,7 +845,8 @@ static int pci_bus_hotplug_support(struct pci_bus *bus, u8 pcie_cap) return downstream_port && slot_implemented; }
- shpc_cap = pci_find_capability(bus->bus_dev->bdf, PCI_CAP_ID_SHPC, 0); + shpc_cap = pci_find_capability_dom(bus->bus_dev->bdf, PCI_CAP_ID_SHPC, 0, + domain_nr); return !!shpc_cap; }
@@ -835,7 +855,8 @@ static int pci_bus_hotplug_support(struct pci_bus *bus, u8 pcie_cap) * Note: disables bridge's window registers as a side effect. */ static int pci_bridge_has_region(struct pci_device *pci, - enum pci_region_type region_type) + enum pci_region_type region_type, + int domain_nr) { u8 base;
@@ -851,18 +872,20 @@ static int pci_bridge_has_region(struct pci_device *pci, return 1; }
- pci_config_writeb(pci->bdf, base, 0xFF); + pci_config_writeb_dom(pci->bdf, base, 0xFF, domain_nr);
- return pci_config_readb(pci->bdf, base) != 0; + return pci_config_readb_dom(pci->bdf, base, domain_nr) != 0; }
-static int pci_bios_check_devices(struct pci_bus *busses) +static int pci_bios_check_devices(struct pci_bus *busses, int domain_nr) { dprintf(1, "PCI: check devices\n");
// Calculate resources needed for regular (non-bus) devices. struct pci_device *pci; foreachpci(pci) { + if (pci->domain_nr != domain_nr) + continue; if (pci->class == PCI_CLASS_BRIDGE_PCI) busses[pci->secondary_bus].bus_dev = pci;
@@ -879,7 +902,7 @@ static int pci_bios_check_devices(struct pci_bus *busses) continue; int type, is64; u64 size; - pci_bios_get_bar(pci, i, &type, &size, &is64); + pci_bios_get_bar(pci, i, &type, &size, &is64, domain_nr); if (size == 0) continue;
@@ -909,14 +932,14 @@ static int pci_bios_check_devices(struct pci_bus *busses) parent = &busses[0]; int type; u16 bdf = s->bus_dev->bdf; - u8 pcie_cap = pci_find_capability(bdf, PCI_CAP_ID_EXP, 0); - u8 qemu_cap = pci_find_resource_reserve_capability(bdf); + u8 pcie_cap = pci_find_capability_dom(bdf, PCI_CAP_ID_EXP, 0, domain_nr); + u8 qemu_cap = pci_find_resource_reserve_capability(bdf, domain_nr);
- int hotplug_support = pci_bus_hotplug_support(s, pcie_cap); + int hotplug_support = pci_bus_hotplug_support(s, pcie_cap, domain_nr); for (type = 0; type < PCI_REGION_TYPE_COUNT; type++) { u64 align = (type == PCI_REGION_TYPE_IO) ? PCI_BRIDGE_IO_MIN : PCI_BRIDGE_MEM_MIN; - if (!pci_bridge_has_region(s->bus_dev, type)) + if (!pci_bridge_has_region(s->bus_dev, type, domain_nr)) continue; u64 size = 0; if (qemu_cap) { @@ -924,22 +947,25 @@ static int pci_bios_check_devices(struct pci_bus *busses) u64 tmp_size_64; switch(type) { case PCI_REGION_TYPE_IO: - tmp_size_64 = (pci_config_readl(bdf, qemu_cap + RES_RESERVE_IO) | - (u64)pci_config_readl(bdf, qemu_cap + RES_RESERVE_IO + 4) << 32); + tmp_size_64 = (pci_config_readl_dom(bdf, qemu_cap + RES_RESERVE_IO, domain_nr) | + (u64)pci_config_readl_dom(bdf, qemu_cap + RES_RESERVE_IO + 4, domain_nr) << 32); if (tmp_size_64 != (u64)-1) { size = tmp_size_64; } break; case PCI_REGION_TYPE_MEM: - tmp_size = pci_config_readl(bdf, qemu_cap + RES_RESERVE_MEM); + tmp_size = pci_config_readl_dom(bdf, qemu_cap + RES_RESERVE_MEM, domain_nr); if (tmp_size != (u32)-1) { size = tmp_size; } break; case PCI_REGION_TYPE_PREFMEM: - tmp_size = pci_config_readl(bdf, qemu_cap + RES_RESERVE_PREF_MEM_32); - tmp_size_64 = (pci_config_readl(bdf, qemu_cap + RES_RESERVE_PREF_MEM_64) | - (u64)pci_config_readl(bdf, qemu_cap + RES_RESERVE_PREF_MEM_64 + 4) << 32); + tmp_size = pci_config_readl_dom(bdf, qemu_cap + RES_RESERVE_PREF_MEM_32, + domain_nr); + tmp_size_64 = (pci_config_readl_dom(bdf, qemu_cap + RES_RESERVE_PREF_MEM_64, + domain_nr) | + (u64)pci_config_readl_dom(bdf, qemu_cap + RES_RESERVE_PREF_MEM_64 + 4, + domain_nr) << 32); if (tmp_size != (u32)-1 && tmp_size_64 == (u64)-1) { size = tmp_size; } else if (tmp_size == (u32)-1 && tmp_size_64 != (u64)-1) { @@ -970,7 +996,7 @@ static int pci_bios_check_devices(struct pci_bus *busses) size = ALIGN(sum, align); } int is64 = pci_bios_bridge_region_is64(&s->r[type], - s->bus_dev, type); + s->bus_dev, 
type, domain_nr); // entry->bar is -1 if the entry represents a bridge region struct pci_region_entry *entry = pci_region_create_entry( parent, s->bus_dev, -1, size, align, type, is64); @@ -1048,7 +1074,7 @@ static int pci_bios_init_root_regions_mem(struct pci_bus *bus) #define PCI_PREF_MEMORY_SHIFT 16
static void -pci_region_map_one_entry(struct pci_region_entry *entry, u64 addr) +pci_region_map_one_entry(struct pci_region_entry *entry, u64 addr, int domain_nr) { if (entry->bar >= 0) { dprintf(1, "PCI: map device bdf=%pP" @@ -1063,24 +1089,24 @@ pci_region_map_one_entry(struct pci_region_entry *entry, u64 addr) u16 bdf = entry->dev->bdf; u64 limit = addr + entry->size - 1; if (entry->type == PCI_REGION_TYPE_IO) { - pci_config_writeb(bdf, PCI_IO_BASE, addr >> PCI_IO_SHIFT); - pci_config_writew(bdf, PCI_IO_BASE_UPPER16, 0); - pci_config_writeb(bdf, PCI_IO_LIMIT, limit >> PCI_IO_SHIFT); - pci_config_writew(bdf, PCI_IO_LIMIT_UPPER16, 0); + pci_config_writeb_dom(bdf, PCI_IO_BASE, addr >> PCI_IO_SHIFT, domain_nr); + pci_config_writew_dom(bdf, PCI_IO_BASE_UPPER16, 0, domain_nr); + pci_config_writeb_dom(bdf, PCI_IO_LIMIT, limit >> PCI_IO_SHIFT, domain_nr); + pci_config_writew_dom(bdf, PCI_IO_LIMIT_UPPER16, 0, domain_nr); } if (entry->type == PCI_REGION_TYPE_MEM) { - pci_config_writew(bdf, PCI_MEMORY_BASE, addr >> PCI_MEMORY_SHIFT); - pci_config_writew(bdf, PCI_MEMORY_LIMIT, limit >> PCI_MEMORY_SHIFT); + pci_config_writew_dom(bdf, PCI_MEMORY_BASE, addr >> PCI_MEMORY_SHIFT, domain_nr); + pci_config_writew_dom(bdf, PCI_MEMORY_LIMIT, limit >> PCI_MEMORY_SHIFT, domain_nr); } if (entry->type == PCI_REGION_TYPE_PREFMEM) { - pci_config_writew(bdf, PCI_PREF_MEMORY_BASE, addr >> PCI_PREF_MEMORY_SHIFT); - pci_config_writew(bdf, PCI_PREF_MEMORY_LIMIT, limit >> PCI_PREF_MEMORY_SHIFT); - pci_config_writel(bdf, PCI_PREF_BASE_UPPER32, addr >> 32); - pci_config_writel(bdf, PCI_PREF_LIMIT_UPPER32, limit >> 32); + pci_config_writew_dom(bdf, PCI_PREF_MEMORY_BASE, addr >> PCI_PREF_MEMORY_SHIFT, domain_nr); + pci_config_writew_dom(bdf, PCI_PREF_MEMORY_LIMIT, limit >> PCI_PREF_MEMORY_SHIFT, domain_nr); + pci_config_writel_dom(bdf, PCI_PREF_BASE_UPPER32, addr >> 32, domain_nr); + pci_config_writel_dom(bdf, PCI_PREF_LIMIT_UPPER32, limit >> 32, domain_nr); } }
-static void pci_region_map_entries(struct pci_bus *busses, struct pci_region *r) +static void pci_region_map_entries(struct pci_bus *busses, struct pci_region *r, int domain_nr) { struct hlist_node *n; struct pci_region_entry *entry; @@ -1090,13 +1116,13 @@ static void pci_region_map_entries(struct pci_bus *busses, struct pci_region *r) if (entry->bar == -1) // Update bus base address if entry is a bridge region busses[entry->dev->secondary_bus].r[entry->type].base = addr; - pci_region_map_one_entry(entry, addr); + pci_region_map_one_entry(entry, addr, domain_nr); hlist_del(&entry->node); free(entry); } }
-static void pci_bios_map_devices(struct pci_bus *busses) +static void pci_bios_map_devices(struct pci_bus *busses, int domain_nr) { if (pci_bios_init_root_regions_io(busses)) panic("PCI: out of I/O address space\n"); @@ -1127,13 +1153,13 @@ static void pci_bios_map_devices(struct pci_bus *busses) r64_pref.base = r64_mem.base + sum_mem; r64_pref.base = ALIGN(r64_pref.base, align_pref); r64_pref.base = ALIGN(r64_pref.base, (1LL<<30)); // 1G hugepage - pcimem64_start = r64_mem.base; - pcimem64_end = r64_pref.base + sum_pref; + pcimem64_start = r64_mem.base + pxb_mcfg_size; + pcimem64_end = r64_pref.base + sum_pref + pxb_mcfg_size; pcimem64_end = ALIGN(pcimem64_end, (1LL<<30)); // 1G hugepage dprintf(1, "PCI: 64: %016llx - %016llx\n", pcimem64_start, pcimem64_end);
- pci_region_map_entries(busses, &r64_mem); - pci_region_map_entries(busses, &r64_pref); + pci_region_map_entries(busses, &r64_mem, domain_nr); + pci_region_map_entries(busses, &r64_pref, domain_nr); } else { // no bars mapped high -> drop 64bit window (see dsdt) pcimem64_start = 0; @@ -1143,7 +1169,7 @@ static void pci_bios_map_devices(struct pci_bus *busses) for (bus = 0; bus<=MaxPCIBus; bus++) { int type; for (type = 0; type < PCI_REGION_TYPE_COUNT; type++) - pci_region_map_entries(busses, &busses[bus].r[type]); + pci_region_map_entries(busses, &busses[bus].r[type], domain_nr); } }
@@ -1164,30 +1190,35 @@ pci_setup(void) if (pci_probe_host() != 0) { return; } - pci_bios_init_bus();
- dprintf(1, "=== PCI device probing ===\n"); - pci_probe_devices(); + u8 extraroots = romfile_loadint("etc/extra-pci-roots", 0); + int domain_nr; + /* q35 host is in domain 0, pxb hosts in domain >= 1*/ + for (domain_nr = 0; domain_nr <= extraroots; ++domain_nr) { + pci_bios_init_bus(domain_nr);
- pcimem_start = RamSize; - pci_bios_init_platform(); + dprintf(1, "=== PCI device probing ===\n"); + pci_probe_devices(domain_nr);
- dprintf(1, "=== PCI new allocation pass #1 ===\n"); - struct pci_bus *busses = malloc_tmp(sizeof(*busses) * (MaxPCIBus + 1)); - if (!busses) { - warn_noalloc(); - return; + pcimem_start = RamSize; + pci_bios_init_platform(domain_nr); + + dprintf(1, "=== [domain %d] PCI new allocation pass #1 ===\n", domain_nr); + struct pci_bus *busses = malloc_tmp(sizeof(*busses) * (MaxPCIBus + 1)); + if (!busses) { + warn_noalloc(); + return; + } + memset(busses, 0, sizeof(*busses) * (MaxPCIBus + 1)); + if (pci_bios_check_devices(busses, domain_nr)) + return; + + dprintf(1, "=== [domain %d] PCI new allocation pass #2 ===\n", domain_nr); + pci_bios_map_devices(busses, domain_nr); + + pci_bios_init_devices(domain_nr); + free(busses); } - memset(busses, 0, sizeof(*busses) * (MaxPCIBus + 1)); - if (pci_bios_check_devices(busses)) - return; - - dprintf(1, "=== PCI new allocation pass #2 ===\n"); - pci_bios_map_devices(busses); - - pci_bios_init_devices(); - - free(busses);
pci_enable_default_vga(); } diff --git a/src/hw/pci.c b/src/hw/pci.c index 9855bad..cc1b6ec 100644 --- a/src/hw/pci.c +++ b/src/hw/pci.c @@ -11,72 +11,75 @@ #include "util.h" // udelay #include "x86.h" // outl
-#define PORT_PCI_CMD 0x0cf8 -#define PORT_PCI_DATA 0x0cfc - -void pci_config_writel(u16 bdf, u32 addr, u32 val) +void pci_config_writel_dom(u16 bdf, u32 addr, u32 val, int domain_nr) { - outl(0x80000000 | (bdf << 8) | (addr & 0xfc), PORT_PCI_CMD); - outl(val, PORT_PCI_DATA); + outl(0x80000000 | (bdf << 8) | (addr & 0xfc), + domain_nr ? PORT_PXB_CMD_BASE + ((domain_nr - 1) << 3) : PORT_PCI_CMD); + outl(val, (domain_nr ? PORT_PXB_DATA_BASE + ((domain_nr - 1) << 3) : PORT_PCI_DATA)); }
-void pci_config_writew(u16 bdf, u32 addr, u16 val) +void pci_config_writew_dom(u16 bdf, u32 addr, u16 val, int domain_nr) { - outl(0x80000000 | (bdf << 8) | (addr & 0xfc), PORT_PCI_CMD); - outw(val, PORT_PCI_DATA + (addr & 2)); + outl(0x80000000 | (bdf << 8) | (addr & 0xfc), + domain_nr ? PORT_PXB_CMD_BASE + ((domain_nr - 1) << 3) : PORT_PCI_CMD); + outw(val, (domain_nr ? PORT_PXB_DATA_BASE + ((domain_nr - 1) << 3) : PORT_PCI_DATA) + (addr & 2)); }
-void pci_config_writeb(u16 bdf, u32 addr, u8 val) +void pci_config_writeb_dom(u16 bdf, u32 addr, u8 val, int domain_nr) { - outl(0x80000000 | (bdf << 8) | (addr & 0xfc), PORT_PCI_CMD); - outb(val, PORT_PCI_DATA + (addr & 3)); + outl(0x80000000 | (bdf << 8) | (addr & 0xfc), + domain_nr ? PORT_PXB_CMD_BASE + ((domain_nr - 1) << 3) : PORT_PCI_CMD); + outb(val, (domain_nr ? PORT_PXB_DATA_BASE + ((domain_nr - 1) << 3) : PORT_PCI_DATA) + (addr & 3)); }
-u32 pci_config_readl(u16 bdf, u32 addr) +u32 pci_config_readl_dom(u16 bdf, u32 addr, int domain_nr) { - outl(0x80000000 | (bdf << 8) | (addr & 0xfc), PORT_PCI_CMD); - return inl(PORT_PCI_DATA); + outl(0x80000000 | (bdf << 8) | (addr & 0xfc), + domain_nr ? PORT_PXB_CMD_BASE + ((domain_nr - 1) << 3) : PORT_PCI_CMD); + return inl((domain_nr ? PORT_PXB_DATA_BASE + ((domain_nr - 1) << 3) : PORT_PCI_DATA)); }
-u16 pci_config_readw(u16 bdf, u32 addr) +u16 pci_config_readw_dom(u16 bdf, u32 addr, int domain_nr) { - outl(0x80000000 | (bdf << 8) | (addr & 0xfc), PORT_PCI_CMD); - return inw(PORT_PCI_DATA + (addr & 2)); + outl(0x80000000 | (bdf << 8) | (addr & 0xfc), + domain_nr ? PORT_PXB_CMD_BASE + ((domain_nr - 1) << 3) : PORT_PCI_CMD); + return inw((domain_nr ? PORT_PXB_DATA_BASE + ((domain_nr - 1) << 3) : PORT_PCI_DATA) + (addr & 2)); }
-u8 pci_config_readb(u16 bdf, u32 addr) +u8 pci_config_readb_dom(u16 bdf, u32 addr, int domain_nr) { - outl(0x80000000 | (bdf << 8) | (addr & 0xfc), PORT_PCI_CMD); - return inb(PORT_PCI_DATA + (addr & 3)); + outl(0x80000000 | (bdf << 8) | (addr & 0xfc), + domain_nr ? PORT_PXB_CMD_BASE + ((domain_nr - 1) << 3) : PORT_PCI_CMD); + return inb((domain_nr ? PORT_PXB_DATA_BASE + ((domain_nr - 1) << 3) : PORT_PCI_DATA) + (addr & 3)); }
void -pci_config_maskw(u16 bdf, u32 addr, u16 off, u16 on) +pci_config_maskw_dom(u16 bdf, u32 addr, u16 off, u16 on, int domain_nr) { - u16 val = pci_config_readw(bdf, addr); + u16 val = pci_config_readw_dom(bdf, addr, domain_nr); val = (val & ~off) | on; - pci_config_writew(bdf, addr, val); + pci_config_writew_dom(bdf, addr, val, domain_nr); }
-u8 pci_find_capability(u16 bdf, u8 cap_id, u8 cap) +u8 pci_find_capability_dom(u16 bdf, u8 cap_id, u8 cap, int domain_nr) { int i; - u16 status = pci_config_readw(bdf, PCI_STATUS); + u16 status = pci_config_readw_dom(bdf, PCI_STATUS, domain_nr);
if (!(status & PCI_STATUS_CAP_LIST)) return 0;
if (cap == 0) { /* find first */ - cap = pci_config_readb(bdf, PCI_CAPABILITY_LIST); + cap = pci_config_readb_dom(bdf, PCI_CAPABILITY_LIST, domain_nr); } else { /* find next */ - cap = pci_config_readb(bdf, cap + PCI_CAP_LIST_NEXT); + cap = pci_config_readb_dom(bdf, cap + PCI_CAP_LIST_NEXT, domain_nr); } for (i = 0; cap && i <= 0xff; i++) { - if (pci_config_readb(bdf, cap + PCI_CAP_LIST_ID) == cap_id) + if (pci_config_readb_dom(bdf, cap + PCI_CAP_LIST_ID, domain_nr) == cap_id) return cap; - cap = pci_config_readb(bdf, cap + PCI_CAP_LIST_NEXT); + cap = pci_config_readb_dom(bdf, cap + PCI_CAP_LIST_NEXT, domain_nr); }
return 0; @@ -84,10 +87,10 @@ u8 pci_find_capability(u16 bdf, u8 cap_id, u8 cap)
// Helper function for foreachbdf() macro - return next device int -pci_next(int bdf, int bus) +pci_next_dom(int bdf, int bus, int domain_nr) { if (pci_bdf_to_fn(bdf) == 0 - && (pci_config_readb(bdf, PCI_HEADER_TYPE) & 0x80) == 0) + && (pci_config_readb_dom(bdf, PCI_HEADER_TYPE, domain_nr) & 0x80) == 0) // Last found device wasn't a multi-function device - skip to // the next device. bdf += 8; @@ -98,7 +101,7 @@ pci_next(int bdf, int bus) if (pci_bdf_to_bus(bdf) != bus) return -1;
- u16 v = pci_config_readw(bdf, PCI_VENDOR_ID); + u16 v = pci_config_readw_dom(bdf, PCI_VENDOR_ID, domain_nr); if (v != 0x0000 && v != 0xffff) // Device is present. return bdf; diff --git a/src/hw/pci.h b/src/hw/pci.h index 2e30e28..4381563 100644 --- a/src/hw/pci.h +++ b/src/hw/pci.h @@ -3,7 +3,11 @@
#include "types.h" // u32
+#define PORT_PCI_CMD           0x0cf8
 #define PORT_PCI_REBOOT        0x0cf9
+#define PORT_PCI_DATA          0x0cfc
+#define PORT_PXB_CMD_BASE      0x1000
+#define PORT_PXB_DATA_BASE     0x1004
static inline u8 pci_bdf_to_bus(u16 bdf) { return bdf >> 8; @@ -27,20 +31,34 @@ static inline u16 pci_bus_devfn_to_bdf(int bus, u16 devfn) { return (bus << 8) | devfn; }
-#define foreachbdf(BDF, BUS)                                    \
-    for (BDF=pci_next(pci_bus_devfn_to_bdf((BUS), 0)-1, (BUS))  \
+/* for compatibility */
+#define foreachbdf(BDF, BUS) foreachbdf_dom(BDF, BUS, 0)
+
+#define foreachbdf_dom(BDF, BUS, DOMAIN)                                      \
+    for (BDF=pci_next_dom(pci_bus_devfn_to_bdf((BUS), 0)-1, (BUS), (DOMAIN))  \
          ; BDF >= 0                                             \
-         ; BDF=pci_next(BDF, (BUS)))
+         ; BDF=pci_next_dom(BDF, (BUS), (DOMAIN)))
-void pci_config_writel(u16 bdf, u32 addr, u32 val); -void pci_config_writew(u16 bdf, u32 addr, u16 val); -void pci_config_writeb(u16 bdf, u32 addr, u8 val); -u32 pci_config_readl(u16 bdf, u32 addr); -u16 pci_config_readw(u16 bdf, u32 addr); -u8 pci_config_readb(u16 bdf, u32 addr); -void pci_config_maskw(u16 bdf, u32 addr, u16 off, u16 on); -u8 pci_find_capability(u16 bdf, u8 cap_id, u8 cap); -int pci_next(int bdf, int bus); +#define pci_config_maskw(BDF, ADDR, OFF, ON) pci_config_maskw_dom((BDF), (ADDR), (OFF), (ON), 0) +#define pci_find_capability(BDF, CAP_ID, CAP) pci_find_capability_dom((BDF), (CAP_ID), (CAP), 0) +#define pci_next(BDF, BUS) pci_next_dom((BDF), (BUS), 0) + +#define pci_config_writel(BDF, ADDR, VAL) pci_config_writel_dom((BDF), (ADDR), (VAL), 0) +#define pci_config_writew(BDF, ADDR, VAL) pci_config_writew_dom((BDF), (ADDR), (VAL), 0) +#define pci_config_writeb(BDF, ADDR, VAL) pci_config_writeb_dom((BDF), (ADDR), (VAL), 0) +#define pci_config_readl(BDF, ADDR) pci_config_readl_dom((BDF), (ADDR), 0) +#define pci_config_readw(BDF, ADDR) pci_config_readw_dom((BDF), (ADDR), 0) +#define pci_config_readb(BDF, ADDR) pci_config_readb_dom((BDF), (ADDR), 0) + +void pci_config_writel_dom(u16 bdf, u32 addr, u32 val, int domain_nr); +void pci_config_writew_dom(u16 bdf, u32 addr, u16 val, int domain_nr); +void pci_config_writeb_dom(u16 bdf, u32 addr, u8 val, int domain_nr); +u32 pci_config_readl_dom(u16 bdf, u32 addr, int domain_nr); +u16 pci_config_readw_dom(u16 bdf, u32 addr, int domain_nr); +u8 pci_config_readb_dom(u16 bdf, u32 addr, int domain_nr); +void pci_config_maskw_dom(u16 bdf, u32 addr, u16 off, u16 on, int domain_nr); +u8 pci_find_capability_dom(u16 bdf, u8 cap_id, u8 cap, int domain_nr); +int pci_next_dom(int bdf, int bus, int domain_nr); int pci_probe_host(void); void pci_reboot(void);
diff --git a/src/hw/pci_ids.h b/src/hw/pci_ids.h index 35096ea..1d4ddf6 100644 --- a/src/hw/pci_ids.h +++ b/src/hw/pci_ids.h @@ -2263,9 +2263,10 @@ #define PCI_DEVICE_ID_KORENIX_JETCARDF0 0x1600 #define PCI_DEVICE_ID_KORENIX_JETCARDF1 0x16ff
-#define PCI_VENDOR_ID_REDHAT 0x1b36 -#define PCI_DEVICE_ID_REDHAT_ROOT_PORT 0x000C -#define PCI_DEVICE_ID_REDHAT_PXB_HOST 0x000B +#define PCI_VENDOR_ID_REDHAT 0x1b36 +#define PCI_DEVICE_ID_REDHAT_PXB_HOST 0x000B +#define PCI_DEVICE_ID_REDHAT_ROOT_PORT 0x000C +#define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE 0x000E
#define PCI_VENDOR_ID_TEKRAM 0x1de1 #define PCI_DEVICE_ID_TEKRAM_DC290 0xdc29 diff --git a/src/hw/pcidevice.c b/src/hw/pcidevice.c index 8853cf7..ec21ec1 100644 --- a/src/hw/pcidevice.c +++ b/src/hw/pcidevice.c @@ -15,10 +15,11 @@
struct hlist_head PCIDevices VARVERIFY32INIT; int MaxPCIBus VARFSEG; +int PXBHosts VARFSEG;
// Find all PCI devices and populate PCIDevices linked list. void -pci_probe_devices(void) +pci_probe_devices(int domain_nr) { dprintf(3, "PCI probe\n"); struct pci_device *busdevs[256]; @@ -29,7 +30,7 @@ pci_probe_devices(void) while (bus < 0xff && (bus < MaxPCIBus || rootbuses < extraroots)) { bus++; int bdf; - foreachbdf(bdf, bus) { + foreachbdf_dom(bdf, bus, domain_nr) { // Create new pci_device struct and add to list. struct pci_device *dev = malloc_tmp(sizeof(*dev)); if (!dev) { @@ -56,6 +57,7 @@ pci_probe_devices(void) }
// Populate pci_device info. + dev->domain_nr = domain_nr; dev->bdf = bdf; dev->parent = parent; dev->rootbus = rootbus; @@ -69,7 +71,7 @@ pci_probe_devices(void) dev->header_type = pci_config_readb(bdf, PCI_HEADER_TYPE); u8 v = dev->header_type & 0x7f; if (v == PCI_HEADER_TYPE_BRIDGE || v == PCI_HEADER_TYPE_CARDBUS) { - u8 secbus = pci_config_readb(bdf, PCI_SECONDARY_BUS); + u8 secbus = pci_config_readb_dom(bdf, PCI_SECONDARY_BUS, domain_nr); dev->secondary_bus = secbus; if (secbus > bus && !busdevs[secbus]) busdevs[secbus] = dev; diff --git a/src/hw/pcidevice.h b/src/hw/pcidevice.h index 225d545..951e005 100644 --- a/src/hw/pcidevice.h +++ b/src/hw/pcidevice.h @@ -5,6 +5,7 @@ #include "list.h" // hlist_node
struct pci_device { + u32 domain_nr; u16 bdf; u8 rootbus; struct hlist_node node; @@ -22,6 +23,7 @@ struct pci_device { }; extern struct hlist_head PCIDevices; extern int MaxPCIBus; +extern int PXBHosts;
static inline u32 pci_classprog(struct pci_device *pci) { return (pci->class << 8) | pci->prog_if; @@ -62,7 +64,7 @@ struct pci_device_id { .vendid = 0, \ }
-void pci_probe_devices(void); +void pci_probe_devices(int domain_nr); struct pci_device *pci_find_device(u16 vendid, u16 devid); struct pci_device *pci_find_class(u16 classid); int pci_init_device(const struct pci_device_id *ids
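The port selection done by the `_dom` accessors in this patch can be read as the sketch below. It only computes the CONFIG_ADDRESS value and the command/data port pair and performs no port I/O. The helper names `pci_cfg_cmd_port`, `pci_cfg_data_port` and `pci_cfg_address` are hypothetical and not part of the patch; the port constants are copied from the pci.h hunk.

```c
#include <stdint.h>

/* Constants as defined in the patch's src/hw/pci.h hunk. */
#define PORT_PCI_CMD       0x0cf8
#define PORT_PCI_DATA      0x0cfc
#define PORT_PXB_CMD_BASE  0x1000
#define PORT_PXB_DATA_BASE 0x1004

/* Hypothetical helper: command port for a domain.  Domain 0 keeps 0xcf8;
 * domain N > 0 gets its own pair, spaced 8 bytes apart from 0x1000. */
static uint16_t pci_cfg_cmd_port(int domain_nr)
{
    return domain_nr ? PORT_PXB_CMD_BASE + ((domain_nr - 1) << 3)
                     : PORT_PCI_CMD;
}

/* Hypothetical helper: data port for a domain, including the byte lane
 * selected by the low two bits of the config-space offset. */
static uint16_t pci_cfg_data_port(int domain_nr, uint32_t addr)
{
    uint16_t base = domain_nr ? PORT_PXB_DATA_BASE + ((domain_nr - 1) << 3)
                              : PORT_PCI_DATA;
    return base + (addr & 3);
}

/* Hypothetical helper: value written to the command port (enable bit,
 * bus/device/function, dword-aligned register offset). */
static uint32_t pci_cfg_address(uint16_t bdf, uint32_t addr)
{
    return 0x80000000u | ((uint32_t)bdf << 8) | (addr & 0xfc);
}
```

With these definitions, domain 1 uses 0x1000/0x1004 and domain 2 uses 0x1008/0x100c, so each extra domain consumes one 8-byte window of I/O port space.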
On 08/09/2018 08:43 AM, Zihan Yang wrote:
This is not specific to q35. It is about PCI host bridges other than the main one.
It seems to be a new function, but I can't find its definition. Can you please point me to it?
You may want to split mcfg chunks in a different patch.
It doesn't look like an optimal solution. It should look maybe something like:
foreach(domain) foreach(pci)
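A minimal sketch of that nested traversal, assuming all devices stay on one list and each carries a `domain_nr` as in the patch. The struct, the visitor callback and the function names here are hypothetical stand-ins and not SeaBIOS's `struct pci_device` or `foreachpci`.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pci_device: only the fields needed here. */
struct pci_device_sketch {
    unsigned domain_nr;
    unsigned short bdf;
    struct pci_device_sketch *next;
};

typedef void (*pci_visit_fn)(struct pci_device_sketch *dev, void *arg);

/* Outer loop over domains, inner loop over the single device list.
 * Devices are visited grouped by domain; returns the number visited. */
static int foreach_domain_foreach_pci(struct pci_device_sketch *head,
                                      unsigned ndomains,
                                      pci_visit_fn fn, void *arg)
{
    int visited = 0;
    for (unsigned domain = 0; domain < ndomains; domain++) {
        for (struct pci_device_sketch *dev = head; dev; dev = dev->next) {
            if (dev->domain_nr != domain)
                continue;               /* belongs to another domain */
            fn(dev, arg);
            visited++;
        }
    }
    return visited;
}

/* Small recorder used to observe the visiting order. */
struct visit_log {
    unsigned short bdf[8];
    int count;
};

static void log_visit(struct pci_device_sketch *dev, void *arg)
{
    struct visit_log *log = arg;
    if (log->count < 8)
        log->bdf[log->count++] = dev->bdf;
}
```

The inner loop rescans the whole list once per domain, which is simple but quadratic in the worst case; keeping a per-domain list would avoid that at the cost of touching the `foreachpci` macro.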
Same thing here. We can have multiple PCI root buses in the same PCI domain.
Same here. We traverse all the pci devices for each domain.
Not always true. We may still have pxb hosts residing in PCI domain 0. You can add a new fw_cfg file to give a hint about how many PCI domains we have (like the existing etc/extra-pci-roots), or tweak the current one to carry this information, or maybe add a vendor-specific capability to the pxb-pcie bridge to expose its domain number.
If you want to move the declarations, please do it in a different patch. This one is already too big. But, as I previously mentioned, maybe we don't need legacy IO support, and an MMCFG-only configuration is good enough.
If I understand correctly, the only way SeaBIOS lets us configure the devices is using the 0xcf8/0xcfc registers. Since we don't want at this point to support random IO ports for each PCI domain, maybe we can try a different angle:
We don't have to configure the PCI devices residing in PCI domain > 0. The only drawback is that we won't be able to boot from a PCI device belonging to such a PCI domain, and maybe that is OK.
What we need from SeaBIOS is to 'assign' enough address space for each MMCFG and return their addresses to QEMU. QEMU can then create the ACPI tables and let the guest OS configure the PCI devices.
The problem remains the computation of the actual IO/MEM resources needed by these devices. (Not the MMCFG table). If SeaBIOS can't reach the PCI devices, it can't compute the needed resources, so QEMU can't divide the IO/MEM address space between the PCI domains.
Any ideas would be welcome.
Thanks, Marcel
[...]
On Sat, Aug 25, 2018 at 7:53 PM, Marcel Apfelbaum <marcel.apfelbaum@gmail.com> wrote:
I see, I will correct it.
This is in last patch (1/3), which initializes pxb-pcie devices.
OK, I will split in next version.
OK, I will change that, but all pci devices are linked together regardless of their domain, so such an implementation would also need to modify the foreach macro.
OK.
OK, I will try that in the next version.
I see, I will use only mmio in later patches, after we resolve the design issue.
Yes, the MMCFG of q35 is hardcoded so we can return its address to qemu before loading RSDP, but pxb-pcie host does not have fixed base address or size. Its base address is after ram_above_4g, and the size depends on the desired bus numbers, which is not necessarily 256.
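To make the size dependency concrete: with standard ECAM addressing each bus occupies 1 MiB of MMCFG space (32 devices x 8 functions x 4 KiB), so a full 256-bus window is 256 MiB and a smaller bus range needs proportionally less. The sketch below only illustrates that arithmetic; `mmcfg_size` and `mmcfg_addr` are hypothetical helpers and not functions from SeaBIOS or this series.

```c
#include <stdint.h>

/* Hypothetical helper: bytes of MMCFG space needed to cover nr_buses
 * buses, at 1 MiB per bus. */
static uint64_t mmcfg_size(unsigned nr_buses)
{
    return (uint64_t)nr_buses << 20;
}

/* Hypothetical helper: address of a config register inside an MMCFG
 * window.  bdf is (bus << 8) | devfn as in SeaBIOS, so shifting it by
 * 12 gives the 4 KiB slot of that function; reg is the offset within
 * the function's 4 KiB config space.  This assumes the window starts
 * at bus 0 of the domain. */
static uint64_t mmcfg_addr(uint64_t base, uint16_t bdf, uint16_t reg)
{
    return base + ((uint64_t)bdf << 12) + (reg & 0xfff);
}
```

For example, a host bridge that exposes only 32 buses would need a 32 MiB window rather than the 256 MiB that q35's fixed MMCFG uses.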
If we don't reserve some space for MMCFG, seabios would have to fetch the address and size from qemu through methods like ioport read, but that would make things complicated because we need to use different port for new host buses other than q35 host bus(pcie.0).
Welcomed by me too.
Since pci devices can now reside in different domains, we should check the domain of each pci device when traversing them. Existing callers still use domain 0 for compatibility.
Signed-off-by: Zihan Yang <whois.zihan.yang@gmail.com>
---
 src/fw/mptable.c     |  1 +
 src/fw/pciinit.c     | 10 ++++------
 src/hw/ahci.c        |  1 +
 src/hw/ata.c         |  1 +
 src/hw/esp-scsi.c    |  1 +
 src/hw/lsi-scsi.c    |  1 +
 src/hw/megasas.c     |  1 +
 src/hw/mpt-scsi.c    |  1 +
 src/hw/nvme.c        |  1 +
 src/hw/pcidevice.c   |  3 +++
 src/hw/pcidevice.h   |  4 ++++
 src/hw/pvscsi.c      |  1 +
 src/hw/sdcard.c      |  1 +
 src/hw/usb-ehci.c    |  1 +
 src/hw/usb-ohci.c    |  1 +
 src/hw/usb-uhci.c    |  1 +
 src/hw/usb-xhci.c    |  1 +
 src/hw/virtio-blk.c  |  1 +
 src/hw/virtio-scsi.c |  1 +
 src/optionroms.c     |  3 +++
 20 files changed, 30 insertions(+), 6 deletions(-)
diff --git a/src/fw/mptable.c b/src/fw/mptable.c index 47385cc..3989cb6 100644 --- a/src/fw/mptable.c +++ b/src/fw/mptable.c @@ -110,6 +110,7 @@ mptable_setup(void)
struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); u16 bdf = pci->bdf; if (pci_bdf_to_bus(bdf) != 0) break; diff --git a/src/fw/pciinit.c b/src/fw/pciinit.c index fcdcd38..834540f 100644 --- a/src/fw/pciinit.c +++ b/src/fw/pciinit.c @@ -427,8 +427,7 @@ static void pci_bios_init_devices(int domain_nr) { struct pci_device *pci; foreachpci(pci) { - if (pci->domain_nr != domain_nr) - continue; + filter_domain(pci, domain_nr); pci_bios_init_device(pci); } } @@ -438,6 +437,7 @@ static void pci_enable_default_vga(void) struct pci_device *pci;
foreachpci(pci) { + filter_domain(pci, 0); if (is_pci_vga(pci)) { dprintf(1, "PCI: Using %pP for primary VGA\n", pci); return; @@ -545,8 +545,7 @@ static void pci_bios_init_platform(int domain_nr) { struct pci_device *pci; foreachpci(pci) { - if (pci->domain_nr != domain_nr) - continue; + filter_domain(pci, domain_nr); pci_init_device(pci_platform_tbl, pci, NULL); } } @@ -884,8 +883,7 @@ static int pci_bios_check_devices(struct pci_bus *busses, int domain_nr) // Calculate resources needed for regular (non-bus) devices. struct pci_device *pci; foreachpci(pci) { - if (pci->domain_nr != domain_nr) - continue; + filter_domain(pci, domain_nr); if (pci->class == PCI_CLASS_BRIDGE_PCI) busses[pci->secondary_bus].bus_dev = pci;
diff --git a/src/hw/ahci.c b/src/hw/ahci.c index 1746e7a..f825992 100644 --- a/src/hw/ahci.c +++ b/src/hw/ahci.c @@ -677,6 +677,7 @@ ahci_scan(void) // Scan PCI bus for ATA adapters struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci->class != PCI_CLASS_STORAGE_SATA) continue; if (pci->prog_if != 1 /* AHCI rev 1 */) diff --git a/src/hw/ata.c b/src/hw/ata.c index b6e073c..2273326 100644 --- a/src/hw/ata.c +++ b/src/hw/ata.c @@ -1024,6 +1024,7 @@ ata_scan(void) // Scan PCI bus for ATA adapters struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); pci_init_device(pci_ata_tbl, pci, NULL); } } diff --git a/src/hw/esp-scsi.c b/src/hw/esp-scsi.c index ffd86d0..17436d5 100644 --- a/src/hw/esp-scsi.c +++ b/src/hw/esp-scsi.c @@ -233,6 +233,7 @@ esp_scsi_setup(void)
struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci->vendor != PCI_VENDOR_ID_AMD || pci->device != PCI_DEVICE_ID_AMD_SCSI) continue; diff --git a/src/hw/lsi-scsi.c b/src/hw/lsi-scsi.c index d5fc3e4..5748d1f 100644 --- a/src/hw/lsi-scsi.c +++ b/src/hw/lsi-scsi.c @@ -213,6 +213,7 @@ lsi_scsi_setup(void)
struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci->vendor != PCI_VENDOR_ID_LSI_LOGIC || pci->device != PCI_DEVICE_ID_LSI_53C895A) continue; diff --git a/src/hw/megasas.c b/src/hw/megasas.c index d267580..1d84771 100644 --- a/src/hw/megasas.c +++ b/src/hw/megasas.c @@ -386,6 +386,7 @@ megasas_setup(void)
struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci->vendor != PCI_VENDOR_ID_LSI_LOGIC && pci->vendor != PCI_VENDOR_ID_DELL) continue; diff --git a/src/hw/mpt-scsi.c b/src/hw/mpt-scsi.c index 1faede6..e89316b 100644 --- a/src/hw/mpt-scsi.c +++ b/src/hw/mpt-scsi.c @@ -310,6 +310,7 @@ mpt_scsi_setup(void)
struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci->vendor == PCI_VENDOR_ID_LSI_LOGIC && (pci->device == PCI_DEVICE_ID_LSI_53C1030 || pci->device == PCI_DEVICE_ID_LSI_SAS1068 diff --git a/src/hw/nvme.c b/src/hw/nvme.c index e6d739d..d7b5183 100644 --- a/src/hw/nvme.c +++ b/src/hw/nvme.c @@ -633,6 +633,7 @@ nvme_scan(void) struct pci_device *pci;
foreachpci(pci) { + filter_domain(pci, 0); if (pci->class != PCI_CLASS_STORAGE_NVME) continue; if (pci->prog_if != 2 /* as of NVM 1.0e */) { diff --git a/src/hw/pcidevice.c b/src/hw/pcidevice.c index ec21ec1..44dc05a 100644 --- a/src/hw/pcidevice.c +++ b/src/hw/pcidevice.c @@ -91,6 +91,7 @@ pci_find_device(u16 vendid, u16 devid) { struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci->vendor == vendid && pci->device == devid) return pci; } @@ -103,6 +104,7 @@ pci_find_class(u16 classid) { struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci->class == classid) return pci; } @@ -130,6 +132,7 @@ pci_find_init_device(const struct pci_device_id *ids, void *arg) { struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci_init_device(ids, pci, arg) == 0) return pci; } diff --git a/src/hw/pcidevice.h b/src/hw/pcidevice.h index 951e005..4518e77 100644 --- a/src/hw/pcidevice.h +++ b/src/hw/pcidevice.h @@ -32,6 +32,10 @@ static inline u32 pci_classprog(struct pci_device *pci) { #define foreachpci(PCI) \ hlist_for_each_entry(PCI, &PCIDevices, node)
+#define filter_domain(PCI, DOMAIN) \
+    if ((PCI)->domain_nr != (DOMAIN)) \
+        continue;
+
 #define PCI_ANY_ID      (~0)
 struct pci_device_id {
     u32 vendid;
diff --git a/src/hw/pvscsi.c b/src/hw/pvscsi.c
index d62d0a0..d0f6dac 100644
--- a/src/hw/pvscsi.c
+++ b/src/hw/pvscsi.c
@@ -325,6 +325,7 @@ pvscsi_setup(void)
struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci->vendor != PCI_VENDOR_ID_VMWARE || pci->device != PCI_DEVICE_ID_VMWARE_PVSCSI) continue; diff --git a/src/hw/sdcard.c b/src/hw/sdcard.c index 6410340..f3782f2 100644 --- a/src/hw/sdcard.c +++ b/src/hw/sdcard.c @@ -564,6 +564,7 @@ sdcard_setup(void)
struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci->class != PCI_CLASS_SYSTEM_SDHCI || pci->prog_if >= 2) // Not an SDHCI controller following SDHCI spec continue; diff --git a/src/hw/usb-ehci.c b/src/hw/usb-ehci.c index 7eca55b..60b73b2 100644 --- a/src/hw/usb-ehci.c +++ b/src/hw/usb-ehci.c @@ -331,6 +331,7 @@ ehci_setup(void) return; struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci_classprog(pci) == PCI_CLASS_SERIAL_USB_EHCI) ehci_controller_setup(pci); } diff --git a/src/hw/usb-ohci.c b/src/hw/usb-ohci.c index 90f60e6..c25745f 100644 --- a/src/hw/usb-ohci.c +++ b/src/hw/usb-ohci.c @@ -302,6 +302,7 @@ ohci_setup(void) return; struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci_classprog(pci) == PCI_CLASS_SERIAL_USB_OHCI) ohci_controller_setup(pci); } diff --git a/src/hw/usb-uhci.c b/src/hw/usb-uhci.c index 075ed02..f92b417 100644 --- a/src/hw/usb-uhci.c +++ b/src/hw/usb-uhci.c @@ -275,6 +275,7 @@ uhci_setup(void) return; struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci_classprog(pci) == PCI_CLASS_SERIAL_USB_UHCI) uhci_controller_setup(pci); } diff --git a/src/hw/usb-xhci.c b/src/hw/usb-xhci.c index 08d1e32..9293720 100644 --- a/src/hw/usb-xhci.c +++ b/src/hw/usb-xhci.c @@ -631,6 +631,7 @@ xhci_setup(void) return; struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci_classprog(pci) == PCI_CLASS_SERIAL_USB_XHCI) xhci_controller_setup(pci); } diff --git a/src/hw/virtio-blk.c b/src/hw/virtio-blk.c index 88d7e54..2a98303 100644 --- a/src/hw/virtio-blk.c +++ b/src/hw/virtio-blk.c @@ -202,6 +202,7 @@ virtio_blk_setup(void)
struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci->vendor != PCI_VENDOR_ID_REDHAT_QUMRANET || (pci->device != PCI_DEVICE_ID_VIRTIO_BLK_09 && pci->device != PCI_DEVICE_ID_VIRTIO_BLK_10)) diff --git a/src/hw/virtio-scsi.c b/src/hw/virtio-scsi.c index a87cad8..b77680e 100644 --- a/src/hw/virtio-scsi.c +++ b/src/hw/virtio-scsi.c @@ -211,6 +211,7 @@ virtio_scsi_setup(void)
struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci->vendor != PCI_VENDOR_ID_REDHAT_QUMRANET || (pci->device != PCI_DEVICE_ID_VIRTIO_SCSI_09 && pci->device != PCI_DEVICE_ID_VIRTIO_SCSI_10)) diff --git a/src/optionroms.c b/src/optionroms.c index fc992f6..527b7cb 100644 --- a/src/optionroms.c +++ b/src/optionroms.c @@ -350,6 +350,7 @@ optionrom_setup(void) // Find and deploy PCI roms. struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (pci->class == PCI_CLASS_DISPLAY_VGA || pci->class == PCI_CLASS_DISPLAY_OTHER || pci->have_driver) @@ -409,6 +410,7 @@ static void try_setup_display_other(void) dprintf(1, "No VGA found, scan for other display\n");
foreachpci(pci) { + filter_domain(pci, 0); if (pci->class != PCI_CLASS_DISPLAY_OTHER) continue; struct rom_header *rom = map_pcirom(pci); @@ -445,6 +447,7 @@ vgarom_setup(void) // Find and deploy PCI VGA rom. struct pci_device *pci; foreachpci(pci) { + filter_domain(pci, 0); if (!is_pci_vga(pci)) continue; vgahook_setup(pci);
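The pattern used throughout this patch can be exercised in isolation with the sketch below. The macro has the same shape as the patch's `filter_domain()`; the struct and `find_vendor_in_domain` are hypothetical stand-ins for `struct pci_device` and the `pci_find_device()`-style loops. Note that the macro hides a `continue` and already ends in a semicolon, so it is only meaningful as a standalone statement directly inside a loop body.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pci_device. */
struct pci_dev_sketch {
    unsigned domain_nr;
    unsigned short vendor;
    struct pci_dev_sketch *next;
};

/* Same shape as the patch's macro: skip devices from other domains. */
#define filter_domain(PCI, DOMAIN) \
    if ((PCI)->domain_nr != (DOMAIN)) \
        continue;

/* Return the first device with the given vendor ID in one domain,
 * or NULL if that domain has no such device. */
static struct pci_dev_sketch *
find_vendor_in_domain(struct pci_dev_sketch *head,
                      unsigned short vendor, unsigned domain)
{
    struct pci_dev_sketch *pci;
    for (pci = head; pci; pci = pci->next) {
        filter_domain(pci, domain);
        if (pci->vendor == vendor)
            return pci;
    }
    return NULL;
}
```

Passing the domain as a parameter, as here, is one way the domain-0-only lookups could later be generalized without adding another macro.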
Hi,
On 08/09/2018 08:43 AM, Zihan Yang wrote:
This is a necessary addition to support your QEMU patches. Please send a link to the QEMU series on your next re-spin.
That is not necessarily true. As we discussed on the QEMU devel mailing list, it is possible for PCI buses of a different domain to start from a positive bus number. Supporting both bus_nr and domain_nr makes sense.
I would skip support for IO based configuration and use only MMCONFIG for extra root buses.
The question remains: how do we assign MMCONFIG space for each PCI domain.
Thanks, Marcel
On Sat, Aug 25, 2018 at 4:07 PM, Marcel Apfelbaum <marcel.apfelbaum@gmail.com> wrote:
OK, I will attach the link in later QEMU patches.
OK, I think I will still use bus_nr as the start bus when in a separate domain.
Indeed, seabios does not have a fixed MMCONFIG space (except for the q35 host) yet.
Hi,
Allocation-wise it would be easiest to place them above 4G. Right after memory, or after etc/reserved-memory-end (if that fw_cfg file is present), where the 64bit pci bars would have been placed. Move the pci bars up in address space to make room.
Only problem is that seabios wouldn't be able to access mmconfig then.
Placing them below 4G would work at least for a few pci domains. q35 mmconfig bar is placed at 0xb0000000 -> 0xbfffffff, basically for historical reasons. Old qemu versions had 2.75G low memory on q35 (up to 0xafffffff), and I think old machine types still have that for live migration compatibility reasons. Modern qemu uses 2G only, to make gigabyte alignment work.
32bit pci bars are placed above 0xc0000000. The address space from 2G to 2.75G (0x80000000 -> 0xafffffff) is unused on new machine types. Enough room for three additional mmconfig bars (full size), so four pci domains total if you add the q35 one.
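The arithmetic behind "three additional mmconfig bars" can be sketched as follows: the gap from 2G up to the q35 mmconfig bar at 0xb0000000 is 768 MiB, which fits exactly three full-size (256 MiB) windows. The function name and the choice to hand out windows in ascending order by domain number are assumptions made for illustration only; nothing in this thread fixes that assignment.

```c
#include <stdint.h>

#define LOWMEM_END_2G   0x80000000u  /* top of low RAM on modern q35 */
#define Q35_MMCFG_BASE  0xb0000000u  /* existing q35 mmconfig bar */
#define MMCFG_FULL_SIZE 0x10000000u  /* 256 buses * 1 MiB */

/* Hypothetical helper: base of the mmconfig window for a domain when
 * extra windows are packed upward from 2G.  Domain 0 keeps the q35
 * bar.  Returns 0 if the window would run into the q35 bar. */
static uint32_t extra_mmcfg_base(unsigned domain_nr)
{
    if (domain_nr == 0)
        return Q35_MMCFG_BASE;

    uint64_t base = (uint64_t)LOWMEM_END_2G
                  + (uint64_t)(domain_nr - 1) * MMCFG_FULL_SIZE;
    if (base + MMCFG_FULL_SIZE > Q35_MMCFG_BASE)
        return 0;                    /* no room left below the q35 bar */
    return (uint32_t)base;
}
```

Under these assumptions domains 1 to 3 land at 0x80000000, 0x90000000 and 0xa0000000, and a fourth extra domain no longer fits below 4G without shrinking the windows.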
cheers, Gerd