This series introduces a new device - Generic PCI Express to PCI bridge, and also makes all necessary changes to enable hotplug of the bridge itself and any device into the bridge.

Changes v4->v5:
1. Change PCIE-PCI Bridge license (addresses Marcel's comment)
2. The capability layout changes (address Laszlo's comments):
    - separate pref_mem into pref_mem_32 and pref_mem_64 fields
      (SeaBIOS side has the same changes)
    - accordingly change the Generic Root Port's properties
3. Do not add the capability to the root port if no valid values are
   provided (addresses Michael's comment)
4. Rename the capability type to 'RESOURCE_RESERVE' (addresses Marcel's comment)
5. Remove shpc_present check function (addresses Marcel's comment)
6. Fix the 4th patch message (addresses Michael's comment)
7. Patch for SHPC enabling in _OSC method has been already merged

Changes v3->v4:
1. PCIE-PCI Bridge device: "msi_enable"->"msi", "shpc"->"shpc_bar",
   remove local_err, make "msi" property OnOffAuto, shpc_present() is
   still here to avoid SHPC_VMSTATE refactoring (addresses Marcel's comments).
2. Change QEMU PCI capability layout (SeaBIOS side has the same changes):
    - change reservation field types: bus_res - uint32_t, others - uint64_t
    - rename 'non_pref' and 'pref' fields
    - interpret -1 value as 'ignore'
3. Use parent_realize in Generic PCI Express Root Port properly.
4. Fix documentation: fully replace the DMI-PCI bridge references with
   the new PCIE-PCI bridge, "PCIE"->"PCI Express", small mistakes and
   typos - addresses Laszlo's and Marcel's comments.
5. Rename QEMU PCI cap creation function - addresses Marcel's comment.

Changes v2->v3:
(0). 'do_not_use' capability field flag is still _not_ in here since we
     haven't come to consensus on it yet.
1. Merge commits 5 (bus_reserve property creation) and 6 (property usage)
   together - addresses Michael's comment.
2. Add 'bus_reserve' property and QEMU PCI capability only to Generic
   PCIE Root Port - addresses Michael's and Marcel's comments.
3. Change 'bus_reserve' property's default value to 0 - addresses
   Michael's comment.
4. Rename QEMU bridge-specific PCI capability creation function -
   addresses Michael's comment.
5. Init the whole QEMU PCI capability with zeroes - addresses Michael's
   and Laszlo's comments.
6. Change QEMU PCI capability layout (SeaBIOS side has the same changes)
    - add 'type' field to distinguish multiple RedHat-specific
      capabilities - addresses Michael's comment
    - do not mimic PCI Config space register layout, but use mutually
      exclusive differently sized fields for IO and prefetchable memory
      limits - addresses Laszlo's comment
7. Correct error handling in PCIE-PCI bridge realize function.
8. Replace a '2' constant with PCI_CAP_FLAGS in the capability creation
   function - addresses Michael's comment.
9. Remove a comment on _OSC which isn't correct anymore - addresses
   Marcel's comment.
10. Add documentation for the Generic PCIE-PCI Bridge and QEMU PCI
    capability - addresses Michael's comment.

Changes v1->v2:
1. Enable SHPC for the bridge.
2. Enable SHPC support for the Q35 machine (ACPI stuff).
3. Introduce PCI capability to help firmware on the system init.
   This allows the bridge to be hotpluggable. Currently it is supported
   only for pcie-root-port, and it is supposed to be used with SeaBIOS
   only; see the corresponding SeaBIOS series "Allow RedHat PCI bridges
   reserve more buses than necessary during init".

Aleksandr Bezzubikov (4):
  hw/pci: introduce pcie-pci-bridge device
  hw/pci: introduce bridge-only vendor-specific capability to provide
    some hints to firmware
  hw/pci: add QEMU-specific PCI capability to the Generic PCI Express
    Root Port
  docs: update documentation considering PCIE-PCI bridge

 docs/pcie.txt                      |  49 +++++-----
 docs/pcie_pci_bridge.txt           | 115 ++++++++++++++++++++++
 hw/pci-bridge/Makefile.objs        |   2 +-
 hw/pci-bridge/gen_pcie_root_port.c |  36 +++++++
 hw/pci-bridge/pcie_pci_bridge.c    | 192 +++++++++++++++++++++++++++++++++++++
 hw/pci/pci_bridge.c                |  54 +++++++++++
 include/hw/pci/pci.h               |   1 +
 include/hw/pci/pci_bridge.h        |  24 +++++
 include/hw/pci/pcie_port.h         |   1 +
 9 files changed, 450 insertions(+), 24 deletions(-)
 create mode 100644 docs/pcie_pci_bridge.txt
 create mode 100644 hw/pci-bridge/pcie_pci_bridge.c

Introduce a new PCIExpress-to-PCI Bridge device, which is a hot-pluggable PCI Express device and supports devices hot-plug with SHPC.
This device is intended to replace the DMI-to-PCI Bridge.

Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
---
 hw/pci-bridge/Makefile.objs     |   2 +-
 hw/pci-bridge/pcie_pci_bridge.c | 192 ++++++++++++++++++++++++++++++++++++++++
 include/hw/pci/pci.h            |   1 +
 3 files changed, 194 insertions(+), 1 deletion(-)
 create mode 100644 hw/pci-bridge/pcie_pci_bridge.c

diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs
index c4683cf..666db37 100644
--- a/hw/pci-bridge/Makefile.objs
+++ b/hw/pci-bridge/Makefile.objs
@@ -1,4 +1,4 @@
-common-obj-y += pci_bridge_dev.o
+common-obj-y += pci_bridge_dev.o pcie_pci_bridge.o
 common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o
 common-obj-$(CONFIG_PXB) += pci_expander_bridge.o
 common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o
diff --git a/hw/pci-bridge/pcie_pci_bridge.c b/hw/pci-bridge/pcie_pci_bridge.c
new file mode 100644
index 0000000..9aa5cc3
--- /dev/null
+++ b/hw/pci-bridge/pcie_pci_bridge.c
@@ -0,0 +1,192 @@
+/*
+ * QEMU Generic PCIE-PCI Bridge
+ *
+ * Copyright (c) 2017 Aleksandr Bezzubikov
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/shpc.h"
+#include "hw/pci/slotid_cap.h"
+
+typedef struct PCIEPCIBridge {
+    /*< private >*/
+    PCIBridge parent_obj;
+
+    OnOffAuto msi;
+    MemoryRegion shpc_bar;
+    /*< public >*/
+} PCIEPCIBridge;
+
+#define TYPE_PCIE_PCI_BRIDGE_DEV "pcie-pci-bridge"
+#define PCIE_PCI_BRIDGE_DEV(obj) \
+        OBJECT_CHECK(PCIEPCIBridge, (obj), TYPE_PCIE_PCI_BRIDGE_DEV)
+
+static void pcie_pci_bridge_realize(PCIDevice *d, Error **errp)
+{
+    PCIBridge *br = PCI_BRIDGE(d);
+    PCIEPCIBridge *pcie_br = PCIE_PCI_BRIDGE_DEV(d);
+    int rc, pos;
+
+    pci_bridge_initfn(d, TYPE_PCI_BUS);
+
+    d->config[PCI_INTERRUPT_PIN] = 0x1;
+    memory_region_init(&pcie_br->shpc_bar, OBJECT(d), "shpc-bar",
+                       shpc_bar_size(d));
+    rc = shpc_init(d, &br->sec_bus, &pcie_br->shpc_bar, 0, errp);
+    if (rc) {
+        goto error;
+    }
+
+    rc = pcie_cap_init(d, 0, PCI_EXP_TYPE_PCI_BRIDGE, 0, errp);
+    if (rc < 0) {
+        goto cap_error;
+    }
+
+    pos = pci_add_capability(d, PCI_CAP_ID_PM, 0, PCI_PM_SIZEOF, errp);
+    if (pos < 0) {
+        goto pm_error;
+    }
+    d->exp.pm_cap = pos;
+    pci_set_word(d->config + pos + PCI_PM_PMC, 0x3);
+
+    pcie_cap_arifwd_init(d);
+    pcie_cap_deverr_init(d);
+
+    rc = pcie_aer_init(d, PCI_ERR_VER, 0x100, PCI_ERR_SIZEOF, errp);
+    if (rc < 0) {
+        goto aer_error;
+    }
+
+    if (pcie_br->msi != ON_OFF_AUTO_OFF) {
+        rc = msi_init(d, 0, 1, true, true, errp);
+        if (rc < 0) {
+            goto msi_error;
+        }
+    }
+    pci_register_bar(d, 0, PCI_BASE_ADDRESS_SPACE_MEMORY |
+                     PCI_BASE_ADDRESS_MEM_TYPE_64, &pcie_br->shpc_bar);
+    return;
+
+msi_error:
+    pcie_aer_exit(d);
+aer_error:
+pm_error:
+    pcie_cap_exit(d);
+cap_error:
+    shpc_free(d);
+error:
+    pci_bridge_exitfn(d);
+}
+
+static void pcie_pci_bridge_exit(PCIDevice *d)
+{
+    PCIEPCIBridge *bridge_dev = PCIE_PCI_BRIDGE_DEV(d);
+    pcie_cap_exit(d);
+    shpc_cleanup(d, &bridge_dev->shpc_bar);
+    pci_bridge_exitfn(d);
+}
+
+static void pcie_pci_bridge_reset(DeviceState *qdev)
+{
+    PCIDevice *d = PCI_DEVICE(qdev);
+    pci_bridge_reset(qdev);
+    msi_reset(d);
+    shpc_reset(d);
+}
+
+static void pcie_pci_bridge_write_config(PCIDevice *d,
+        uint32_t address, uint32_t val, int len)
+{
+    pci_bridge_write_config(d, address, val, len);
+    msi_write_config(d, address, val, len);
+    shpc_cap_write_config(d, address, val, len);
+}
+
+static Property pcie_pci_bridge_dev_properties[] = {
+        DEFINE_PROP_ON_OFF_AUTO("msi", PCIEPCIBridge, msi, ON_OFF_AUTO_ON),
+        DEFINE_PROP_END_OF_LIST(),
+};
+
+static const VMStateDescription pcie_pci_bridge_dev_vmstate = {
+        .name = TYPE_PCIE_PCI_BRIDGE_DEV,
+        .fields = (VMStateField[]) {
+            VMSTATE_PCI_DEVICE(parent_obj, PCIBridge),
+            SHPC_VMSTATE(shpc, PCIDevice, NULL),
+            VMSTATE_END_OF_LIST()
+        }
+};
+
+static void pcie_pci_bridge_hotplug_cb(HotplugHandler *hotplug_dev,
+        DeviceState *dev, Error **errp)
+{
+    PCIDevice *pci_hotplug_dev = PCI_DEVICE(hotplug_dev);
+
+    if (!shpc_present(pci_hotplug_dev)) {
+        error_setg(errp, "standard hotplug controller has been disabled for "
+                   "this %s", TYPE_PCIE_PCI_BRIDGE_DEV);
+        return;
+    }
+    shpc_device_hotplug_cb(hotplug_dev, dev, errp);
+}
+
+static void pcie_pci_bridge_hot_unplug_request_cb(HotplugHandler *hotplug_dev,
+        DeviceState *dev,
+        Error **errp)
+{
+    PCIDevice *pci_hotplug_dev = PCI_DEVICE(hotplug_dev);
+
+    if (!shpc_present(pci_hotplug_dev)) {
+        error_setg(errp, "standard hotplug controller has been disabled for "
+                   "this %s", TYPE_PCIE_PCI_BRIDGE_DEV);
+        return;
+    }
+    shpc_device_hot_unplug_request_cb(hotplug_dev, dev, errp);
+}
+
+static void pcie_pci_bridge_class_init(ObjectClass *klass, void *data)
+{
+    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(klass);
+
+    k->is_express = 1;
+    k->is_bridge = 1;
+    k->vendor_id = PCI_VENDOR_ID_REDHAT;
+    k->device_id = PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE;
+    k->realize = pcie_pci_bridge_realize;
+    k->exit = pcie_pci_bridge_exit;
+    k->config_write = pcie_pci_bridge_write_config;
+    dc->vmsd = &pcie_pci_bridge_dev_vmstate;
+    dc->props = pcie_pci_bridge_dev_properties;
+    dc->vmsd = &pcie_pci_bridge_dev_vmstate;
+    dc->reset = &pcie_pci_bridge_reset;
+    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+    hc->plug = pcie_pci_bridge_hotplug_cb;
+    hc->unplug_request = pcie_pci_bridge_hot_unplug_request_cb;
+}
+
+static const TypeInfo pcie_pci_bridge_info = {
+        .name = TYPE_PCIE_PCI_BRIDGE_DEV,
+        .parent = TYPE_PCI_BRIDGE,
+        .instance_size = sizeof(PCIEPCIBridge),
+        .class_init = pcie_pci_bridge_class_init,
+        .interfaces = (InterfaceInfo[]) {
+            { TYPE_HOTPLUG_HANDLER },
+            { },
+        }
+};
+
+static void pciepci_register(void)
+{
+    type_register_static(&pcie_pci_bridge_info);
+}
+
+type_init(pciepci_register);
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index e598b09..b33a34f 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -98,6 +98,7 @@
 #define PCI_DEVICE_ID_REDHAT_PXB_PCIE    0x000b
 #define PCI_DEVICE_ID_REDHAT_PCIE_RP     0x000c
 #define PCI_DEVICE_ID_REDHAT_XHCI        0x000d
+#define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE 0x000e
 #define PCI_DEVICE_ID_REDHAT_QXL         0x0100

 #define FMT_PCIBUS                      PRIx64

On 11/08/2017 2:31, Aleksandr Bezzubikov wrote:
> Introduce a new PCIExpress-to-PCI Bridge device, which is a hot-pluggable PCI Express device and supports devices hot-plug with SHPC.
>
> This device is intended to replace the DMI-to-PCI Bridge.
>
> Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>

Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>

Thanks,
Marcel

On PCI init PCI bridges may need some extra info about bus number, IO, memory and prefetchable memory to reserve. QEMU can provide this with a special vendor-specific PCI capability.

Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
---
 hw/pci/pci_bridge.c         | 54 +++++++++++++++++++++++++++++++++++++++++++++
 include/hw/pci/pci_bridge.h | 24 ++++++++++++++++++++
 2 files changed, 78 insertions(+)

diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
index 720119b..2495a51 100644
--- a/hw/pci/pci_bridge.c
+++ b/hw/pci/pci_bridge.c
@@ -408,6 +408,60 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name,
     br->bus_name = bus_name;
 }

+
+int pci_bridge_qemu_reserve_cap_init(PCIDevice *dev, int cap_offset,
+        uint32_t bus_reserve, uint64_t io_reserve,
+        uint32_t mem_non_pref_reserve,
+        uint32_t mem_pref_32_reserve,
+        uint64_t mem_pref_64_reserve,
+        Error **errp)
+{
+    if (mem_pref_32_reserve != (uint32_t)-1 &&
+        mem_pref_64_reserve != (uint64_t) -1) {
+        error_setg(errp,
+                   "PCI resource reserve cap: PREF32 and PREF64 conflict");
+        return -EINVAL;
+    }
+
+    if (bus_reserve == (uint32_t)-1 &&
+        io_reserve == (uint64_t)-1 &&
+        mem_non_pref_reserve == (uint32_t)-1 &&
+        mem_pref_32_reserve == (uint32_t)-1 &&
+        mem_pref_64_reserve == (uint64_t)-1) {
+        return 0;
+    }
+
+    size_t cap_len = sizeof(PCIBridgeQemuCap);
+    PCIBridgeQemuCap cap = {
+            .len = cap_len,
+            .type = REDHAT_PCI_CAP_RESOURCE_RESERVE,
+            .bus_res = bus_reserve,
+            .io = io_reserve,
+            .mem = mem_non_pref_reserve,
+            .mem_pref_32 = (uint32_t)-1,
+            .mem_pref_64 = (uint64_t)-1
+    };
+
+    if (mem_pref_32_reserve != (uint32_t)-1 &&
+        mem_pref_64_reserve == (uint64_t)-1) {
+        cap.mem_pref_32 = mem_pref_32_reserve;
+    } else if (mem_pref_32_reserve == (uint32_t)-1 &&
+        mem_pref_64_reserve != (uint64_t)-1) {
+        cap.mem_pref_64 = mem_pref_64_reserve;
+    }
+
+    int offset = pci_add_capability(dev, PCI_CAP_ID_VNDR,
+                                    cap_offset, cap_len, errp);
+    if (offset < 0) {
+        return offset;
+    }
+
+    memcpy(dev->config + offset + PCI_CAP_FLAGS,
+           (char *)&cap + PCI_CAP_FLAGS,
+           cap_len - PCI_CAP_FLAGS);
+    return 0;
+}
+
 static const TypeInfo pci_bridge_type_info = {
     .name = TYPE_PCI_BRIDGE,
     .parent = TYPE_PCI_DEVICE,
diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
index ff7cbaa..2d8c635 100644
--- a/include/hw/pci/pci_bridge.h
+++ b/include/hw/pci/pci_bridge.h
@@ -67,4 +67,28 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name,
 #define  PCI_BRIDGE_CTL_DISCARD_STATUS 0x400   /* Discard timer status */
 #define  PCI_BRIDGE_CTL_DISCARD_SERR   0x800   /* Discard timer SERR# enable */

+typedef struct PCIBridgeQemuCap {
+    uint8_t id;     /* Standard PCI capability header field */
+    uint8_t next;   /* Standard PCI capability header field */
+    uint8_t len;    /* Standard PCI vendor-specific capability header field */
+    uint8_t type;   /* Red Hat vendor-specific capability type.
+                       Types are defined with REDHAT_PCI_CAP_ prefix */
+
+    uint32_t bus_res;   /* Minimum number of buses to reserve */
+    uint64_t io;        /* IO space to reserve */
+    uint32_t mem;       /* Non-prefetchable memory to reserve */
+    /* This two fields are mutually exclusive */
+    uint32_t mem_pref_32; /* Prefetchable memory to reserve (32-bit MMIO) */
+    uint64_t mem_pref_64; /* Prefetchable memory to reserve (64-bit MMIO) */
+} PCIBridgeQemuCap;
+
+#define REDHAT_PCI_CAP_RESOURCE_RESERVE 1
+
+int pci_bridge_qemu_reserve_cap_init(PCIDevice *dev, int cap_offset,
+        uint32_t bus_reserve, uint64_t io_reserve,
+        uint32_t mem_non_pref_reserve,
+        uint32_t mem_pref_32_reserve,
+        uint64_t mem_pref_64_reserve,
+        Error **errp);
+
 #endif /* QEMU_PCI_BRIDGE_H */

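
To illustrate how a firmware might consume this capability, here is a minimal, self-contained sketch. It is not SeaBIOS code: `ReserveCap`, `find_reserve_cap()`, `read_reserve_cap()` and `buses_to_reserve()` are hypothetical names, the struct only mirrors `PCIBridgeQemuCap` from the patch, and the config space is modeled as a plain 256-byte array. It also assumes the consumer uses the same byte order as the producer, since the patch copies the struct into config space with `memcpy()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE             256
#define CFG_CAP_LIST         0x34 /* capabilities pointer register */
#define CAP_ID_VNDR          0x09 /* vendor-specific capability ID */
#define CAP_TYPE_RES_RESERVE 1    /* mirrors REDHAT_PCI_CAP_RESOURCE_RESERVE */

/* Hypothetical mirror of PCIBridgeQemuCap from the patch. */
typedef struct ReserveCap {
    uint8_t id;
    uint8_t next;
    uint8_t len;
    uint8_t type;
    uint32_t bus_res;
    uint64_t io;
    uint32_t mem;
    uint32_t mem_pref_32;
    uint64_t mem_pref_64;
} ReserveCap;

/* The struct is copied byte for byte, so it must not contain padding. */
static_assert(sizeof(ReserveCap) == 32, "unexpected padding in ReserveCap");

/*
 * Walk the capability list and return the offset of the resource reserve
 * capability, or 0 if there is none. A real firmware would also check that
 * the device's vendor ID is Red Hat before trusting the 'type' byte.
 */
static uint8_t find_reserve_cap(const uint8_t cfg[CFG_SIZE])
{
    uint8_t pos = cfg[CFG_CAP_LIST] & 0xFC; /* pointers are dword aligned */
    int guard = CFG_SIZE / 4;               /* bound the walk if the list loops */

    while (pos >= 0x40 && guard-- > 0) {
        if (cfg[pos] == CAP_ID_VNDR &&
            cfg[pos + 2] >= sizeof(ReserveCap) &&
            cfg[pos + 3] == CAP_TYPE_RES_RESERVE &&
            pos <= CFG_SIZE - sizeof(ReserveCap)) {
            return pos;
        }
        pos = cfg[pos + 1] & 0xFC;
    }
    return 0;
}

/* Copy the capability out of config space; false if it is absent. */
static bool read_reserve_cap(const uint8_t cfg[CFG_SIZE], ReserveCap *out)
{
    uint8_t pos = find_reserve_cap(cfg);

    if (pos == 0) {
        return false;
    }
    memcpy(out, cfg + pos, sizeof(*out));
    return true;
}

/* A hint of -1 (all ones) means "ignore": fall back to the firmware default. */
static uint32_t buses_to_reserve(const ReserveCap *cap, uint32_t fw_default)
{
    return cap->bus_res == UINT32_MAX ? fw_default : cap->bus_res;
}
```

The same "-1 means ignore" test would apply to the `io`, `mem`, `mem_pref_32` and `mem_pref_64` fields, each compared against the all-ones value of its own width.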
On 11/08/2017 2:31, Aleksandr Bezzubikov wrote:
> On PCI init PCI bridges may need some extra info about bus number,
> IO, memory and prefetchable memory to reserve. QEMU can provide this
> with a special vendor-specific PCI capability.
>

Hi Aleksandr,

I only have a few very small comments, other than that it looks OK to me.

> Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
> ---
>  hw/pci/pci_bridge.c         | 54 +++++++++++++++++++++++++++++++++++++++++++++
>  include/hw/pci/pci_bridge.h | 24 ++++++++++++++++++++
>  2 files changed, 78 insertions(+)
>
> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
> index 720119b..2495a51 100644
> --- a/hw/pci/pci_bridge.c
> +++ b/hw/pci/pci_bridge.c
> @@ -408,6 +408,60 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name,
>      br->bus_name = bus_name;
>  }
>
> +
> +int pci_bridge_qemu_reserve_cap_init(PCIDevice *dev, int cap_offset,
> +        uint32_t bus_reserve, uint64_t io_reserve,

Please pay attention to indentation, the above line should be aligned with the above ("

> +        uint32_t mem_non_pref_reserve,
> +        uint32_t mem_pref_32_reserve,
> +        uint64_t mem_pref_64_reserve,
> +        Error **errp)
> +{
> +    if (mem_pref_32_reserve != (uint32_t)-1 &&
> +        mem_pref_64_reserve != (uint64_t) -1) {

Same here

> +        error_setg(errp,
> +                   "PCI resource reserve cap: PREF32 and PREF64 conflict");
> +        return -EINVAL;
> +    }
> +
> +    if (bus_reserve == (uint32_t)-1 &&
> +        io_reserve == (uint64_t)-1 &&
> +        mem_non_pref_reserve == (uint32_t)-1 &&
> +        mem_pref_32_reserve == (uint32_t)-1 &&
> +        mem_pref_64_reserve == (uint64_t)-1) {

and here (please go over all the file)

> +        return 0;
> +    }
> +
> +    size_t cap_len = sizeof(PCIBridgeQemuCap);
> +    PCIBridgeQemuCap cap = {
> +            .len = cap_len,
> +            .type = REDHAT_PCI_CAP_RESOURCE_RESERVE,
> +            .bus_res = bus_reserve,
> +            .io = io_reserve,
> +            .mem = mem_non_pref_reserve,
> +            .mem_pref_32 = (uint32_t)-1,
> +            .mem_pref_64 = (uint64_t)-1

Why not use the values of mem_pref_32_reserve and mem_pref_64_reserve?
You already have checked they are mutually exclusive.

> +    };
> +
> +    if (mem_pref_32_reserve != (uint32_t)-1 &&
> +        mem_pref_64_reserve == (uint64_t)-1) {
> +        cap.mem_pref_32 = mem_pref_32_reserve;
> +    } else if (mem_pref_32_reserve == (uint32_t)-1 &&
> +        mem_pref_64_reserve != (uint64_t)-1) {
> +        cap.mem_pref_64 = mem_pref_64_reserve;
> +    }

So it seems you don't need the above code at all, right?

With the above minor comments, please keep my R-b tag.

Thanks,
Marcel

> +
> +    int offset = pci_add_capability(dev, PCI_CAP_ID_VNDR,
> +                                    cap_offset, cap_len, errp);
> +    if (offset < 0) {
> +        return offset;
> +    }
> +
> +    memcpy(dev->config + offset + PCI_CAP_FLAGS,
> +           (char *)&cap + PCI_CAP_FLAGS,
> +           cap_len - PCI_CAP_FLAGS);
> +    return 0;
> +}
> +
>  static const TypeInfo pci_bridge_type_info = {
>      .name = TYPE_PCI_BRIDGE,
>      .parent = TYPE_PCI_DEVICE,
> diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
> index ff7cbaa..2d8c635 100644
> --- a/include/hw/pci/pci_bridge.h
> +++ b/include/hw/pci/pci_bridge.h
> @@ -67,4 +67,28 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name,
>  #define  PCI_BRIDGE_CTL_DISCARD_STATUS 0x400   /* Discard timer status */
>  #define  PCI_BRIDGE_CTL_DISCARD_SERR   0x800   /* Discard timer SERR# enable */
>
> +typedef struct PCIBridgeQemuCap {
> +    uint8_t id;     /* Standard PCI capability header field */
> +    uint8_t next;   /* Standard PCI capability header field */
> +    uint8_t len;    /* Standard PCI vendor-specific capability header field */
> +    uint8_t type;   /* Red Hat vendor-specific capability type.
> +                       Types are defined with REDHAT_PCI_CAP_ prefix */
> +
> +    uint32_t bus_res;   /* Minimum number of buses to reserve */
> +    uint64_t io;        /* IO space to reserve */
> +    uint32_t mem;       /* Non-prefetchable memory to reserve */
> +    /* This two fields are mutually exclusive */
> +    uint32_t mem_pref_32; /* Prefetchable memory to reserve (32-bit MMIO) */
> +    uint64_t mem_pref_64; /* Prefetchable memory to reserve (64-bit MMIO) */
> +} PCIBridgeQemuCap;
> +
> +#define REDHAT_PCI_CAP_RESOURCE_RESERVE 1
> +
> +int pci_bridge_qemu_reserve_cap_init(PCIDevice *dev, int cap_offset,
> +        uint32_t bus_reserve, uint64_t io_reserve,
> +        uint32_t mem_non_pref_reserve,
> +        uint32_t mem_pref_32_reserve,
> +        uint64_t mem_pref_64_reserve,
> +        Error **errp);
> +
>  #endif /* QEMU_PCI_BRIDGE_H */

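
The reviewer's point about the initializer can be sketched as follows. Once the mutual-exclusion check has passed, at most one of the two prefetchable hints differs from -1, so both arguments can be stored directly and the later if/else becomes redundant. This is only an illustration under assumptions: `ReserveCap` and `reserve_cap_fill()` are hypothetical stand-ins for `PCIBridgeQemuCap` and the first half of `pci_bridge_qemu_reserve_cap_init()`, and the QEMU-specific calls (`error_setg()`, `pci_add_capability()`) are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of PCIBridgeQemuCap from the patch. */
typedef struct ReserveCap {
    uint8_t id;
    uint8_t next;
    uint8_t len;
    uint8_t type;
    uint32_t bus_res;
    uint64_t io;
    uint32_t mem;
    uint32_t mem_pref_32;
    uint64_t mem_pref_64;
} ReserveCap;

/*
 * Returns 1 if *cap was filled in and should be added to the device,
 * 0 if every hint is -1 ("ignore") so no capability is needed,
 * and -EINVAL if both prefetchable hints are set at once.
 */
static int reserve_cap_fill(ReserveCap *cap,
                            uint32_t bus, uint64_t io, uint32_t mem,
                            uint32_t pref32, uint64_t pref64)
{
    assert(cap != NULL);

    if (pref32 != UINT32_MAX && pref64 != UINT64_MAX) {
        return -EINVAL;
    }

    if (bus == UINT32_MAX && io == UINT64_MAX && mem == UINT32_MAX &&
        pref32 == UINT32_MAX && pref64 == UINT64_MAX) {
        return 0;
    }

    /* At most one of pref32/pref64 is not -1 here, so store both as given. */
    *cap = (ReserveCap) {
        .len = sizeof(*cap),
        .type = 1, /* mirrors REDHAT_PCI_CAP_RESOURCE_RESERVE */
        .bus_res = bus,
        .io = io,
        .mem = mem,
        .mem_pref_32 = pref32,
        .mem_pref_64 = pref64,
    };
    return 1;
}
```

The `id` and `next` header bytes are left at zero on purpose: in the patch they are written by `pci_add_capability()`, and the final `memcpy()` skips them by starting at `PCI_CAP_FLAGS`.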
To enable hotplugging of a newly created pcie-pci-bridge, we need to tell the firmware (e.g. SeaBIOS) to reserve additional buses or IO/MEM/PREF space for pcie-root-port. Additional bus reservation allows us to hotplug pcie-pci-bridge into this root port. The number of buses and the IO/MEM/PREF space to reserve are provided to the device via corresponding properties, and to the firmware via a new PCI capability. The properties' default values are -1 to keep the default behavior unchanged.

Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
---
 hw/pci-bridge/gen_pcie_root_port.c | 36 ++++++++++++++++++++++++++++++++++++
 include/hw/pci/pcie_port.h         |  1 +
 2 files changed, 37 insertions(+)

diff --git a/hw/pci-bridge/gen_pcie_root_port.c b/hw/pci-bridge/gen_pcie_root_port.c
index cb694d6..bd65479 100644
--- a/hw/pci-bridge/gen_pcie_root_port.c
+++ b/hw/pci-bridge/gen_pcie_root_port.c
@@ -16,6 +16,8 @@
 #include "hw/pci/pcie_port.h"

 #define TYPE_GEN_PCIE_ROOT_PORT                "pcie-root-port"
+#define GEN_PCIE_ROOT_PORT(obj) \
+        OBJECT_CHECK(GenPCIERootPort, (obj), TYPE_GEN_PCIE_ROOT_PORT)

 #define GEN_PCIE_ROOT_PORT_AER_OFFSET           0x100
 #define GEN_PCIE_ROOT_PORT_MSIX_NR_VECTOR       1
@@ -26,6 +28,13 @@ typedef struct GenPCIERootPort {
     /*< public >*/

     bool migrate_msix;
+
+    /* additional resources to reserve on firmware init */
+    uint32_t bus_reserve;
+    uint64_t io_reserve;
+    uint32_t mem_reserve;
+    uint32_t pref32_reserve;
+    uint64_t pref64_reserve;
 } GenPCIERootPort;

 static uint8_t gen_rp_aer_vector(const PCIDevice *d)
@@ -60,6 +69,24 @@ static bool gen_rp_test_migrate_msix(void *opaque, int version_id)
     return rp->migrate_msix;
 }

+static void gen_rp_realize(DeviceState *dev, Error **errp)
+{
+    PCIDevice *d = PCI_DEVICE(dev);
+    GenPCIERootPort *grp = GEN_PCIE_ROOT_PORT(d);
+    PCIERootPortClass *rpc = PCIE_ROOT_PORT_GET_CLASS(d);
+
+    rpc->parent_realize(dev, errp);
+
+    int rc = pci_bridge_qemu_reserve_cap_init(d, 0, grp->bus_reserve,
+            grp->io_reserve, grp->mem_reserve, grp->pref32_reserve,
+            grp->pref64_reserve, errp);
+
+    if (rc < 0) {
+        rpc->parent_class.exit(d);
+        return;
+    }
+}
+
 static const VMStateDescription vmstate_rp_dev = {
     .name = "pcie-root-port",
     .version_id = 1,
@@ -78,6 +105,11 @@ static const VMStateDescription vmstate_rp_dev = {

 static Property gen_rp_props[] = {
     DEFINE_PROP_BOOL("x-migrate-msix", GenPCIERootPort, migrate_msix, true),
+    DEFINE_PROP_UINT32("bus-reserve", GenPCIERootPort, bus_reserve, -1),
+    DEFINE_PROP_UINT64("io-reserve", GenPCIERootPort, io_reserve, -1),
+    DEFINE_PROP_UINT32("mem-reserve", GenPCIERootPort, mem_reserve, -1),
+    DEFINE_PROP_UINT32("pref32-reserve", GenPCIERootPort, pref32_reserve, -1),
+    DEFINE_PROP_UINT64("pref64-reserve", GenPCIERootPort, pref64_reserve, -1),
     DEFINE_PROP_END_OF_LIST()
 };

@@ -92,6 +124,10 @@ static void gen_rp_dev_class_init(ObjectClass *klass, void *data)
     dc->desc = "PCI Express Root Port";
     dc->vmsd = &vmstate_rp_dev;
     dc->props = gen_rp_props;
+
+    rpc->parent_realize = dc->realize;
+    dc->realize = gen_rp_realize;
+
     rpc->aer_vector = gen_rp_aer_vector;
     rpc->interrupts_init = gen_rp_interrupts_init;
     rpc->interrupts_uninit = gen_rp_interrupts_uninit;
diff --git a/include/hw/pci/pcie_port.h b/include/hw/pci/pcie_port.h
index 1333266..0736014 100644
--- a/include/hw/pci/pcie_port.h
+++ b/include/hw/pci/pcie_port.h
@@ -65,6 +65,7 @@ void pcie_chassis_del_slot(PCIESlot *s);

 typedef struct PCIERootPortClass {
     PCIDeviceClass parent_class;
+    DeviceRealize parent_realize;

     uint8_t (*aer_vector)(const PCIDevice *dev);
     int (*interrupts_init)(PCIDevice *dev, Error **errp);

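
The realize chaining that this patch sets up in gen_rp_dev_class_init() can be pictured with a plain-C sketch. It does not use QEMU's object model: `Device`, `DeviceClass`, `base_realize()`, `derived_realize()` and `derived_class_init()` are hypothetical names, and errors are reported through return codes instead of `Error **`. Note that the patch as posted calls `parent_realize` and then adds the capability without checking whether the parent failed; the sketch returns early in that case, which is one possible way to handle it, not necessarily what the author intends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Device Device;
typedef int (*RealizeFn)(Device *dev);

typedef struct DeviceClass {
    RealizeFn realize;        /* entry point called by the core */
    RealizeFn parent_realize; /* base implementation saved by the subclass */
} DeviceClass;

struct Device {
    const DeviceClass *klass;
    uint32_t bus_reserve; /* -1 (all ones) means "no hint" */
    int fail_base;        /* test knob: make the base realize fail */
    int base_ready;
    int cap_added;
};

/* Stand-in for the generic root port realize of the parent class. */
static int base_realize(Device *dev)
{
    if (dev->fail_base) {
        return -1;
    }
    dev->base_ready = 1;
    return 0;
}

/* Subclass realize: run the parent first, then add the extra capability. */
static int derived_realize(Device *dev)
{
    int rc;

    assert(dev->klass->parent_realize != NULL);
    rc = dev->klass->parent_realize(dev);
    if (rc < 0) {
        return rc; /* do not touch a device whose base setup failed */
    }

    if (dev->bus_reserve != UINT32_MAX) {
        dev->cap_added = 1;
    }
    return 0;
}

/* Mirrors the two assignments added to gen_rp_dev_class_init(). */
static void derived_class_init(DeviceClass *k)
{
    k->parent_realize = k->realize;
    k->realize = derived_realize;
}
```

Saving the previous `realize` pointer before overwriting it is what lets the subclass extend the parent's behavior rather than replace it.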

Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
---
 docs/pcie.txt            |  49 ++++++++++----------
 docs/pcie_pci_bridge.txt | 115 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 141 insertions(+), 23 deletions(-)
 create mode 100644 docs/pcie_pci_bridge.txt

diff --git a/docs/pcie.txt b/docs/pcie.txt
index 5bada24..76b85ec 100644
--- a/docs/pcie.txt
+++ b/docs/pcie.txt
@@ -46,7 +46,7 @@ Place only the following kinds of devices directly on the Root Complex:
     (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI Express
         hierarchies.

-    (3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI
+    (3) PCI Express to PCI Bridge (pcie-pci-bridge), for starting legacy PCI
         hierarchies.

     (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root Buses
@@ -55,18 +55,18 @@ Place only the following kinds of devices directly on the Root Complex:
    pcie.0 bus
    ----------------------------------------------------------------------------
         |                |                    |                  |
-   -----------   ------------------   ------------------   --------------
-   | PCI Dev |   | PCIe Root Port |   | DMI-PCI Bridge |   |  pxb-pcie  |
-   -----------   ------------------   ------------------   --------------
+   -----------   ------------------   -------------------   --------------
+   | PCI Dev |   | PCIe Root Port |   | PCIe-PCI Bridge |   |  pxb-pcie  |
+   -----------   ------------------   -------------------   --------------

 2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint use:
           -device <dev>[,bus=pcie.0]
 2.1.2 To expose a new PCI Express Root Bus use:
           -device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z]
-      Only PCI Express Root Ports and DMI-PCI bridges can be connected
-      to the pcie.1 bus:
+      PCI Express Root Ports and PCI Express to PCI bridges can be
+      connected to the pcie.1 bus:
           -device ioh3420,id=root_port1[,bus=pcie.1][,chassis=x][,slot=y][,addr=z] \
-          -device i82801b11-bridge,id=dmi_pci_bridge1,bus=pcie.1
+          -device pcie-pci-bridge,id=pcie_pci_bridge1,bus=pcie.1


 2.2 PCI Express only hierarchy
@@ -130,24 +130,24 @@ Notes:
 Legacy PCI devices can be plugged into pcie.0 as Integrated Endpoints,
 but, as mentioned in section 5, doing so means the legacy PCI
 device in question will be incapable of hot-unplugging.
-Besides that use DMI-PCI Bridges (i82801b11-bridge) in combination
-with PCI-PCI Bridges (pci-bridge) to start PCI hierarchies.
+Besides that use PCI Express to PCI Bridges (pcie-pci-bridge) in
+combination with PCI-PCI Bridges (pci-bridge) to start PCI hierarchies.

-Prefer flat hierarchies. For most scenarios a single DMI-PCI Bridge
+Prefer flat hierarchies. For most scenarios a single PCI Express to PCI Bridge
 (having 32 slots) and several PCI-PCI Bridges attached to it
 (each supporting also 32 slots) will support hundreds of legacy devices.
-The recommendation is to populate one PCI-PCI Bridge under the DMI-PCI Bridge
-until is full and then plug a new PCI-PCI Bridge...
+The recommendation is to populate one PCI-PCI Bridge under the
+PCI Express to PCI Bridge until is full and then plug a new PCI-PCI Bridge...

    pcie.0 bus
    ----------------------------------------------
         |                            |
-   -----------               ------------------
-   | PCI Dev |               | DMI-PCI BRIDGE |
-   ----------                ------------------
+   -----------               -------------------
+   | PCI Dev |               | PCIe-PCI Bridge |
+   -----------               -------------------
                                |            |
                   ------------------    ------------------
-                  | PCI-PCI Bridge |    | PCI-PCI Bridge |   ...
+                  | PCI-PCI Bridge |    | PCI-PCI Bridge |
                   ------------------    ------------------
                                          |           |
                                   -----------     -----------
@@ -157,11 +157,11 @@ until is full and then plug a new PCI-PCI Bridge...
 2.3.1 To plug a PCI device into pcie.0 as an Integrated Endpoint use:
       -device <dev>[,bus=pcie.0]
 2.3.2 Plugging a PCI device into a PCI-PCI Bridge:
-      -device i82801b11-bridge,id=dmi_pci_bridge1[,bus=pcie.0] \
-      -device pci-bridge,id=pci_bridge1,bus=dmi_pci_bridge1[,chassis_nr=x][,addr=y] \
+      -device pcie-pci-bridge,id=pcie_pci_bridge1[,bus=pcie.0] \
+      -device pci-bridge,id=pci_bridge1,bus=pcie_pci_bridge1[,chassis_nr=x][,addr=y] \
       -device <dev>,bus=pci_bridge1[,addr=x]
       Note that 'addr' cannot be 0 unless shpc=off parameter is passed to
-      the PCI Bridge.
+      the PCI Bridge/PCI Express to PCI Bridge.

3. IO space issues =================== @@ -219,14 +219,16 @@ do not support hot-plug, so any devices plugged into Root Complexes cannot be hot-plugged/hot-unplugged: (1) PCI Express Integrated Endpoints (2) PCI Express Root Ports - (3) DMI-PCI Bridges + (3) PCI Express to PCI Bridges (4) pxb-pcie
Be aware that PCI Express Downstream Ports can't be hot-plugged into an existing PCI Express Upstream Port.
-PCI devices can be hot-plugged into PCI-PCI Bridges. The PCI hot-plug is ACPI
-based and can work side by side with the PCI Express native hot-plug.
+PCI devices can be hot-plugged into PCI Express to PCI and PCI-PCI Bridges.
+The PCI hot-plug into PCI-PCI bridge is ACPI based, whereas hot-plug into
+PCI Express to PCI bridges is SHPC-based. They both can work side by side with
+the PCI Express native hot-plug.
PCI Express devices can be natively hot-plugged/hot-unplugged into/from PCI Express Root Ports (and PCI Express Downstream Ports). @@ -234,10 +236,11 @@ PCI Express Root Ports (and PCI Express Downstream Ports). 5.1 Planning for hot-plug: (1) PCI hierarchy Leave enough PCI-PCI Bridge slots empty or add one - or more empty PCI-PCI Bridges to the DMI-PCI Bridge. + or more empty PCI-PCI Bridges to the PCI Express to PCI Bridge.
For each such PCI-PCI Bridge the Guest Firmware is expected to reserve 4K IO space and 2M MMIO range to be used for all devices behind it. + Appropriate PCI capability is designed, see pcie_pci_bridge.txt.
 Because of the hard IO limit of around 10 PCI Bridges (~ 40K space) per system
 don't use more than 9 PCI-PCI Bridges, leaving 4K for the
diff --git a/docs/pcie_pci_bridge.txt b/docs/pcie_pci_bridge.txt
new file mode 100644
index 0000000..eabac32
--- /dev/null
+++ b/docs/pcie_pci_bridge.txt
@@ -0,0 +1,115 @@
+Generic PCI Express to PCI Bridge
+================================
+
+Description
+===========
+PCIE-to-PCI bridge is a new method for legacy PCI
+hierarchies creation on Q35 machines.
+
+Previously Intel DMI-to-PCI bridge was used for this purpose.
+But due to its strict limitations - no support of hot-plug,
+no cross-platform and cross-architecture support - a new generic
+PCIE-to-PCI bridge should now be used for any legacy PCI device usage
+with PCI Express machine.
+
+This generic PCIE-PCI bridge is a cross-platform device,
+can be hot-plugged into appropriate root port (requires additional actions,
+see 'PCIE-PCI bridge hot-plug' section),
+and supports devices hot-plug into the bridge itself
+(with some limitations, see below).
+
+Hot-plug of legacy PCI devices into the bridge
+is provided by bridge's built-in Standard hot-plug Controller.
+Though it still has some limitations, see below.
+
+PCIE-PCI bridge hot-plug
+=======================
+Guest OSes require extra efforts to enable PCIE-PCI bridge hot-plug.
+Motivation - now on init any PCI Express root port which doesn't have
+any device plugged in, has no free buses reserved to provide any of them
+to a hot-plugged devices in future.
+
+To solve this problem we reserve additional buses on a firmware level.
+Currently only SeaBIOS is supported.
+The way of bus number to reserve delivery is special
+Red Hat vendor-specific PCI capability, added to the root port
+that is planned to have PCIE-PCI bridge hot-plugged in.
+
+Capability layout (defined in include/hw/pci/pci_bridge.h):
+
+    uint8_t id;     Standard PCI capability header field
+    uint8_t next;   Standard PCI capability header field
+    uint8_t len;    Standard PCI vendor-specific capability header field
+
+    uint8_t type;   Red Hat vendor-specific capability type
+                    List of currently existing types:
+                        RESOURCE_RESERVE = 1
+
+
+    uint32_t bus_res;   Minimum number of buses to reserve
+
+    uint64_t io;        IO space to reserve
+    uint32_t mem        Non-prefetchable memory to reserve
+
+    This two fields are mutually exclusive:
+    uint32_t mem_pref_32;   Prefetchable memory to reserve (32-bit MMIO)
+    uint64_t mem_pref_64;   Prefetchable memory to reserve (64-bit MMIO)
+
+If any reservation field is -1 then this kind of reservation is not
+needed and must be ignored by firmware.
+
+mem_pref_* fields mutual exclusiveness means they cannot be -1 both.
+
+At the moment this capability is used only in QEMU generic PCIe root port
+(-device pcie-root-port). Capability construction function takes all reservation
+fields values from corresponding device properties. By default all of them are
+set to -1 to leave root port's default behavior unchanged.
+
+Usage
+=====
+A detailed command line would be:
+
+[qemu-bin + storage options] \
+-m 2G \
+-device ioh3420,bus=pcie.0,id=rp1 \
+-device ioh3420,bus=pcie.0,id=rp2 \
+-device pcie-root-port,bus=pcie.0,id=rp3,bus-reserve=1 \
+-device pcie-pci-bridge,id=br1,bus=rp1 \
+-device pcie-pci-bridge,id=br2,bus=rp2 \
+-device e1000,bus=br1,addr=8
+
+Then in monitor it's OK to execute next commands:
+device_add pcie-pci-bridge,id=br3,bus=rp3
+device_add e1000,bus=br2,addr=1
+device_add e1000,bus=br3,addr=1
+
+Here you have:
+ (1) Cold-plugged:
+    - Root ports: 1 QEMU generic root port with the capability mentioned above,
+                  2 ioh3420 root ports;
+    - 2 PCIE-PCI bridges plugged into 2 different root ports;
+    - e1000 plugged into the first bridge.
+ (2) Hot-plugged:
+    - PCIE-PCI bridge, plugged into QEMU generic root port;
+    - 2 e1000 cards, one plugged into the cold-plugged PCIE-PCI bridge,
+                     another plugged into the hot-plugged bridge.
+
+Limitations
+===========
+The PCIE-PCI bridge can be hot-plugged only into pcie-root-port that
+has proper 'bus-reserve' property value to provide secondary bus for the
+hot-plugged bridge.
+
+Windows 7 and older versions don't support hot-plug devices into the PCIE-PCI bridge.
+To enable device hot-plug into the bridge on Linux there're 3 ways:
+1) Build shpchp module with this patch http://www.spinics.net/lists/linux-pci/msg63052.html
+2) Use kernel 4.14+ where the patch mentioned above is already merged.
+3) Set 'msi' property to off - this forced the bridge to use legacy INTx,
+    which allows the bridge to notify the OS about hot-plug event without having
+    BUSMASTER set.
+
+Implementation
+==============
+The PCIE-PCI bridge is based on PCI-PCI bridge, but also accumulates PCI Express
+features as a PCI Express device (is_express=1).
+
On 08/11/17 01:31, Aleksandr Bezzubikov wrote:
+PCIE-PCI bridge hot-plug +======================= +Guest OSes require extra efforts to enable PCIE-PCI bridge hot-plug. +Motivation - now on init any PCI Express root port which doesn't have +any device plugged in, has no free buses reserved to provide any of them +to a hot-plugged devices in future.
+To solve this problem we reserve additional buses on a firmware level. +Currently only SeaBIOS is supported. +The way of bus number to reserve delivery is special +Red Hat vendor-specific PCI capability, added to the root port +that is planned to have PCIE-PCI bridge hot-plugged in.
+Capability layout (defined in include/hw/pci/pci_bridge.h):
+
+    uint8_t id;     Standard PCI capability header field
+    uint8_t next;   Standard PCI capability header field
+    uint8_t len;    Standard PCI vendor-specific capability header field
+
+    uint8_t type;   Red Hat vendor-specific capability type
+                    List of currently existing types:
+                        RESOURCE_RESERVE = 1
+
+
+    uint32_t bus_res;   Minimum number of buses to reserve
+
+    uint64_t io;        IO space to reserve
+    uint32_t mem        Non-prefetchable memory to reserve
+
+    This two fields are mutually exclusive:

[*] mark this

+    uint32_t mem_pref_32;   Prefetchable memory to reserve (32-bit MMIO)
+    uint64_t mem_pref_64;   Prefetchable memory to reserve (64-bit MMIO)
+If any reservation field is -1 then this kind of reservation is not +needed and must be ignored by firmware.
+mem_pref_* fields mutual exclusiveness means they cannot be -1 both.
Please drop the last sentence; it is perfectly possible that a bridge doesn't need either 32-bit or 64-bit prefetchable MMIO reservation. "Mutually exclusive" usually means "at most one", not "exactly one". (E.g., think of the "mutex" construct -- in the critical section being protected by the mutex, there can be Thread 1, Thread 2, or none of them.)
So, beyond dropping the last sentence, I suggest replacing the one marked with [*] with the following, for clarity:
At most one of the following two fields may be set to a value different from -1:
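To make the "at most one" reading concrete, here is a minimal sketch in C of the layout as the patch describes it, plus the check that wording implies. The struct name, the two RES_RESERVE_IGNORE_* macros and the helper mem_pref_fields_valid() are hypothetical names chosen for illustration; the real definition lives in include/hw/pci/pci_bridge.h and may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names: "-1" in an unsigned field means "ignore this field". */
#define RES_RESERVE_IGNORE_32 UINT32_MAX
#define RES_RESERVE_IGNORE_64 UINT64_MAX

/* Illustrative mirror of the documented layout, not the QEMU definition. */
struct res_reserve_cap_sketch {
    uint8_t  id;           /* standard PCI capability header */
    uint8_t  next;         /* standard PCI capability header */
    uint8_t  len;          /* vendor-specific capability length */
    uint8_t  type;         /* Red Hat capability type, RESOURCE_RESERVE = 1 */
    uint32_t bus_res;      /* minimum number of buses to reserve */
    uint64_t io;           /* IO space to reserve */
    uint32_t mem;          /* non-prefetchable memory to reserve */
    uint32_t mem_pref_32;  /* prefetchable memory, 32-bit MMIO */
    uint64_t mem_pref_64;  /* prefetchable memory, 64-bit MMIO */
};

/*
 * "Mutually exclusive" read as "at most one": both fields may be -1
 * (no prefetchable reservation at all), but they must not both carry
 * a real value.
 */
static inline bool mem_pref_fields_valid(const struct res_reserve_cap_sketch *cap)
{
    assert(cap != NULL);
    bool want_32 = cap->mem_pref_32 != RES_RESERVE_IGNORE_32;
    bool want_64 = cap->mem_pref_64 != RES_RESERVE_IGNORE_64;
    return !(want_32 && want_64);
}
```

With this reading, a capability whose mem_pref_32 and mem_pref_64 are both -1 passes the check, which is exactly the case the sentence I asked to drop would have forbidden.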
With this update, for this patch:
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Thanks! Laszlo
On 11/08/2017 2:31, Aleksandr Bezzubikov wrote:
Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
---
 docs/pcie.txt            |  49 ++++++++++----------
 docs/pcie_pci_bridge.txt | 115 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 141 insertions(+), 23 deletions(-)
 create mode 100644 docs/pcie_pci_bridge.txt
diff --git a/docs/pcie.txt b/docs/pcie.txt index 5bada24..76b85ec 100644 --- a/docs/pcie.txt +++ b/docs/pcie.txt @@ -46,7 +46,7 @@ Place only the following kinds of devices directly on the Root Complex: (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI Express hierarchies.
-    (3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI
+    (3) PCI Express to PCI Bridge (pcie-pci-bridge), for starting legacy PCI
         hierarchies.
(4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root Buses
@@ -55,18 +55,18 @@ Place only the following kinds of devices directly on the Root Complex: pcie.0 bus ---------------------------------------------------------------------------- | | | |
-   | PCI Dev |   | PCIe Root Port |   | DMI-PCI Bridge |   | pxb-pcie |
+   | PCI Dev |   | PCIe Root Port |   | PCIe-PCI Bridge |   | pxb-pcie |
2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint use: -device <dev>[,bus=pcie.0] 2.1.2 To expose a new PCI Express Root Bus use: -device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z]
-    Only PCI Express Root Ports and DMI-PCI bridges can be connected
-    to the pcie.1 bus:
+    PCI Express Root Ports and PCI Express to PCI bridges can be
+    connected to the pcie.1 bus:
         -device ioh3420,id=root_port1[,bus=pcie.1][,chassis=x][,slot=y][,addr=z] \
-        -device i82801b11-bridge,id=dmi_pci_bridge1,bus=pcie.1
+        -device pcie-pci-bridge,id=pcie_pci_bridge1,bus=pcie.1
2.2 PCI Express only hierarchy
@@ -130,24 +130,24 @@ Notes: Legacy PCI devices can be plugged into pcie.0 as Integrated Endpoints, but, as mentioned in section 5, doing so means the legacy PCI device in question will be incapable of hot-unplugging. -Besides that use DMI-PCI Bridges (i82801b11-bridge) in combination -with PCI-PCI Bridges (pci-bridge) to start PCI hierarchies. +Besides that use PCI Express to PCI Bridges (pcie-pci-bridge) in +combination with PCI-PCI Bridges (pci-bridge) to start PCI hierarchies.
-Prefer flat hierarchies. For most scenarios a single DMI-PCI Bridge
+Prefer flat hierarchies. For most scenarios a single PCI Express to PCI Bridge
 (having 32 slots) and several PCI-PCI Bridges attached to it
 (each supporting also 32 slots) will support hundreds of legacy devices.
-The recommendation is to populate one PCI-PCI Bridge under the DMI-PCI Bridge
-until is full and then plug a new PCI-PCI Bridge...
+The recommendation is to populate one PCI-PCI Bridge under the
+PCI Express to PCI Bridge until is full and then plug a new PCI-PCI Bridge...
pcie.0 bus ---------------------------------------------- | |
-   | PCI Dev |   | DMI-PCI BRIDGE |
+   | PCI Dev |   | PCIe-PCI Bridge |
| | ------------------ ------------------
| PCI-PCI Bridge | | PCI-PCI Bridge | ...
| PCI-PCI Bridge | | PCI-PCI Bridge | ------------------ ------------------ | | ----------- -----------
@@ -157,11 +157,11 @@ until is full and then plug a new PCI-PCI Bridge... 2.3.1 To plug a PCI device into pcie.0 as an Integrated Endpoint use: -device <dev>[,bus=pcie.0] 2.3.2 Plugging a PCI device into a PCI-PCI Bridge:
-          -device i82801b11-bridge,id=dmi_pci_bridge1[,bus=pcie.0] \
-          -device pci-bridge,id=pci_bridge1,bus=dmi_pci_bridge1[,chassis_nr=x][,addr=y] \
+          -device pcie-pci-bridge,id=pcie_pci_bridge1[,bus=pcie.0] \
+          -device pci-bridge,id=pci_bridge1,bus=pcie_pci_bridge1[,chassis_nr=x][,addr=y] \
           -device <dev>,bus=pci_bridge1[,addr=x]
           Note that 'addr' cannot be 0 unless shpc=off parameter is passed to
-          the PCI Bridge.
+          the PCI Bridge/PCI Express to PCI Bridge.
 3. IO space issues
 ===================
@@ -219,14 +219,16 @@ do not support hot-plug, so any devices plugged into Root Complexes cannot be hot-plugged/hot-unplugged: (1) PCI Express Integrated Endpoints (2) PCI Express Root Ports
-    (3) DMI-PCI Bridges
+    (3) PCI Express to PCI Bridges
     (4) pxb-pcie
Be aware that PCI Express Downstream Ports can't be hot-plugged into an existing PCI Express Upstream Port.
-PCI devices can be hot-plugged into PCI-PCI Bridges. The PCI hot-plug is ACPI
-based and can work side by side with the PCI Express native hot-plug.
+PCI devices can be hot-plugged into PCI Express to PCI and PCI-PCI Bridges.
+The PCI hot-plug into PCI-PCI bridge is ACPI based, whereas hot-plug into
+PCI Express to PCI bridges is SHPC-based. They both can work side by side with
+the PCI Express native hot-plug.
PCI Express devices can be natively hot-plugged/hot-unplugged into/from PCI Express Root Ports (and PCI Express Downstream Ports). @@ -234,10 +236,11 @@ PCI Express Root Ports (and PCI Express Downstream Ports). 5.1 Planning for hot-plug: (1) PCI hierarchy Leave enough PCI-PCI Bridge slots empty or add one
-        or more empty PCI-PCI Bridges to the DMI-PCI Bridge.
+        or more empty PCI-PCI Bridges to the PCI Express to PCI Bridge.

         For each such PCI-PCI Bridge the Guest Firmware is expected to reserve 4K IO
         space and 2M MMIO range to be used for all devices behind it.
+        Appropriate PCI capability is designed, see pcie_pci_bridge.txt.

         Because of the hard IO limit of around 10 PCI Bridges (~ 40K space) per system
         don't use more than 9 PCI-PCI Bridges, leaving 4K for the
diff --git a/docs/pcie_pci_bridge.txt b/docs/pcie_pci_bridge.txt new file mode 100644 index 0000000..eabac32 --- /dev/null +++ b/docs/pcie_pci_bridge.txt @@ -0,0 +1,115 @@ +Generic PCI Express to PCI Bridge +================================
+Description +=========== +PCIE-to-PCI bridge is a new method for legacy PCI +hierarchies creation on Q35 machines.
+Previously Intel DMI-to-PCI bridge was used for this purpose. +But due to its strict limitations - no support of hot-plug, +no cross-platform and cross-architecture support - a new generic +PCIE-to-PCI bridge should now be used for any legacy PCI device usage +with PCI Express machine.
+This generic PCIE-PCI bridge is a cross-platform device, +can be hot-plugged into appropriate root port (requires additional actions, +see 'PCIE-PCI bridge hot-plug' section), +and supports devices hot-plug into the bridge itself +(with some limitations, see below).
+Hot-plug of legacy PCI devices into the bridge +is provided by bridge's built-in Standard hot-plug Controller. +Though it still has some limitations, see below.
+PCIE-PCI bridge hot-plug +======================= +Guest OSes require extra efforts to enable PCIE-PCI bridge hot-plug. +Motivation - now on init any PCI Express root port which doesn't have +any device plugged in, has no free buses reserved to provide any of them +to a hot-plugged devices in future.
+To solve this problem we reserve additional buses on a firmware level. +Currently only SeaBIOS is supported. +The way of bus number to reserve delivery is special +Red Hat vendor-specific PCI capability, added to the root port +that is planned to have PCIE-PCI bridge hot-plugged in.
+Capability layout (defined in include/hw/pci/pci_bridge.h):
+
+    uint8_t id;     Standard PCI capability header field
+    uint8_t next;   Standard PCI capability header field
+    uint8_t len;    Standard PCI vendor-specific capability header field
+
+    uint8_t type;   Red Hat vendor-specific capability type
+                    List of currently existing types:
+                        RESOURCE_RESERVE = 1
+
+
+    uint32_t bus_res;   Minimum number of buses to reserve
+
+    uint64_t io;        IO space to reserve
+    uint32_t mem        Non-prefetchable memory to reserve
+
+    This two fields are mutually exclusive:
+    uint32_t mem_pref_32;   Prefetchable memory to reserve (32-bit MMIO)
+    uint64_t mem_pref_64;   Prefetchable memory to reserve (64-bit MMIO)
+If any reservation field is -1 then this kind of reservation is not +needed and must be ignored by firmware.
+mem_pref_* fields mutual exclusiveness means they cannot be -1 both.
+At the moment this capability is used only in QEMU generic PCIe root port +(-device pcie-root-port). Capability construction function takes all reservation +fields values from corresponding device properties. By default all of them are +set to -1 to leave root port's default behavior unchanged.
+Usage +===== +A detailed command line would be:
+[qemu-bin + storage options] \ +-m 2G \ +-device ioh3420,bus=pcie.0,id=rp1 \ +-device ioh3420,bus=pcie.0,id=rp2 \
Please don't use ioh3420 in examples, we want to eventually stop using it. Use pcie-root-port instead.
+-device pcie-root-port,bus=pcie.0,id=rp3,bus-reserve=1 \ +-device pcie-pci-bridge,id=br1,bus=rp1 \ +-device pcie-pci-bridge,id=br2,bus=rp2 \ +-device e1000,bus=br1,addr=8
+Then in monitor it's OK to execute next commands: +device_add pcie-pci-bridge,id=br3,bus=rp3
Please add '' at the end of the line
+device_add e1000,bus=br2,addr=1 +device_add e1000,bus=br3,addr=1
+Here you have:
+ (1) Cold-plugged:
+    - Root ports: 1 QEMU generic root port with the capability mentioned above,
+                  2 ioh3420 root ports;
+    - 2 PCIE-PCI bridges plugged into 2 different root ports;
+    - e1000 plugged into the first bridge.
+ (2) Hot-plugged:
+    - PCIE-PCI bridge, plugged into QEMU generic root port;
+    - 2 e1000 cards, one plugged into the cold-plugged PCIE-PCI bridge,
+                     another plugged into the hot-plugged bridge.
+Limitations +=========== +The PCIE-PCI bridge can be hot-plugged only into pcie-root-port that +has proper 'bus-reserve' property value to provide secondary bus for the +hot-plugged bridge.
+Windows 7 and older versions don't support hot-plug devices into the PCIE-PCI bridge.
+To enable device hot-plug into the bridge on Linux there're 3 ways:
+1) Build shpchp module with this patch http://www.spinics.net/lists/linux-pci/msg63052.html
+2) Use kernel 4.14+ where the patch mentioned above is already merged.
+3) Set 'msi' property to off - this forced the bridge to use legacy INTx,
+    which allows the bridge to notify the OS about hot-plug event without having
+    BUSMASTER set.
+Implementation +============== +The PCIE-PCI bridge is based on PCI-PCI bridge, but also accumulates PCI Express +features as a PCI Express device (is_express=1).
After addressing my and Laszlo's comments:
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Thanks, Marcel