Notes:
 - Sorry for the late submission; I was waiting for the dynamic ACPI series to be merged before submitting. My bad.
 - The previous version (v2) was wrongly tagged by me as RFC; it was actually ready, just not rebased. V3 only rebased, with no functional changes.
 - This series is not that big. Patches 1-8 are really small and could be submitted separately, but I preferred to keep them here to show the whole picture.
 - The series is fully functional.
 - Limitations:
    - The pxb's bus does not support hotplug. This will be addressed on top of this series, because it is already getting too big.
 - Depends on:
    - [SeaBIOS] [PATCH V3 0/2] fw/pci: better support for multiple host bridges
      It was already reviewed/accepted by the maintainers and will be merged after this series gets accepted.
 - You are more than welcome to try it using:
      -device pxb-device,id=pxb,bus_nr=4,numa_node=1 -device e1000,bus=pxb,addr=0x1
      -bios <patched with the above series>
v3->v4:
 - Addressed Michael S. Tsirkin's review:
    - refactored build_prt method (patch 11/25)
      hw/apci: add _PRT method for extra PCI root busses
 - Addressed Igor Mammedov's review:
    - added assert to aml_index (patch 5/25)
    - fixed aml_equal implementation (patch 1/25)

v2->v3:
 - Rebased on Michael S. Tsirkin's pci branch (which now includes all the dependencies).
 - Refactored the acpi terms patch into multiple patches to match Igor's design.

v1->v2:
 - Added support for multiple pxb devices.
 - Attached the pxb's bus to a specific NUMA node.
 - Got rid of the hacks from the previous version.
 - Tested also with Win7 and Fedora 20, and with virtio blk devices.
 - Several bug fixes, resulting in a stable version ready for submission.
Reasoning: We need multiple primary busses for a few reasons; the most important one is to be able to associate a pass-through device with a guest NUMA node. OSes are able to associate a NUMA node only with a primary bus, not with a specific PCI device or a PCI-to-PCI bridge. PC machines support multiple NUMA nodes for CPUs and memory; however, I/O was not yet supported.
patches 1-9 add the necessary ACPI constructs, based on Igor's series
patches 10-13 implement the ACPI code needed to expose the pxb's primary bus to guests
patch 14 separates the pci_bus code into a designated file
patches 15-19 handle the implicit assumptions in the code that only one primary bus can exist
patch 20 handles the actual implementation of the PXB devices
patches 21-22 enable the device
patch 23 implements the PXB map_irq function (can be squashed into the actual PXB patch)
patches 24-25 add NUMA support
Marcel Apfelbaum (25):
  acpi: fix aml_equal term implementation
  acpi: add aml_or() term
  acpi: add aml_add() term
  acpi: add aml_lless() term
  acpi: add aml_index() term
  acpi: add aml_shiftleft() term
  acpi: add aml_shiftright() term
  acpi: add aml_increment() term
  acpi: add aml_while() term
  hw/acpi: add support for multiple root busses
  hw/apci: add _PRT method for extra PCI root busses
  hw/acpi: add _CRS method for extra root busses
  hw/acpi: remove from root bus 0 the crs resources used by other busses.
  hw/pci: move pci bus related code to separate files
  hw/pci: made pci_bus_is_root a PCIBusClass method
  hw/pci: made pci_bus_num a PCIBusClass method
  hw/pci: introduce TYPE_PCI_MAIN_HOST_BRIDGE interface
  hw/pci: removed 'rootbus nr is 0' assumption from qmp_pci_query
  hw/pci: implement iteration over multiple host bridges
  hw/pci: introduce PCI Expander Bridge (PXB)
  hw/pci: inform bios if the system has more than one pci bridge
  hw/pci: piix - suport multiple host bridges
  hw/pxb: add map_irq func
  hw/pci_bus: add support for NUMA nodes
  hw/pxb: add numa_node parameter
 arch_init.c                         |   1 +
 hw/acpi/aml-build.c                 |  79 +++++-
 hw/alpha/typhoon.c                  |   1 +
 hw/i386/acpi-build.c                | 347 +++++++++++++++++++++++-
 hw/i386/kvm/pci-assign.c            |   1 +
 hw/i386/pc.c                        |  13 +
 hw/mips/gt64xxx_pci.c               |   1 +
 hw/pci-bridge/Makefile.objs         |   1 +
 hw/pci-bridge/pci_expander_bridge.c | 208 +++++++++++++++
 hw/pci-host/bonito.c                |   1 +
 hw/pci-host/grackle.c               |   1 +
 hw/pci-host/piix.c                  |  63 ++++-
 hw/pci-host/ppce500.c               |   1 +
 hw/pci-host/q35.c                   |   5 +
 hw/pci-host/uninorth.c              |   1 +
 hw/pci/Makefile.objs                |   2 +-
 hw/pci/pci.c                        | 501 +---------------------------------
 hw/pci/pci_bus.c                    | 517 ++++++++++++++++++++++++++++++++++++
 hw/pci/pci_host.c                   |   6 +
 hw/ppc/ppc4xx_pci.c                 |   1 +
 hw/scsi/megasas.c                   |   1 +
 hw/sh4/r2d.c                        |   1 +
 hw/sh4/sh_pci.c                     |   1 +
 hw/vfio/pci.c                       |   1 +
 hw/xen/xen_pt.c                     |   1 +
 include/hw/acpi/aml-build.h         |   8 +
 include/hw/pci/pci.h                |   6 +-
 include/hw/pci/pci_bus.h            |  35 +++
 include/hw/pci/pci_host.h           |  11 +
 include/sysemu/sysemu.h             |   1 +
 30 files changed, 1307 insertions(+), 510 deletions(-)
 create mode 100644 hw/pci-bridge/pci_expander_bridge.c
 create mode 100644 hw/pci/pci_bus.c
The DefLEqual op does not have a target operand. Remove it.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
---
 hw/acpi/aml-build.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 876cada..0d14561 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -542,7 +542,6 @@ Aml *aml_equal(Aml *arg1, Aml *arg2)
     Aml *var = aml_opcode(0x93 /* LequalOp */);
     aml_append(var, arg1);
     aml_append(var, arg2);
-    build_append_int(var->buf, 0x00); /* NullNameOp */
     return var;
 }
On Sun, 8 Mar 2015 13:16:03 +0200 Marcel Apfelbaum marcel@redhat.com wrote:
The DefLEqual op does not have a target operand. Remove it.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
Reviewed-by: Igor Mammedov imammedo@redhat.com
hw/acpi/aml-build.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 876cada..0d14561 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -542,7 +542,6 @@ Aml *aml_equal(Aml *arg1, Aml *arg2) Aml *var = aml_opcode(0x93 /* LequalOp */); aml_append(var, arg1); aml_append(var, arg2);
- build_append_int(var->buf, 0x00); /* NullNameOp */
It just happens to work in the CPU and PCI hotplug cases because LEqual was the only predicate in the if block, and the NullNameOp was treated as part of the inner code block, like:
  if (LEqual(a, b)) {
      NullName; // nop
      ...
  }
return var;
}
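To make the encoding difference concrete, here is a minimal sketch in C. It uses a toy byte buffer (ToyBuf, toy_put, toy_lequal and toy_or are hypothetical names, not QEMU's aml-build API) and assumes single-byte operands such as Arg0 (0x68) and Arg1 (0x69). It illustrates that DefLEqual is just the opcode plus two operands, while an op such as DefOr carries a trailing Target, where 0x00 (NullName) means the result is not stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy byte buffer; a hypothetical stand-in, not QEMU's Aml/GArray types. */
typedef struct {
    uint8_t bytes[32];
    size_t len;
} ToyBuf;

/* Append one raw byte to the buffer. */
void toy_put(ToyBuf *b, uint8_t v)
{
    assert(b->len < sizeof(b->bytes));
    b->bytes[b->len++] = v;
}

/* DefLEqual := LequalOp Operand Operand (no Target operand) */
void toy_lequal(ToyBuf *b, uint8_t op1, uint8_t op2)
{
    toy_put(b, 0x93); /* LequalOp */
    toy_put(b, op1);
    toy_put(b, op2);
}

/* DefOr := OrOp Operand Operand Target */
void toy_or(ToyBuf *b, uint8_t op1, uint8_t op2)
{
    toy_put(b, 0x7D); /* OrOp */
    toy_put(b, op1);
    toy_put(b, op2);
    toy_put(b, 0x00); /* NullName target: result is not stored */
}
```

With these helpers, toy_lequal(&b, 0x68, 0x69) yields the three bytes 93 68 69, and toy_or(&b, 0x68, 0x69) yields 7D 68 69 00. An extra 0x00 after LEqual would be left over for the interpreter to read as the first term of whatever follows, which matches the "nop inside the if block" behaviour described above.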
On Mon, Mar 09, 2015 at 11:28:22AM +0100, Igor Mammedov wrote:
On Sun, 8 Mar 2015 13:16:03 +0200 Marcel Apfelbaum marcel@redhat.com wrote:
The DefLEqual op does not have a target operand. Remove it.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
Reviewed-by: Igor Mammedov imammedo@redhat.com
hw/acpi/aml-build.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 876cada..0d14561 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -542,7 +542,6 @@ Aml *aml_equal(Aml *arg1, Aml *arg2) Aml *var = aml_opcode(0x93 /* LequalOp */); aml_append(var, arg1); aml_append(var, arg2);
- build_append_int(var->buf, 0x00); /* NullNameOp */
It just happens to work in the CPU and PCI hotplug cases because LEqual was the only predicate in the if block, and the NullNameOp was treated as part of the inner code block, like:
  if (LEqual(a, b)) {
      NullName; // nop
      ...
  }
So - maybe aml_if should get 3rd parameter - the command?
return var;
}
On Mon, 9 Mar 2015 12:04:51 +0100 "Michael S. Tsirkin" mst@redhat.com wrote:
On Mon, Mar 09, 2015 at 11:28:22AM +0100, Igor Mammedov wrote:
On Sun, 8 Mar 2015 13:16:03 +0200 Marcel Apfelbaum marcel@redhat.com wrote:
The DefLEqual op does not have a target operand. Remove it.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
Reviewed-by: Igor Mammedov imammedo@redhat.com
hw/acpi/aml-build.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 876cada..0d14561 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -542,7 +542,6 @@ Aml *aml_equal(Aml *arg1, Aml *arg2) Aml *var = aml_opcode(0x93 /* LequalOp */); aml_append(var, arg1); aml_append(var, arg2);
- build_append_int(var->buf, 0x00); /* NullNameOp */
It just happens to work in the CPU and PCI hotplug cases because LEqual was the only predicate in the if block, and the NullNameOp was treated as part of the inner code block, like:
  if (LEqual(a, b)) {
      NullName; // nop
      ...
  }
So - maybe aml_if should get 3rd parameter - the command?
It's not only one command; it's a block of AML code inside the 'if' scope. Adding a 3rd argument would mean inventing another element not defined by the spec, like aml_block(), where you could put the AML items that are in the block. I'd like to keep non-spec items to a minimum and not add them unless we have to.

Then, for consistency, we would have to add this 'aml_block' argument to other block constructs like 'device, scope, package, ...'. So I think the current way of defining a context and then putting items in it is pretty clean, as opposed to doing it backwards: first defining elements somewhere and then passing them as an argument to an AML block construct.
return var;
}
On Mon, Mar 09, 2015 at 01:26:40PM +0100, Igor Mammedov wrote:
On Mon, 9 Mar 2015 12:04:51 +0100 "Michael S. Tsirkin" mst@redhat.com wrote:
On Mon, Mar 09, 2015 at 11:28:22AM +0100, Igor Mammedov wrote:
On Sun, 8 Mar 2015 13:16:03 +0200 Marcel Apfelbaum marcel@redhat.com wrote:
The DefLEqual op does not have a target operand. Remove it.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
Reviewed-by: Igor Mammedov imammedo@redhat.com
hw/acpi/aml-build.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 876cada..0d14561 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -542,7 +542,6 @@ Aml *aml_equal(Aml *arg1, Aml *arg2) Aml *var = aml_opcode(0x93 /* LequalOp */); aml_append(var, arg1); aml_append(var, arg2);
- build_append_int(var->buf, 0x00); /* NullNameOp */
It just happens to work in the CPU and PCI hotplug cases because LEqual was the only predicate in the if block, and the NullNameOp was treated as part of the inner code block, like:
  if (LEqual(a, b)) {
      NullName; // nop
      ...
  }
So - maybe aml_if should get 3rd parameter - the command?
It's not only one command; it's a block of AML code inside the 'if' scope. Adding a 3rd argument would mean inventing another element not defined by the spec, like aml_block(), where you could put the AML items that are in the block. I'd like to keep non-spec items to a minimum and not add them unless we have to.
In fact, it's TermList in spec, isn't it? But I don't insist.
Then, for consistency, we would have to add this 'aml_block' argument to other block constructs like 'device, scope, package, ...'. So I think the current way of defining a context and then putting items in it is pretty clean, as opposed to doing it backwards: first defining elements somewhere and then passing them as an argument to an AML block construct.
I think it's just a question of adding a convenience wrapper for the most common case.
/* Convenience wrapper for when there's a single
 * term in TermList */
Aml *aml_if_then_1term(Aml *predicate, Aml *term)
{
    Aml *if_ctx = aml_if(predicate);

    aml_append(if_ctx, term);
    return if_ctx;
}
Add encoding for ACPI DefOr Opcode.
Reviewed-by: Igor Mammedov imammedo@redhat.com
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
---
 hw/acpi/aml-build.c         | 10 ++++++++++
 include/hw/acpi/aml-build.h |  1 +
 2 files changed, 11 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 0d14561..603c0c4 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -448,6 +448,16 @@ Aml *aml_and(Aml *arg1, Aml *arg2)
     return var;
 }

+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefOr */
+Aml *aml_or(Aml *arg1, Aml *arg2)
+{
+    Aml *var = aml_opcode(0x7D /* OrOp */);
+    aml_append(var, arg1);
+    aml_append(var, arg2);
+    build_append_int(var->buf, 0x00); /* NullNameOp */
+    return var;
+}
+
 /* ACPI 1.0b: 16.2.5.3 Type 1 Opcodes Encoding: DefNotify */
 Aml *aml_notify(Aml *arg1, Aml *arg2)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index f6735ea..84d4180 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -137,6 +137,7 @@ Aml *aml_int(const uint64_t val);
 Aml *aml_arg(int pos);
 Aml *aml_store(Aml *val, Aml *target);
 Aml *aml_and(Aml *arg1, Aml *arg2);
+Aml *aml_or(Aml *arg1, Aml *arg2);
 Aml *aml_notify(Aml *arg1, Aml *arg2);
 Aml *aml_call1(const char *method, Aml *arg1);
 Aml *aml_call2(const char *method, Aml *arg1, Aml *arg2);
On 2015/3/8 19:16, Marcel Apfelbaum wrote:
Add encoding for ACPI DefOr Opcode.
Reviewed-by: Igor Mammedov imammedo@redhat.com Signed-off-by: Marcel Apfelbaum marcel@redhat.com
hw/acpi/aml-build.c | 10 ++++++++++ include/hw/acpi/aml-build.h | 1 + 2 files changed, 11 insertions(+)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 0d14561..603c0c4 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -448,6 +448,16 @@ Aml *aml_and(Aml *arg1, Aml *arg2) return var; }
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefOr */
+Aml *aml_or(Aml *arg1, Aml *arg2)
+{
+    Aml *var = aml_opcode(0x7D /* OrOp */);
+    aml_append(var, arg1);
+    aml_append(var, arg2);
+    build_append_int(var->buf, 0x00); /* NullNameOp */
Hi,
I notice that MST has sent a patch which uses build_append_byte instead of build_append_int. Maybe we can fix this patch before applying it.
Thanks, Shannon
+    return var;
+}
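The difference between the two helpers can be sketched as follows. This is only an illustration under stated assumptions: toy_encode_int and toy_encode_byte are hypothetical names, and the sketch assumes the integer helper follows the AML integer-constant rules (ZeroOp, OneOp, then BytePrefix/WordPrefix/DWordPrefix), whereas the byte helper emits its argument verbatim. For the value 0x00 both produce the same single byte, which would explain why the integer variant appears to work for a NullName target even though a raw byte is what is meant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write the AML integer-constant encoding of
 * 'value' into 'out' (at least 5 bytes) and return the byte count.
 */
size_t toy_encode_int(uint32_t value, uint8_t *out)
{
    size_t n = 0, width, i;

    assert(out != NULL);
    if (value == 0) {
        out[n++] = 0x00;            /* ZeroOp */
        return n;
    }
    if (value == 1) {
        out[n++] = 0x01;            /* OneOp */
        return n;
    }
    if (value <= 0xFF) {
        out[n++] = 0x0A;            /* BytePrefix */
        width = 1;
    } else if (value <= 0xFFFF) {
        out[n++] = 0x0B;            /* WordPrefix */
        width = 2;
    } else {
        out[n++] = 0x0C;            /* DWordPrefix */
        width = 4;
    }
    for (i = 0; i < width; i++) {
        out[n++] = (uint8_t)(value >> (8 * i));   /* little-endian data */
    }
    return n;
}

/* Hypothetical helper: emit exactly one raw byte, no prefix. */
size_t toy_encode_byte(uint8_t value, uint8_t *out)
{
    assert(out != NULL);
    out[0] = value;
    return 1;
}
```

Here toy_encode_int(0, out) and toy_encode_byte(0x00, out) both produce the single byte 00, while toy_encode_int(0x10, out) produces 0A 10, so the two helpers are only interchangeable for this one value.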
On 03/09/2015 09:58 AM, Shannon Zhao wrote:
On 2015/3/8 19:16, Marcel Apfelbaum wrote:
Add encoding for ACPI DefOr Opcode.
Reviewed-by: Igor Mammedov imammedo@redhat.com Signed-off-by: Marcel Apfelbaum marcel@redhat.com
hw/acpi/aml-build.c | 10 ++++++++++ include/hw/acpi/aml-build.h | 1 + 2 files changed, 11 insertions(+)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 0d14561..603c0c4 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -448,6 +448,16 @@ Aml *aml_and(Aml *arg1, Aml *arg2) return var; }
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefOr */
+Aml *aml_or(Aml *arg1, Aml *arg2)
+{
+    Aml *var = aml_opcode(0x7D /* OrOp */);
+    aml_append(var, arg1);
+    aml_append(var, arg2);
+    build_append_int(var->buf, 0x00); /* NullNameOp */
Hi,
Hi Shannon, thank you for your review.
I notice that MST has sent a patch which uses build_append_byte instead of build_append_int. Maybe we can fix this patch before applying it.
Sure, thank you for bringing this to my attention. Marcel
Thanks, Shannon
+    return var;
+}
Add encoding for ACPI DefAdd Opcode.
Reviewed-by: Igor Mammedov imammedo@redhat.com
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
---
 hw/acpi/aml-build.c         | 10 ++++++++++
 include/hw/acpi/aml-build.h |  1 +
 2 files changed, 11 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 603c0c4..be60f4e 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -458,6 +458,16 @@ Aml *aml_or(Aml *arg1, Aml *arg2)
     return var;
 }

+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefAdd */
+Aml *aml_add(Aml *arg1, Aml *arg2)
+{
+    Aml *var = aml_opcode(0x72 /* AddOp */);
+    aml_append(var, arg1);
+    aml_append(var, arg2);
+    build_append_int(var->buf, 0x00 /* NullNameOp */);
+    return var;
+}
+
 /* ACPI 1.0b: 16.2.5.3 Type 1 Opcodes Encoding: DefNotify */
 Aml *aml_notify(Aml *arg1, Aml *arg2)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 84d4180..991eacb 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -138,6 +138,7 @@ Aml *aml_arg(int pos);
 Aml *aml_store(Aml *val, Aml *target);
 Aml *aml_and(Aml *arg1, Aml *arg2);
 Aml *aml_or(Aml *arg1, Aml *arg2);
+Aml *aml_add(Aml *arg1, Aml *arg2);
 Aml *aml_notify(Aml *arg1, Aml *arg2);
 Aml *aml_call1(const char *method, Aml *arg1);
 Aml *aml_call2(const char *method, Aml *arg1, Aml *arg2);
Add encoding for ACPI DefLLess Opcode.
Reviewed-by: Igor Mammedov imammedo@redhat.com
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
---
 hw/acpi/aml-build.c         | 9 +++++++++
 include/hw/acpi/aml-build.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index be60f4e..49ba8c1 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -458,6 +458,15 @@ Aml *aml_or(Aml *arg1, Aml *arg2)
     return var;
 }

+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefLLess */
+Aml *aml_lless(Aml *arg1, Aml *arg2)
+{
+    Aml *var = aml_opcode(0x95 /* LLessOp */);
+    aml_append(var, arg1);
+    aml_append(var, arg2);
+    return var;
+}
+
 /* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefAdd */
 Aml *aml_add(Aml *arg1, Aml *arg2)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 991eacb..edc0520 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -138,6 +138,7 @@ Aml *aml_arg(int pos);
 Aml *aml_store(Aml *val, Aml *target);
 Aml *aml_and(Aml *arg1, Aml *arg2);
 Aml *aml_or(Aml *arg1, Aml *arg2);
+Aml *aml_lless(Aml *arg1, Aml *arg2);
 Aml *aml_add(Aml *arg1, Aml *arg2);
 Aml *aml_notify(Aml *arg1, Aml *arg2);
 Aml *aml_call1(const char *method, Aml *arg1);
On 2015/3/8 19:16, Marcel Apfelbaum wrote:
Add encoding for ACPI DefLLess Opcode.
Reviewed-by: Igor Mammedov imammedo@redhat.com Signed-off-by: Marcel Apfelbaum marcel@redhat.com
Reviewed-by: Shannon Zhao zhaoshenglong@huawei.com
hw/acpi/aml-build.c | 9 +++++++++ include/hw/acpi/aml-build.h | 1 + 2 files changed, 10 insertions(+)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index be60f4e..49ba8c1 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -458,6 +458,15 @@ Aml *aml_or(Aml *arg1, Aml *arg2)
     return var;
 }

+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefLLess */
+Aml *aml_lless(Aml *arg1, Aml *arg2)
+{
+    Aml *var = aml_opcode(0x95 /* LLessOp */);
+    aml_append(var, arg1);
+    aml_append(var, arg2);
+    return var;
+}
+
 /* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefAdd */
 Aml *aml_add(Aml *arg1, Aml *arg2)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 991eacb..edc0520 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -138,6 +138,7 @@ Aml *aml_arg(int pos);
 Aml *aml_store(Aml *val, Aml *target);
 Aml *aml_and(Aml *arg1, Aml *arg2);
 Aml *aml_or(Aml *arg1, Aml *arg2);
+Aml *aml_lless(Aml *arg1, Aml *arg2);
 Aml *aml_add(Aml *arg1, Aml *arg2);
 Aml *aml_notify(Aml *arg1, Aml *arg2);
 Aml *aml_call1(const char *method, Aml *arg1);
Add encoding for ACPI DefIndex Opcode.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
---
 hw/acpi/aml-build.c         | 13 +++++++++++++
 include/hw/acpi/aml-build.h |  1 +
 2 files changed, 14 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 49ba8c1..b3372df 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -477,6 +477,19 @@ Aml *aml_add(Aml *arg1, Aml *arg2)
     return var;
 }

+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefIndex */
+Aml *aml_index(Aml *arg1, Aml *idx)
+{
+    Aml *var = aml_opcode(0x88 /* IndexOp */);
+
+    g_assert(arg1->block_flags & (AML_PACKAGE | AML_EXT_PACKAGE | AML_BUFFER));
+
+    aml_append(var, arg1);
+    aml_append(var, idx);
+    build_append_int(var->buf, 0x00 /* NullNameOp */);
+    return var;
+}
+
 /* ACPI 1.0b: 16.2.5.3 Type 1 Opcodes Encoding: DefNotify */
 Aml *aml_notify(Aml *arg1, Aml *arg2)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index edc0520..b860732 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -140,6 +140,7 @@ Aml *aml_and(Aml *arg1, Aml *arg2);
 Aml *aml_or(Aml *arg1, Aml *arg2);
 Aml *aml_lless(Aml *arg1, Aml *arg2);
 Aml *aml_add(Aml *arg1, Aml *arg2);
+Aml *aml_index(Aml *arg1, Aml *idx);
 Aml *aml_notify(Aml *arg1, Aml *arg2);
 Aml *aml_call1(const char *method, Aml *arg1);
 Aml *aml_call2(const char *method, Aml *arg1, Aml *arg2);
On Sun, 8 Mar 2015 13:16:07 +0200 Marcel Apfelbaum marcel@redhat.com wrote:
Add encoding for ACPI DefIndex Opcode.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
hw/acpi/aml-build.c | 13 +++++++++++++ include/hw/acpi/aml-build.h | 1 + 2 files changed, 14 insertions(+)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 49ba8c1..b3372df 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -477,6 +477,19 @@ Aml *aml_add(Aml *arg1, Aml *arg2) return var; }
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefIndex */
+Aml *aml_index(Aml *arg1, Aml *idx)
+{
+    Aml *var = aml_opcode(0x88 /* IndexOp */);
+
+    g_assert(arg1->block_flags & (AML_PACKAGE | AML_EXT_PACKAGE | AML_BUFFER));

It can't be AML_EXT_PACKAGE; it's either a plain DefPackage or a DefBuffer. I'd suggest extending the check above to also test for specific opcodes, e.g. arg1->op == 0x11 /* buffer */ ...

Also, newer specs (>1.0b) allow a DefString as the argument as well.

+
+    aml_append(var, arg1);
+    aml_append(var, idx);
+    build_append_int(var->buf, 0x00 /* NullNameOp */);
+    return var;
+}
+
 /* ACPI 1.0b: 16.2.5.3 Type 1 Opcodes Encoding: DefNotify */
 Aml *aml_notify(Aml *arg1, Aml *arg2)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index edc0520..b860732 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -140,6 +140,7 @@ Aml *aml_and(Aml *arg1, Aml *arg2);
 Aml *aml_or(Aml *arg1, Aml *arg2);
 Aml *aml_lless(Aml *arg1, Aml *arg2);
 Aml *aml_add(Aml *arg1, Aml *arg2);
+Aml *aml_index(Aml *arg1, Aml *idx);
 Aml *aml_notify(Aml *arg1, Aml *arg2);
 Aml *aml_call1(const char *method, Aml *arg1);
 Aml *aml_call2(const char *method, Aml *arg1, Aml *arg2);
On Mon, Mar 09, 2015 at 11:39:57AM +0100, Igor Mammedov wrote:
On Sun, 8 Mar 2015 13:16:07 +0200 Marcel Apfelbaum marcel@redhat.com wrote:
Add encoding for ACPI DefIndex Opcode.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
hw/acpi/aml-build.c | 13 +++++++++++++ include/hw/acpi/aml-build.h | 1 + 2 files changed, 14 insertions(+)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 49ba8c1..b3372df 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -477,6 +477,19 @@ Aml *aml_add(Aml *arg1, Aml *arg2) return var; }
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefIndex */
+Aml *aml_index(Aml *arg1, Aml *idx)
+{
+    Aml *var = aml_opcode(0x88 /* IndexOp */);
+
+    g_assert(arg1->block_flags & (AML_PACKAGE | AML_EXT_PACKAGE | AML_BUFFER));

It can't be AML_EXT_PACKAGE; it's either a plain DefPackage or a DefBuffer. I'd suggest extending the check above to also test for specific opcodes, e.g. arg1->op == 0x11 /* buffer */ ...

Also, newer specs (>1.0b) allow a DefString as the argument as well.

You really can have a lot of things there, e.g. a method call will also work. I think it's best to drop these asserts at this point, and maybe blacklist specific bad configs down the road.

+
+    aml_append(var, arg1);
+    aml_append(var, idx);
+    build_append_int(var->buf, 0x00 /* NullNameOp */);
+    return var;
+}
+
 /* ACPI 1.0b: 16.2.5.3 Type 1 Opcodes Encoding: DefNotify */
 Aml *aml_notify(Aml *arg1, Aml *arg2)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index edc0520..b860732 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -140,6 +140,7 @@ Aml *aml_and(Aml *arg1, Aml *arg2);
 Aml *aml_or(Aml *arg1, Aml *arg2);
 Aml *aml_lless(Aml *arg1, Aml *arg2);
 Aml *aml_add(Aml *arg1, Aml *arg2);
+Aml *aml_index(Aml *arg1, Aml *idx);
 Aml *aml_notify(Aml *arg1, Aml *arg2);
 Aml *aml_call1(const char *method, Aml *arg1);
 Aml *aml_call2(const char *method, Aml *arg1, Aml *arg2);
On 03/09/2015 01:00 PM, Michael S. Tsirkin wrote:
On Mon, Mar 09, 2015 at 11:39:57AM +0100, Igor Mammedov wrote:
On Sun, 8 Mar 2015 13:16:07 +0200 Marcel Apfelbaum marcel@redhat.com wrote:
Add encoding for ACPI DefIndex Opcode.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
hw/acpi/aml-build.c | 13 +++++++++++++ include/hw/acpi/aml-build.h | 1 + 2 files changed, 14 insertions(+)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 49ba8c1..b3372df 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -477,6 +477,19 @@ Aml *aml_add(Aml *arg1, Aml *arg2) return var; }
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefIndex */
+Aml *aml_index(Aml *arg1, Aml *idx)
+{
+    Aml *var = aml_opcode(0x88 /* IndexOp */);
+
+    g_assert(arg1->block_flags & (AML_PACKAGE | AML_EXT_PACKAGE | AML_BUFFER));

It can't be AML_EXT_PACKAGE; it's either a plain DefPackage or a DefBuffer. I'd suggest extending the check above to also test for specific opcodes, e.g. arg1->op == 0x11 /* buffer */ ...

Also, newer specs (>1.0b) allow a DefString as the argument as well.
You really can have a lot of things there, e.g. a method call will also work. I think it's best to drop these asserts at this point, and maybe blacklist specific bad configs down the road.
OK, I'll drop the assert for now.
Thanks, Marcel
Add encoding for ACPI DefShiftLeft Opcode.
Reviewed-by: Igor Mammedov imammedo@redhat.com
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
---
 hw/acpi/aml-build.c         | 10 ++++++++++
 include/hw/acpi/aml-build.h |  1 +
 2 files changed, 11 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index b3372df..497713e 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -458,6 +458,16 @@ Aml *aml_or(Aml *arg1, Aml *arg2)
     return var;
 }

+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefShiftLeft */
+Aml *aml_shiftleft(Aml *arg1, Aml *count)
+{
+    Aml *var = aml_opcode(0x79 /* ShiftLeftOp */);
+    aml_append(var, arg1);
+    aml_append(var, count);
+    build_append_int(var->buf, 0x00); /* NullNameOp */
+    return var;
+}
+
 /* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefLLess */
 Aml *aml_lless(Aml *arg1, Aml *arg2)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index b860732..4ab0533 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -138,6 +138,7 @@ Aml *aml_arg(int pos);
 Aml *aml_store(Aml *val, Aml *target);
 Aml *aml_and(Aml *arg1, Aml *arg2);
 Aml *aml_or(Aml *arg1, Aml *arg2);
+Aml *aml_shiftleft(Aml *arg1, Aml *count);
 Aml *aml_lless(Aml *arg1, Aml *arg2);
 Aml *aml_add(Aml *arg1, Aml *arg2);
 Aml *aml_index(Aml *arg1, Aml *idx);
Add encoding for ACPI DefShiftRight Opcode.
Reviewed-by: Igor Mammedov imammedo@redhat.com
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
---
 hw/acpi/aml-build.c         | 10 ++++++++++
 include/hw/acpi/aml-build.h |  1 +
 2 files changed, 11 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 497713e..09a543c 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -468,6 +468,16 @@ Aml *aml_shiftleft(Aml *arg1, Aml *count)
     return var;
 }

+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefShiftRight */
+Aml *aml_shiftright(Aml *arg1, Aml *count)
+{
+    Aml *var = aml_opcode(0x7A /* ShiftRightOp */);
+    aml_append(var, arg1);
+    aml_append(var, count);
+    build_append_int(var->buf, 0x00); /* NullNameOp */
+    return var;
+}
+
 /* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefLLess */
 Aml *aml_lless(Aml *arg1, Aml *arg2)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 4ab0533..74a9033 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -139,6 +139,7 @@ Aml *aml_store(Aml *val, Aml *target);
 Aml *aml_and(Aml *arg1, Aml *arg2);
 Aml *aml_or(Aml *arg1, Aml *arg2);
 Aml *aml_shiftleft(Aml *arg1, Aml *count);
+Aml *aml_shiftright(Aml *arg1, Aml *count);
 Aml *aml_lless(Aml *arg1, Aml *arg2);
 Aml *aml_add(Aml *arg1, Aml *arg2);
 Aml *aml_index(Aml *arg1, Aml *idx);
Add encoding for ACPI DefIncrement Opcode.
Reviewed-by: Igor Mammedov imammedo@redhat.com
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
---
 hw/acpi/aml-build.c         | 8 ++++++++
 include/hw/acpi/aml-build.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 09a543c..6315ea0 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -497,6 +497,14 @@ Aml *aml_add(Aml *arg1, Aml *arg2)
     return var;
 }

+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefIncrement */
+Aml *aml_increment(Aml *arg)
+{
+    Aml *var = aml_opcode(0x75 /* IncrementOp */);
+    aml_append(var, arg);
+    return var;
+}
+
 /* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefIndex */
 Aml *aml_index(Aml *arg1, Aml *idx)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 74a9033..c41f400 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -142,6 +142,7 @@ Aml *aml_shiftleft(Aml *arg1, Aml *count);
 Aml *aml_shiftright(Aml *arg1, Aml *count);
 Aml *aml_lless(Aml *arg1, Aml *arg2);
 Aml *aml_add(Aml *arg1, Aml *arg2);
+Aml *aml_increment(Aml *arg);
 Aml *aml_index(Aml *arg1, Aml *idx);
 Aml *aml_notify(Aml *arg1, Aml *arg2);
 Aml *aml_call1(const char *method, Aml *arg1);
On 2015/3/8 19:16, Marcel Apfelbaum wrote:
Add encoding for ACPI DefIncrement Opcode.
Reviewed-by: Igor Mammedov imammedo@redhat.com Signed-off-by: Marcel Apfelbaum marcel@redhat.com
Reviewed-by: Shannon Zhao zhaoshenglong@huawei.com
hw/acpi/aml-build.c | 8 ++++++++ include/hw/acpi/aml-build.h | 1 + 2 files changed, 9 insertions(+)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 09a543c..6315ea0 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -497,6 +497,14 @@ Aml *aml_add(Aml *arg1, Aml *arg2)
     return var;
 }

+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefIncrement */
+Aml *aml_increment(Aml *arg)
+{
+    Aml *var = aml_opcode(0x75 /* IncrementOp */);
+    aml_append(var, arg);
+    return var;
+}
+
 /* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefIndex */
 Aml *aml_index(Aml *arg1, Aml *idx)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 74a9033..c41f400 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -142,6 +142,7 @@ Aml *aml_shiftleft(Aml *arg1, Aml *count);
 Aml *aml_shiftright(Aml *arg1, Aml *count);
 Aml *aml_lless(Aml *arg1, Aml *arg2);
 Aml *aml_add(Aml *arg1, Aml *arg2);
+Aml *aml_increment(Aml *arg);
 Aml *aml_index(Aml *arg1, Aml *idx);
 Aml *aml_notify(Aml *arg1, Aml *arg2);
 Aml *aml_call1(const char *method, Aml *arg1);
Add encoding for ACPI DefWhile Opcode.
Reviewed-by: Igor Mammedov imammedo@redhat.com
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
---
 hw/acpi/aml-build.c         | 8 ++++++++
 include/hw/acpi/aml-build.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 6315ea0..f85a0fb 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -623,6 +623,14 @@ Aml *aml_if(Aml *predicate)
     return var;
 }

+/* ACPI 1.0b: 16.2.5.3 Type 1 Opcodes Encoding: DefWhile */
+Aml *aml_while(Aml *predicate)
+{
+    Aml *var = aml_bundle(0xA2 /* WhileOp */, AML_PACKAGE);
+    aml_append(var, predicate);
+    return var;
+}
+
 /* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefMethod */
 Aml *aml_method(const char *name, int arg_count)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index c41f400..4f695a2 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -189,6 +189,7 @@ Aml *aml_scope(const char *name_format, ...) GCC_FMT_ATTR(1, 2);
 Aml *aml_device(const char *name_format, ...) GCC_FMT_ATTR(1, 2);
 Aml *aml_method(const char *name, int arg_count);
 Aml *aml_if(Aml *predicate);
+Aml *aml_while(Aml *predicate);
 Aml *aml_package(uint8_t num_elements);
 Aml *aml_buffer(void);
 Aml *aml_resource_template(void);
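For readers unfamiliar with how a package-style block such as DefWhile ends up in the byte stream, here is a rough sketch in C. The helpers toy_bundle and toy_while_example are hypothetical and only handle the one-byte PkgLength form (total below 64 bytes); they are not QEMU's aml_bundle. The layout assumed is WhileOp, PkgLength, Predicate, TermList, with PkgLength counting its own byte but not the opcode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: emit opcode, a one-byte PkgLength, then the body.
 * Returns the total number of bytes written to 'out'.
 */
size_t toy_bundle(uint8_t opcode, const uint8_t *body, size_t body_len,
                  uint8_t *out)
{
    size_t pkg_len = body_len + 1;      /* PkgLength includes itself */

    assert(pkg_len <= 0x3F);            /* one-byte PkgLength form only */
    out[0] = opcode;
    out[1] = (uint8_t)pkg_len;
    memcpy(out + 2, body, body_len);
    return body_len + 2;
}

/* Encodes: While (LLess(Local0, 4)) { Increment(Local0) } */
size_t toy_while_example(uint8_t *out)
{
    static const uint8_t body[] = {
        0x95, 0x60, 0x0A, 0x04,  /* predicate: LLessOp Local0 BytePrefix 4 */
        0x75, 0x60,              /* TermList: IncrementOp Local0 */
    };

    return toy_bundle(0xA2 /* WhileOp */, body, sizeof(body), out);
}
```

Calling toy_while_example(out) on a buffer of at least 8 bytes gives A2 07 95 60 0A 04 75 60. The predicate is simply the first thing appended inside the package, which mirrors how the patch appends the predicate to the bundle and leaves the loop body to later aml_append() calls.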
On 2015/3/8 19:16, Marcel Apfelbaum wrote:
Add encoding for ACPI DefWhile Opcode.
Reviewed-by: Igor Mammedov imammedo@redhat.com Signed-off-by: Marcel Apfelbaum marcel@redhat.com
Reviewed-by: Shannon Zhao zhaoshenglong@huawei.com
hw/acpi/aml-build.c | 8 ++++++++ include/hw/acpi/aml-build.h | 1 + 2 files changed, 9 insertions(+)
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 6315ea0..f85a0fb 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -623,6 +623,14 @@ Aml *aml_if(Aml *predicate) return var; }
+/* ACPI 1.0b: 16.2.5.3 Type 1 Opcodes Encoding: DefWhile */
+Aml *aml_while(Aml *predicate)
+{
+    Aml *var = aml_bundle(0xA2 /* WhileOp */, AML_PACKAGE);
+    aml_append(var, predicate);
+    return var;
+}
/* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefMethod */ Aml *aml_method(const char *name, int arg_count) { diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h index c41f400..4f695a2 100644 --- a/include/hw/acpi/aml-build.h +++ b/include/hw/acpi/aml-build.h @@ -189,6 +189,7 @@ Aml *aml_scope(const char *name_format, ...) GCC_FMT_ATTR(1, 2); Aml *aml_device(const char *name_format, ...) GCC_FMT_ATTR(1, 2); Aml *aml_method(const char *name, int arg_count); Aml *aml_if(Aml *predicate); +Aml *aml_while(Aml *predicate); Aml *aml_package(uint8_t num_elements); Aml *aml_buffer(void); Aml *aml_resource_template(void);
If the machine has several root busses, we need to add them to acpi in order to be properly detected by guests.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/i386/acpi-build.c | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index b94e47e..e5709e8 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -60,6 +60,8 @@
 #include "qom/qom-qobject.h"
 #include "exec/ram_addr.h"
 
+#include "qmp-commands.h"
+
 /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
  * -M pc-i440fx-2.0. Even if the actual amount of AML generated grows
  * a little bit, there should be plenty of free space since the DSDT
@@ -682,6 +684,36 @@ build_ssdt(GArray *table_data, GArray *linker,
     /* Reserve space for header */
     acpi_data_push(ssdt->buf, sizeof(AcpiTableHeader));
 
+    {
+        PciInfoList *info_list, *info;
+        Error *err = NULL;
+
+        info_list = qmp_query_pci(&err);
+        if (err) {
+            error_free(err);
+            return;
+        }
+
+        for (info = info_list; info; info = info->next) {
+            PciInfo *bus_info = info->value;
+
+            if (bus_info->bus == 0) {
+                continue;
+            }
+
+            scope = aml_scope("\\_SB");
+            dev = aml_device("PC%.02X", (uint8_t)bus_info->bus);
+            aml_append(dev, aml_name_decl("_UID",
+                       aml_string("PC%.02X", (uint8_t)bus_info->bus)));
+            aml_append(dev, aml_name_decl("_HID", aml_string("PNP0A03")));
+            aml_append(dev,
+                aml_name_decl("_BBN", aml_int((uint8_t)bus_info->bus)));
+            aml_append(scope, dev);
+            aml_append(ssdt, scope);
+        }
+        qapi_free_PciInfoList(info_list);
+    }
+
     scope = aml_scope("\\_SB.PCI0");
     /* build PCI0._CRS */
     crs = aml_resource_template();
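Editor's sketch, not part of the patch: the hunk above names each extra root bus device "PC" followed by the bus number as two hex digits, and skips bus 0. The stand-alone C below reproduces only that naming rule with the same "PC%.02X" format. extra_root_bus_name is a hypothetical helper invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the 4-character device name used for an extra
 * root bus. Returns 0 on success, or -1 for bus 0 (which the patch skips)
 * and for a buffer too small to hold the name plus its terminator.
 */
static int extra_root_bus_name(char *buf, size_t size, uint8_t bus)
{
    if (bus == 0 || size < 5) {
        return -1;
    }
    /* Precision 2 pads the hex number with a leading zero when needed. */
    snprintf(buf, size, "PC%.02X", (unsigned)bus);
    return 0;
}
```

With this rule, bus 4 becomes PC04 and bus 0x1A becomes PC1A, so every name fits the 4-character ACPI name segment.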
On Sun, Mar 08, 2015 at 01:16:12PM +0200, Marcel Apfelbaum wrote:
If the machine has several root busses, we need to add them to acpi in order to be properly detected by guests.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
hw/i386/acpi-build.c | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index b94e47e..e5709e8 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -60,6 +60,8 @@ #include "qom/qom-qobject.h" #include "exec/ram_addr.h"
+#include "qmp-commands.h"
+
 /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
  * -M pc-i440fx-2.0. Even if the actual amount of AML generated grows
  * a little bit, there should be plenty of free space since the DSDT
@@ -682,6 +684,36 @@ build_ssdt(GArray *table_data, GArray *linker,
     /* Reserve space for header */
     acpi_data_push(ssdt->buf, sizeof(AcpiTableHeader));
 
+    {
+        PciInfoList *info_list, *info;
+        Error *err = NULL;
+
+        info_list = qmp_query_pci(&err);
+        if (err) {
+            error_free(err);
+            return;
+        }
+
+        for (info = info_list; info; info = info->next) {
+            PciInfo *bus_info = info->value;
+
+            if (bus_info->bus == 0) {
+                continue;
+            }
+
+            scope = aml_scope("\\_SB");
+            dev = aml_device("PC%.02X", (uint8_t)bus_info->bus);
+            aml_append(dev, aml_name_decl("_UID",
+                       aml_string("PC%.02X", (uint8_t)bus_info->bus)));
+            aml_append(dev, aml_name_decl("_HID", aml_string("PNP0A03")));
+            aml_append(dev,
+                aml_name_decl("_BBN", aml_int((uint8_t)bus_info->bus)));
Hmm not all pci buses have hardware-assigned bus numbers, a separate segment is also an option. How about only getting your specific ones?
+            aml_append(scope, dev);
+            aml_append(ssdt, scope);
+        }
+        qapi_free_PciInfoList(info_list);
+    }
+
     scope = aml_scope("\\_SB.PCI0");
     /* build PCI0._CRS */
     crs = aml_resource_template();
-- 2.1.0
On 03/08/2015 06:10 PM, Michael S. Tsirkin wrote:
On Sun, Mar 08, 2015 at 01:16:12PM +0200, Marcel Apfelbaum wrote:
If the machine has several root busses, we need to add them to acpi in order to be properly detected by guests.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
hw/i386/acpi-build.c | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index b94e47e..e5709e8 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -60,6 +60,8 @@ #include "qom/qom-qobject.h" #include "exec/ram_addr.h"
+#include "qmp-commands.h"
 /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
  * -M pc-i440fx-2.0. Even if the actual amount of AML generated grows
  * a little bit, there should be plenty of free space since the DSDT
@@ -682,6 +684,36 @@ build_ssdt(GArray *table_data, GArray *linker, /* Reserve space for header */ acpi_data_push(ssdt->buf, sizeof(AcpiTableHeader));
- {
PciInfoList *info_list, *info;
Error *err = NULL;
info_list = qmp_query_pci(&err);
if (err) {
error_free(err);
return;
}
for (info = info_list; info; info = info->next) {
PciInfo *bus_info = info->value;
Hi,
if (bus_info->bus == 0) {
continue;
}
[1] This code works only for root buses with bus number > 0
scope = aml_scope("\\_SB");
dev = aml_device("PC%.02X", (uint8_t)bus_info->bus);
aml_append(dev, aml_name_decl("_UID",
aml_string("PC%.02X", (uint8_t)bus_info->bus)));
aml_append(dev, aml_name_decl("_HID", aml_string("PNP0A03")));
aml_append(dev,
aml_name_decl("_BBN", aml_int((uint8_t)bus_info->bus)));
Hmm not all pci buses have hardware-assigned bus numbers, a separate segment is also an option. How about only getting your specific ones?
As you can see in [1], we only deal with buses that have hardware-assigned bus numbers, and if the bus number is > 0, we *must* supply BBN. So in the end, it works only for our PXBs.
Thanks, Marcel
aml_append(scope, dev);
aml_append(ssdt, scope);
}
qapi_free_PciInfoList(info_list);
- }
scope = aml_scope("\\_SB.PCI0"); /* build PCI0._CRS */ crs = aml_resource_template();
-- 2.1.0
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/i386/acpi-build.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index e5709e8..e7a1a36 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -664,6 +664,70 @@ static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus,
     aml_append(parent_scope, method);
 }
 
+static Aml *build_link(Aml *lnk, Aml *lnk_pkg, const char *link_name, int idx)
+{
+    Aml *if_ctx, *pkg;
+
+    if_ctx = aml_if(aml_equal(lnk, aml_int(idx)));
+    pkg = aml_package(4);
+    aml_append(pkg, aml_int(0));
+    aml_append(pkg, aml_int(0));
+    aml_append(pkg, aml_name(link_name, ""));
+    aml_append(pkg, aml_int(0));
+    aml_append(if_ctx, aml_store(pkg, lnk_pkg));
+
+    return if_ctx;
+}
+
+static Aml *build_prt(void)
+{
+    Aml *method, *while_ctx, *pin, *slot, *lnk, *pkg, *res;
+
+    method = aml_method("_PRT", 0);
+    res = aml_local(0);
+    pin = aml_local(1);
+    slot = aml_local(2);
+    lnk = aml_local(3);
+    pkg = aml_local(4);
+
+    aml_append(method, aml_store(aml_package(128), res));
+    aml_append(method, aml_store(aml_int(0), pin));
+
+    /* while (pin < 128) */
+    while_ctx = aml_while(aml_lless(pin, aml_int(128)));
+    {
+        /* slot = pin >> 2 */
+        aml_append(while_ctx,
+            aml_store(aml_shiftright(pin, aml_int(2)), slot));
+        /* lnk = (slot + pin) & 3 */
+        aml_append(while_ctx,
+            aml_store(aml_and(aml_add(pin, slot), aml_int(3)), lnk));
+
+        /* pkg[2] = "LNK[A|B|C|D]", selection based on lnk % 3 */
+        aml_append(while_ctx, build_link(lnk, pkg, "LNKD", 0));
+        aml_append(while_ctx, build_link(lnk, pkg, "LNKA", 1));
+        aml_append(while_ctx, build_link(lnk, pkg, "LNKB", 2));
+        aml_append(while_ctx, build_link(lnk, pkg, "LNKC", 3));
+
+        /* pkg[0] = 0x[slot]FFFF */
+        aml_append(while_ctx,
+            aml_store(aml_or(aml_shiftleft(slot, aml_int(16)), aml_int(0xFFFF)),
+                      aml_index(pkg, aml_int(0))));
+        /* pkg[1] = pin & 3 */
+        aml_append(while_ctx,
+            aml_store(aml_and(pin, aml_int(3)), aml_index(pkg, aml_int(1))));
+        /* res[pin] = pkg */
+        aml_append(while_ctx, aml_store(pkg, aml_index(res, pin)));
+        /* pin++ */
+        aml_append(while_ctx, aml_increment(pin));
+    }
+    aml_append(method, while_ctx);
+    /* return res*/
+    aml_append(method, aml_return(res));
+
+    return method;
+}
+
 static void
 build_ssdt(GArray *table_data, GArray *linker,
            AcpiCpuInfo *cpu, AcpiPmInfo *pm, AcpiMiscInfo *misc,
@@ -708,6 +772,7 @@ build_ssdt(GArray *table_data, GArray *linker,
             aml_append(dev, aml_name_decl("_HID", aml_string("PNP0A03")));
             aml_append(dev,
                 aml_name_decl("_BBN", aml_int((uint8_t)bus_info->bus)));
+            aml_append(dev, build_prt());
             aml_append(scope, dev);
             aml_append(ssdt, scope);
         }
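Editor's sketch, not part of the patch: the generated _PRT method fills a 128-entry package at run time in the guest. The stand-alone C below computes the same values on the host so the mapping is easier to follow: slot = index >> 2, pin = index & 3, and the link is chosen by (slot + index) & 3 in the order LNKD, LNKA, LNKB, LNKC. PrtEntry and prt_entry are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct PrtEntry {
    uint32_t address;   /* (slot << 16) | 0xFFFF, as in the patch comment */
    uint8_t pin;        /* index & 3 */
    const char *link;   /* name of the selected link device */
} PrtEntry;

/* Compute the routing entry for index idx (0..127), mirroring build_prt(). */
static PrtEntry prt_entry(unsigned idx)
{
    /* Same order as the build_link() calls: 0 -> LNKD, 1 -> LNKA, ... */
    static const char *const links[4] = { "LNKD", "LNKA", "LNKB", "LNKC" };
    unsigned slot = idx >> 2;
    PrtEntry e;

    assert(idx < 128);
    e.address = ((uint32_t)slot << 16) | 0xFFFF;
    e.pin = (uint8_t)(idx & 3);
    e.link = links[(slot + idx) & 3];
    return e;
}
```

For instance, index 5 is slot 1, pin 1, and (1 + 5) & 3 = 2, which selects LNKB with address 0x0001FFFF.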
On Sun, Mar 08, 2015 at 01:16:13PM +0200, Marcel Apfelbaum wrote:
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
hw/i386/acpi-build.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index e5709e8..e7a1a36 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -664,6 +664,70 @@ static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus, aml_append(parent_scope, method); }
+static Aml *build_link(Aml *lnk, Aml *lnk_pkg, const char *link_name, int idx)
Pls document arguments, and generally make this function more readable.
+{
- Aml *if_ctx, *pkg;
- if_ctx = aml_if(aml_equal(lnk, aml_int(idx)));
- pkg = aml_package(4);
These assignments can be part of declarations.
- aml_append(pkg, aml_int(0));
- aml_append(pkg, aml_int(0));
- aml_append(pkg, aml_name(link_name, ""));
this should be aml_name(link_name) I think?
- aml_append(pkg, aml_int(0));
- aml_append(if_ctx, aml_store(pkg, lnk_pkg));
- return if_ctx;
+}
+static Aml *build_prt(void) +{
- Aml *method, *while_ctx, *pin, *slot, *lnk, *pkg, *res;
- method = aml_method("_PRT", 0);
- res = aml_local(0);
- pin = aml_local(1);
- slot = aml_local(2);
- lnk = aml_local(3);
- pkg = aml_local(4);
- aml_append(method, aml_store(aml_package(128), res));
- aml_append(method, aml_store(aml_int(0), pin));
- /* while (pin < 128) */
- while_ctx = aml_while(aml_lless(pin, aml_int(128)));
- {
/* slot = pin >> 2 */
aml_append(while_ctx,
aml_store(aml_shiftright(pin, aml_int(2)), slot));
/* lnk = (slot + pin) & 3 */
aml_append(while_ctx,
aml_store(aml_and(aml_add(pin, slot), aml_int(3)), lnk));
/* pkg[2] = "LNK[A|B|C|D]", selection based on lnk % 3 */
in fact pkg here is not same as pkg above, is it?
aml_append(while_ctx, build_link(lnk, pkg, "LNKD", 0));
aml_append(while_ctx, build_link(lnk, pkg, "LNKA", 1));
aml_append(while_ctx, build_link(lnk, pkg, "LNKB", 2));
aml_append(while_ctx, build_link(lnk, pkg, "LNKC", 3));
/* pkg[0] = 0x[slot]FFFF */
aml_append(while_ctx,
aml_store(aml_or(aml_shiftleft(slot, aml_int(16)), aml_int(0xFFFF)),
aml_index(pkg, aml_int(0))));
/* pkg[1] = pin & 3 */
aml_append(while_ctx,
aml_store(aml_and(pin, aml_int(3)), aml_index(pkg, aml_int(1))));
/* res[pin] = pkg */
aml_append(while_ctx, aml_store(pkg, aml_index(res, pin)));
/* pin++ */
aml_append(while_ctx, aml_increment(pin));
- }
- aml_append(method, while_ctx);
- /* return res*/
- aml_append(method, aml_return(res));
- return method;
+}
static void build_ssdt(GArray *table_data, GArray *linker, AcpiCpuInfo *cpu, AcpiPmInfo *pm, AcpiMiscInfo *misc, @@ -708,6 +772,7 @@ build_ssdt(GArray *table_data, GArray *linker, aml_append(dev, aml_name_decl("_HID", aml_string("PNP0A03"))); aml_append(dev, aml_name_decl("_BBN", aml_int((uint8_t)bus_info->bus)));
aml_append(dev, build_prt()); aml_append(scope, dev); aml_append(ssdt, scope); }
-- 2.1.0
On 03/10/2015 07:07 PM, Michael S. Tsirkin wrote:
On Sun, Mar 08, 2015 at 01:16:13PM +0200, Marcel Apfelbaum wrote:
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
hw/i386/acpi-build.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index e5709e8..e7a1a36 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -664,6 +664,70 @@ static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus, aml_append(parent_scope, method); }
+static Aml *build_link(Aml *lnk, Aml *lnk_pkg, const char *link_name, int idx)
Pls document arguments, and generally make this function more readable.
I'll think of something
+{
- Aml *if_ctx, *pkg;
- if_ctx = aml_if(aml_equal(lnk, aml_int(idx)));
- pkg = aml_package(4);
These assignments can be part of declarations.
OK
- aml_append(pkg, aml_int(0));
- aml_append(pkg, aml_int(0));
- aml_append(pkg, aml_name(link_name, ""));
this should be aml_name(link_name) I think?
It doesn't work without the "" argument; it works only for literals. If you have any idea how to make it work, I'll be glad to change it.
- aml_append(pkg, aml_int(0));
- aml_append(if_ctx, aml_store(pkg, lnk_pkg));
- return if_ctx;
+}
+static Aml *build_prt(void) +{
- Aml *method, *while_ctx, *pin, *slot, *lnk, *pkg, *res;
- method = aml_method("_PRT", 0);
- res = aml_local(0);
- pin = aml_local(1);
- slot = aml_local(2);
- lnk = aml_local(3);
- pkg = aml_local(4);
- aml_append(method, aml_store(aml_package(128), res));
- aml_append(method, aml_store(aml_int(0), pin));
- /* while (pin < 128) */
- while_ctx = aml_while(aml_lless(pin, aml_int(128)));
- {
/* slot = pin >> 2 */
aml_append(while_ctx,
aml_store(aml_shiftright(pin, aml_int(2)), slot));
/* lnk = (slot + pin) & 3 */
aml_append(while_ctx,
aml_store(aml_and(aml_add(pin, slot), aml_int(3)), lnk));
/* pkg[2] = "LNK[A|B|C|D]", selection based on lnk % 3 */
in fact pkg here is not same as pkg above, is it?
The same, but I can move the declaration to the while block.
Thanks, Marcel
aml_append(while_ctx, build_link(lnk, pkg, "LNKD", 0));
aml_append(while_ctx, build_link(lnk, pkg, "LNKA", 1));
aml_append(while_ctx, build_link(lnk, pkg, "LNKB", 2));
aml_append(while_ctx, build_link(lnk, pkg, "LNKC", 3));
/* pkg[0] = 0x[slot]FFFF */
aml_append(while_ctx,
aml_store(aml_or(aml_shiftleft(slot, aml_int(16)), aml_int(0xFFFF)),
aml_index(pkg, aml_int(0))));
/* pkg[1] = pin & 3 */
aml_append(while_ctx,
aml_store(aml_and(pin, aml_int(3)), aml_index(pkg, aml_int(1))));
/* res[pin] = pkg */
aml_append(while_ctx, aml_store(pkg, aml_index(res, pin)));
/* pin++ */
aml_append(while_ctx, aml_increment(pin));
- }
- aml_append(method, while_ctx);
- /* return res*/
- aml_append(method, aml_return(res));
- return method;
+}
- static void build_ssdt(GArray *table_data, GArray *linker, AcpiCpuInfo *cpu, AcpiPmInfo *pm, AcpiMiscInfo *misc,
@@ -708,6 +772,7 @@ build_ssdt(GArray *table_data, GArray *linker, aml_append(dev, aml_name_decl("_HID", aml_string("PNP0A03"))); aml_append(dev, aml_name_decl("_BBN", aml_int((uint8_t)bus_info->bus)));
aml_append(dev, build_prt()); aml_append(scope, dev); aml_append(ssdt, scope); }
-- 2.1.0
On Tue, Mar 10, 2015 at 07:26:13PM +0200, Marcel Apfelbaum wrote:
On 03/10/2015 07:07 PM, Michael S. Tsirkin wrote:
On Sun, Mar 08, 2015 at 01:16:13PM +0200, Marcel Apfelbaum wrote:
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
hw/i386/acpi-build.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 65 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index e5709e8..e7a1a36 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -664,6 +664,70 @@ static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus, aml_append(parent_scope, method); }
+static Aml *build_link(Aml *lnk, Aml *lnk_pkg, const char *link_name, int idx)
Pls document arguments, and generally make this function more readable.
I'll think of something
+{
- Aml *if_ctx, *pkg;
- if_ctx = aml_if(aml_equal(lnk, aml_int(idx)));
- pkg = aml_package(4);
These assignments can be part of declarations.
OK
- aml_append(pkg, aml_int(0));
- aml_append(pkg, aml_int(0));
- aml_append(pkg, aml_name(link_name, ""));
this should be aml_name(link_name) I think?
It doesn't work without the "" argument; it works only for literals. If you have any idea how to make it work, I'll be glad to change it.
So use "%s" format then.
- aml_append(pkg, aml_int(0));
- aml_append(if_ctx, aml_store(pkg, lnk_pkg));
- return if_ctx;
+}
+static Aml *build_prt(void) +{
- Aml *method, *while_ctx, *pin, *slot, *lnk, *pkg, *res;
- method = aml_method("_PRT", 0);
- res = aml_local(0);
- pin = aml_local(1);
- slot = aml_local(2);
- lnk = aml_local(3);
- pkg = aml_local(4);
- aml_append(method, aml_store(aml_package(128), res));
- aml_append(method, aml_store(aml_int(0), pin));
- /* while (pin < 128) */
- while_ctx = aml_while(aml_lless(pin, aml_int(128)));
- {
/* slot = pin >> 2 */
aml_append(while_ctx,
aml_store(aml_shiftright(pin, aml_int(2)), slot));
/* lnk = (slot + pin) & 3 */
aml_append(while_ctx,
aml_store(aml_and(aml_add(pin, slot), aml_int(3)), lnk));
/* pkg[2] = "LNK[A|B|C|D]", selection based on lnk % 3 */
in fact pkg here is not same as pkg above, is it?
The same, but I can move the declaration to the while block.
Thanks, Marcel
aml_append(while_ctx, build_link(lnk, pkg, "LNKD", 0));
aml_append(while_ctx, build_link(lnk, pkg, "LNKA", 1));
aml_append(while_ctx, build_link(lnk, pkg, "LNKB", 2));
aml_append(while_ctx, build_link(lnk, pkg, "LNKC", 3));
/* pkg[0] = 0x[slot]FFFF */
aml_append(while_ctx,
aml_store(aml_or(aml_shiftleft(slot, aml_int(16)), aml_int(0xFFFF)),
aml_index(pkg, aml_int(0))));
/* pkg[1] = pin & 3 */
aml_append(while_ctx,
aml_store(aml_and(pin, aml_int(3)), aml_index(pkg, aml_int(1))));
/* res[pin] = pkg */
aml_append(while_ctx, aml_store(pkg, aml_index(res, pin)));
/* pin++ */
aml_append(while_ctx, aml_increment(pin));
- }
- aml_append(method, while_ctx);
- /* return res*/
- aml_append(method, aml_return(res));
- return method;
+}
static void build_ssdt(GArray *table_data, GArray *linker, AcpiCpuInfo *cpu, AcpiPmInfo *pm, AcpiMiscInfo *misc, @@ -708,6 +772,7 @@ build_ssdt(GArray *table_data, GArray *linker, aml_append(dev, aml_name_decl("_HID", aml_string("PNP0A03"))); aml_append(dev, aml_name_decl("_BBN", aml_int((uint8_t)bus_info->bus)));
aml_append(dev, build_prt()); aml_append(scope, dev); aml_append(ssdt, scope); }
-- 2.1.0
On 2015/3/11 1:40, Michael S. Tsirkin wrote:
On Tue, Mar 10, 2015 at 07:26:13PM +0200, Marcel Apfelbaum wrote:
On 03/10/2015 07:07 PM, Michael S. Tsirkin wrote:
On Sun, Mar 08, 2015 at 01:16:13PM +0200, Marcel Apfelbaum wrote:
>Signed-off-by: Marcel Apfelbaum marcel@redhat.com >--- > hw/i386/acpi-build.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 65 insertions(+) > >diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c >index e5709e8..e7a1a36 100644 >--- a/hw/i386/acpi-build.c >+++ b/hw/i386/acpi-build.c >@@ -664,6 +664,70 @@ static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus, > aml_append(parent_scope, method); > } > >+static Aml *build_link(Aml *lnk, Aml *lnk_pkg, const char *link_name, int idx)
Pls document arguments, and generally make this function more readable.
I'll think of something
>+{ >+ Aml *if_ctx, *pkg; >+ >+ if_ctx = aml_if(aml_equal(lnk, aml_int(idx))); >+ pkg = aml_package(4);
These assignments can be part of declarations.
OK
>+ aml_append(pkg, aml_int(0)); >+ aml_append(pkg, aml_int(0)); >+ aml_append(pkg, aml_name(link_name, ""));
this should be aml_name(link_name) I think?
It doesn't work without the "" argument; it works only for literals. If you have any idea how to make it work, I'll be glad to change it.
So use "%s" format then.
Yes, use aml_name("%s", link_name) it should work.
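Editor's sketch, not part of the thread: the "%s" suggestion is the usual way to pass a runtime string to a printf-style function. It is an assumption here that aml_name() takes a format string like aml_scope() and aml_device() do in the header shown earlier, and that the failure without the extra "" argument was a compiler format check on a non-literal format. The stand-alone C below uses a hypothetical format_name helper to show that routing the string through "%s" copies it verbatim, including any '%' characters it might contain.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical printf-style helper standing in for a name builder. */
static int format_name(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;   /* number of characters the full result needs */
}
```

A call such as format_name(buf, sizeof(buf), "%s", link_name) keeps the format a literal, so the compiler can check it and the contents of link_name are never interpreted as conversion specifiers.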
Save the IO/mem/bus numbers ranges assigned to the extra root busses to be removed from the root bus 0 range.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/i386/acpi-build.c | 149 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 149 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index e7a1a36..f4d8816 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -728,6 +728,148 @@ static Aml *build_prt(void)
     return method;
 }
 
+typedef struct PciRangeEntry {
+    QLIST_ENTRY(PciRangeEntry) entry;
+    int64_t base;
+    int64_t limit;
+} PciRangeEntry;
+
+typedef QLIST_HEAD(PciRangeQ, PciRangeEntry) PciRangeQ;
+
+static void pci_range_insert(PciRangeQ *list, int64_t base, int64_t limit)
+{
+    PciRangeEntry *entry, *next, *e;
+
+    if (!base) {
+        return;
+    }
+
+    e = g_malloc(sizeof(*entry));
+    e->base = base;
+    e->limit = limit;
+
+    if (QLIST_EMPTY(list)) {
+        QLIST_INSERT_HEAD(list, e, entry);
+    } else {
+        QLIST_FOREACH_SAFE(entry, list, entry, next) {
+            if (base < entry->base) {
+                QLIST_INSERT_BEFORE(entry, e, entry);
+                break;
+            } else if (!next) {
+                QLIST_INSERT_AFTER(entry, e, entry);
+                break;
+            }
+        }
+    }
+}
+
+static void pci_range_list_free(PciRangeQ *list)
+{
+    PciRangeEntry *entry, *next;
+
+    QLIST_FOREACH_SAFE(entry, list, entry, next) {
+        QLIST_REMOVE(entry, entry);
+        g_free(entry);
+    }
+}
+
+static Aml *build_crs(PcPciInfo *pci, PciInfo *bus_info,
+                      PciRangeQ *io_ranges, PciRangeQ *mem_ranges)
+{
+    PciDeviceInfoList *dev_list;
+    PciMemoryRange range;
+    uint8_t max_bus;
+    Aml *crs;
+
+    crs = aml_resource_template();
+    max_bus = bus_info->bus;
+
+    for (dev_list = bus_info->devices; dev_list; dev_list = dev_list->next) {
+        PciMemoryRegionList *region;
+
+        for (region = dev_list->value->regions; region; region = region->next) {
+            range.base = region->value->address;
+            range.limit = region->value->address + region->value->size - 1;
+
+            if (!strcmp(region->value->type, "io")) {
+                aml_append(crs,
+                    aml_word_io(aml_min_fixed, aml_max_fixed,
+                                aml_pos_decode, aml_entire_range,
+                                0,
+                                range.base,
+                                range.limit,
+                                0,
+                                range.limit - range.base + 1));
+                pci_range_insert(io_ranges, range.base, range.limit);
+            } else { /* "memory" */
+                aml_append(crs,
+                    aml_dword_memory(aml_pos_decode, aml_min_fixed,
+                                     aml_max_fixed, aml_non_cacheable,
+                                     aml_ReadWrite,
+                                     0,
+                                     range.base,
+                                     range.limit,
+                                     0,
+                                     range.limit - range.base + 1));
+                pci_range_insert(mem_ranges, range.base, range.limit);
+            }
+        }
+
+        if (dev_list->value->has_pci_bridge) {
+            PciBridgeInfo *bridge_info = dev_list->value->pci_bridge;
+
+            if (bridge_info->bus.subordinate > max_bus) {
+                max_bus = bridge_info->bus.subordinate;
+            }
+
+            range = *bridge_info->bus.io_range;
+            aml_append(crs,
+                aml_word_io(aml_min_fixed, aml_max_fixed,
+                            aml_pos_decode, aml_entire_range,
+                            0,
+                            range.base,
+                            range.limit,
+                            0,
+                            range.limit - range.base + 1));
+            pci_range_insert(io_ranges, range.base, range.limit);
+
+            range = *bridge_info->bus.memory_range;
+            aml_append(crs,
+                aml_dword_memory(aml_pos_decode, aml_min_fixed,
+                                 aml_max_fixed, aml_non_cacheable,
+                                 aml_ReadWrite,
+                                 0,
+                                 range.base,
+                                 range.limit,
+                                 0,
+                                 range.limit - range.base + 1));
+            pci_range_insert(mem_ranges, range.base, range.limit);
+
+            range = *bridge_info->bus.prefetchable_range;
+            aml_append(crs,
+                aml_dword_memory(aml_pos_decode, aml_min_fixed,
+                                 aml_max_fixed, aml_non_cacheable,
+                                 aml_ReadWrite,
+                                 0,
+                                 range.base,
+                                 range.limit,
+                                 0,
+                                 range.limit - range.base + 1));
+            pci_range_insert(mem_ranges, range.base, range.limit);
+        }
+    }
+
+    aml_append(crs,
+        aml_word_bus_number(aml_min_fixed, aml_max_fixed, aml_pos_decode,
+                            0,
+                            bus_info->bus,
+                            max_bus,
+                            0,
+                            max_bus - bus_info->bus + 1));
+
+    return crs;
+}
+
 static void
 build_ssdt(GArray *table_data, GArray *linker,
            AcpiCpuInfo *cpu, AcpiPmInfo *pm, AcpiMiscInfo *misc,
@@ -737,6 +879,8 @@ build_ssdt(GArray *table_data, GArray *linker,
     uint32_t nr_mem = machine->ram_slots;
     unsigned acpi_cpus = guest_info->apic_id_limit;
     Aml *ssdt, *sb_scope, *scope, *pkg, *dev, *method, *crs, *field, *ifctx;
+    PciRangeQ io_ranges = QLIST_HEAD_INITIALIZER(io_ranges);
+    PciRangeQ mem_ranges = QLIST_HEAD_INITIALIZER(mem_ranges);
     int i;
 
     ssdt = init_aml_allocator();
@@ -773,9 +917,14 @@ build_ssdt(GArray *table_data, GArray *linker,
             aml_append(dev,
                 aml_name_decl("_BBN", aml_int((uint8_t)bus_info->bus)));
             aml_append(dev, build_prt());
+            crs = build_crs(pci, bus_info, &io_ranges, &mem_ranges);
+            aml_append(dev, aml_name_decl("_CRS", crs));
             aml_append(scope, dev);
             aml_append(ssdt, scope);
         }
+
+        pci_range_list_free(&io_ranges);
+        pci_range_list_free(&mem_ranges);
         qapi_free_PciInfoList(info_list);
     }
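Editor's sketch, not part of the patch: build_crs() reports a bus number range that starts at the root bus number and ends at the highest subordinate bus number seen on any bridge behind it, with a length of max - min + 1. The stand-alone C below isolates that arithmetic. BusRange and bus_number_range are hypothetical names for this illustration, and the subordinate numbers are passed as a plain array rather than read from QMP data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct BusRange {
    uint8_t min;        /* root bus number */
    uint8_t max;        /* highest subordinate bus number found */
    unsigned length;    /* max - min + 1 */
} BusRange;

/* Mirror of the max_bus tracking in build_crs(). */
static BusRange bus_number_range(uint8_t root_bus,
                                 const uint8_t *subordinates, size_t count)
{
    BusRange r;
    uint8_t max_bus = root_bus;
    size_t i;

    for (i = 0; i < count; i++) {
        if (subordinates[i] > max_bus) {
            max_bus = subordinates[i];
        }
    }
    r.min = root_bus;
    r.max = max_bus;
    r.length = (unsigned)max_bus - root_bus + 1;
    return r;
}
```

A root bus 4 with bridges whose subordinate numbers are 5, 7 and 6 yields the range 4..7 with length 4, while a root bus with no bridges yields a range of length 1.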
On Sun, Mar 08, 2015 at 01:16:14PM +0200, Marcel Apfelbaum wrote:
Save the IO/mem/bus numbers ranges assigned to the extra root busses to be removed from the root bus 0 range.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
hw/i386/acpi-build.c | 149 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 149 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index e7a1a36..f4d8816 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -728,6 +728,148 @@ static Aml *build_prt(void) return method; }
+typedef struct PciRangeEntry {
- QLIST_ENTRY(PciRangeEntry) entry;
- int64_t base;
- int64_t limit;
+} PciRangeEntry;
+typedef QLIST_HEAD(PciRangeQ, PciRangeEntry) PciRangeQ;
+static void pci_range_insert(PciRangeQ *list, int64_t base, int64_t limit)
Don't start names with pci_ or Pci prefixes; those are for pci things. Signed values for base/limit might be problematic, even though currently guests don't normally assign such values. I know it's a qmp bug/feature, but you don't have to use qmp.
+{
- PciRangeEntry *entry, *next, *e;
- if (!base) {
return;
- }
- e = g_malloc(sizeof(*entry));
- e->base = base;
- e->limit = limit;
- if (QLIST_EMPTY(list)) {
QLIST_INSERT_HEAD(list, e, entry);
- } else {
QLIST_FOREACH_SAFE(entry, list, entry, next) {
if (base < entry->base) {
QLIST_INSERT_BEFORE(entry, e, entry);
break;
} else if (!next) {
QLIST_INSERT_AFTER(entry, e, entry);
break;
}
}
- }
+}
+static void pci_range_list_free(PciRangeQ *list) +{
- PciRangeEntry *entry, *next;
- QLIST_FOREACH_SAFE(entry, list, entry, next) {
QLIST_REMOVE(entry, entry);
g_free(entry);
- }
+}
Not very happy about manual memory management here. Isn't there something you can do with
And how about using g_array_sort to sort things?
+static Aml *build_crs(PcPciInfo *pci, PciInfo *bus_info,
PciRangeQ *io_ranges, PciRangeQ *mem_ranges)
+{
- PciDeviceInfoList *dev_list;
- PciMemoryRange range;
- uint8_t max_bus;
- Aml *crs;
- crs = aml_resource_template();
- max_bus = bus_info->bus;
- for (dev_list = bus_info->devices; dev_list; dev_list = dev_list->next) {
PciMemoryRegionList *region;
for (region = dev_list->value->regions; region; region = region->next) {
range.base = region->value->address;
range.limit = region->value->address + region->value->size - 1;
if (!strcmp(region->value->type, "io")) {
aml_append(crs,
aml_word_io(aml_min_fixed, aml_max_fixed,
aml_pos_decode, aml_entire_range,
0,
range.base,
range.limit,
0,
range.limit - range.base + 1));
pci_range_insert(io_ranges, range.base, range.limit);
} else { /* "memory" */
aml_append(crs,
aml_dword_memory(aml_pos_decode, aml_min_fixed,
aml_max_fixed, aml_non_cacheable,
aml_ReadWrite,
0,
range.base,
range.limit,
0,
range.limit - range.base + 1));
pci_range_insert(mem_ranges, range.base, range.limit);
}
}
if (dev_list->value->has_pci_bridge) {
PciBridgeInfo *bridge_info = dev_list->value->pci_bridge;
if (bridge_info->bus.subordinate > max_bus) {
What's this doing?
max_bus = bridge_info->bus.subordinate;
}
range = *bridge_info->bus.io_range;
aml_append(crs,
aml_word_io(aml_min_fixed, aml_max_fixed,
aml_pos_decode, aml_entire_range,
0,
range.base,
range.limit,
0,
range.limit - range.base + 1));
pci_range_insert(io_ranges, range.base, range.limit);
range = *bridge_info->bus.memory_range;
aml_append(crs,
aml_dword_memory(aml_pos_decode, aml_min_fixed,
aml_max_fixed, aml_non_cacheable,
aml_ReadWrite,
0,
range.base,
range.limit,
0,
range.limit - range.base + 1));
pci_range_insert(mem_ranges, range.base, range.limit);
range = *bridge_info->bus.prefetchable_range;
aml_append(crs,
aml_dword_memory(aml_pos_decode, aml_min_fixed,
aml_max_fixed, aml_non_cacheable,
aml_ReadWrite,
0,
range.base,
range.limit,
0,
range.limit - range.base + 1));
pci_range_insert(mem_ranges, range.base, range.limit);
}
- }
- aml_append(crs,
aml_word_bus_number(aml_min_fixed, aml_max_fixed, aml_pos_decode,
0,
bus_info->bus,
max_bus,
0,
max_bus - bus_info->bus + 1));
- return crs;
+}
 static void
 build_ssdt(GArray *table_data, GArray *linker,
            AcpiCpuInfo *cpu, AcpiPmInfo *pm, AcpiMiscInfo *misc,
@@ -737,6 +879,8 @@ build_ssdt(GArray *table_data, GArray *linker,
     uint32_t nr_mem = machine->ram_slots;
     unsigned acpi_cpus = guest_info->apic_id_limit;
     Aml *ssdt, *sb_scope, *scope, *pkg, *dev, *method, *crs, *field, *ifctx;
+    PciRangeQ io_ranges = QLIST_HEAD_INITIALIZER(io_ranges);
+    PciRangeQ mem_ranges = QLIST_HEAD_INITIALIZER(mem_ranges);
     int i;
 
     ssdt = init_aml_allocator();
@@ -773,9 +917,14 @@ build_ssdt(GArray *table_data, GArray *linker,
             aml_append(dev,
                 aml_name_decl("_BBN", aml_int((uint8_t)bus_info->bus)));
             aml_append(dev, build_prt());
+            crs = build_crs(pci, bus_info, &io_ranges, &mem_ranges);
+            aml_append(dev, aml_name_decl("_CRS", crs));
             aml_append(scope, dev);
             aml_append(ssdt, scope);
         }
+
+        pci_range_list_free(&io_ranges);
+        pci_range_list_free(&mem_ranges);
         qapi_free_PciInfoList(info_list);
     }
-- 2.1.0
On 03/08/2015 06:27 PM, Michael S. Tsirkin wrote:
On Sun, Mar 08, 2015 at 01:16:14PM +0200, Marcel Apfelbaum wrote:
Save the IO/mem/bus numbers ranges assigned to the extra root busses to be removed from the root bus 0 range.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
hw/i386/acpi-build.c | 149 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 149 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index e7a1a36..f4d8816 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -728,6 +728,148 @@ static Aml *build_prt(void) return method; }
+typedef struct PciRangeEntry {
- QLIST_ENTRY(PciRangeEntry) entry;
- int64_t base;
- int64_t limit;
+} PciRangeEntry;
+typedef QLIST_HEAD(PciRangeQ, PciRangeEntry) PciRangeQ;
+static void pci_range_insert(PciRangeQ *list, int64_t base, int64_t limit)
Don't start names with pci_ or Pci prefixes; those are for pci things. Signed values for base/limit might be problematic, even though currently guests don't normally assign such values. I know it's a qmp bug/feature, but you don't have to use qmp.
Sure, no problem, I'll take care of it.
+{
- PciRangeEntry *entry, *next, *e;
- if (!base) {
return;
- }
- e = g_malloc(sizeof(*entry));
- e->base = base;
- e->limit = limit;
- if (QLIST_EMPTY(list)) {
QLIST_INSERT_HEAD(list, e, entry);
- } else {
QLIST_FOREACH_SAFE(entry, list, entry, next) {
if (base < entry->base) {
QLIST_INSERT_BEFORE(entry, e, entry);
break;
} else if (!next) {
QLIST_INSERT_AFTER(entry, e, entry);
break;
}
}
- }
+}
+static void pci_range_list_free(PciRangeQ *list) +{
- PciRangeEntry *entry, *next;
- QLIST_FOREACH_SAFE(entry, list, entry, next) {
QLIST_REMOVE(entry, entry);
g_free(entry);
- }
+}
Not very happy about manual memory management here. Isn't there something you can do with
The context is pretty simple:
1. We create the ranges and add them to the lists as we go.
2. We use the lists to create the aml IO ranges.
3. We delete them once we finish.
All of that happens in the same context (the same chunk of code). It seems pretty straightforward to me.
And how about using g_array_sort to sort things?
We can, but the code simply inserts the range into the appropriate position, so why would using g_array_sort be better? Maybe if we find another reason to use g_array, we can leverage this method. I thought about it and I had a problem with g_array, however I don't remember it now.
And BTW, regarding manual memory management: we would still need to create and destroy the g_array.
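For readers following the list-versus-array question, here is a minimal sketch of the insert-at-sorted-position approach in plain C. It is not the patch's code: it uses a hand-rolled singly linked list instead of QEMU's QLIST macros, the names RangeNode, range_insert_sorted and range_list_free are made up for illustration, and the fields are unsigned to sidestep the signed base/limit concern raised earlier in the review.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for PciRangeEntry, with unsigned bounds. */
typedef struct RangeNode {
    uint64_t base;
    uint64_t limit;            /* inclusive */
    struct RangeNode *next;
} RangeNode;

/* Insert [base, limit] so the list stays sorted by ascending base. */
static void range_insert_sorted(RangeNode **head, uint64_t base,
                                uint64_t limit)
{
    RangeNode *node;
    RangeNode **link = head;

    assert(base <= limit);
    node = malloc(sizeof(*node));
    assert(node != NULL);
    node->base = base;
    node->limit = limit;

    /* Walk the links until the first entry with a larger base. */
    while (*link && (*link)->base <= base) {
        link = &(*link)->next;
    }
    node->next = *link;
    *link = node;
}

/* Free every node and leave the head pointer NULL. */
static void range_list_free(RangeNode **head)
{
    while (*head) {
        RangeNode *next = (*head)->next;
        free(*head);
        *head = next;
    }
}
```

Each insertion walks the list once, so building a list of n ranges costs O(n^2) comparisons in the worst case. Appending to an array and sorting once would be O(n log n), which only matters if the number of ranges grows large.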
+static Aml *build_crs(PcPciInfo *pci, PciInfo *bus_info,
+                      PciRangeQ *io_ranges, PciRangeQ *mem_ranges)
+{
+    PciDeviceInfoList *dev_list;
+    PciMemoryRange range;
+    uint8_t max_bus;
+    Aml *crs;
+
+    crs = aml_resource_template();
+    max_bus = bus_info->bus;
+
+    for (dev_list = bus_info->devices; dev_list; dev_list = dev_list->next) {
+        PciMemoryRegionList *region;
+
+        for (region = dev_list->value->regions; region; region = region->next) {
+            range.base = region->value->address;
+            range.limit = region->value->address + region->value->size - 1;
+
+            if (!strcmp(region->value->type, "io")) {
+                aml_append(crs,
+                    aml_word_io(aml_min_fixed, aml_max_fixed,
+                                aml_pos_decode, aml_entire_range,
+                                0,
+                                range.base,
+                                range.limit,
+                                0,
+                                range.limit - range.base + 1));
+                pci_range_insert(io_ranges, range.base, range.limit);
+            } else { /* "memory" */
+                aml_append(crs,
+                    aml_dword_memory(aml_pos_decode, aml_min_fixed,
+                                     aml_max_fixed, aml_non_cacheable,
+                                     aml_ReadWrite,
+                                     0,
+                                     range.base,
+                                     range.limit,
+                                     0,
+                                     range.limit - range.base + 1));
+                pci_range_insert(mem_ranges, range.base, range.limit);
+            }
+        }
+
+        if (dev_list->value->has_pci_bridge) {
+            PciBridgeInfo *bridge_info = dev_list->value->pci_bridge;
+
+            if (bridge_info->bus.subordinate > max_bus) {
What's this doing?
It keeps track of the max bus number for the piix host bridge bus number range (see [1] below). The PCI root bus 0 range is [0 - <minimum bus number used by other PCI root buses>). Maybe the variable name is not so good; I am open to suggestions: min_extra_root_bus_nr?
+                max_bus = bridge_info->bus.subordinate;
+            }
+
+            range = *bridge_info->bus.io_range;
+            aml_append(crs,
+                aml_word_io(aml_min_fixed, aml_max_fixed,
+                            aml_pos_decode, aml_entire_range,
+                            0,
+                            range.base,
+                            range.limit,
+                            0,
+                            range.limit - range.base + 1));
+            pci_range_insert(io_ranges, range.base, range.limit);
+
+            range = *bridge_info->bus.memory_range;
+            aml_append(crs,
+                aml_dword_memory(aml_pos_decode, aml_min_fixed,
+                                 aml_max_fixed, aml_non_cacheable,
+                                 aml_ReadWrite,
+                                 0,
+                                 range.base,
+                                 range.limit,
+                                 0,
+                                 range.limit - range.base + 1));
+            pci_range_insert(mem_ranges, range.base, range.limit);
+
+            range = *bridge_info->bus.prefetchable_range;
+            aml_append(crs,
+                aml_dword_memory(aml_pos_decode, aml_min_fixed,
+                                 aml_max_fixed, aml_non_cacheable,
+                                 aml_ReadWrite,
+                                 0,
+                                 range.base,
+                                 range.limit,
+                                 0,
+                                 range.limit - range.base + 1));
+            pci_range_insert(mem_ranges, range.base, range.limit);
+        }
+    }
+
+    aml_append(crs,
+        aml_word_bus_number(aml_min_fixed, aml_max_fixed, aml_pos_decode,
+                            0,
+                            bus_info->bus,
+                            max_bus,
+                            0,
+                            max_bus - bus_info->bus + 1));
[1] The bus numbers range for piix host-bridge (bus 0).
Thanks, Marcel
+
+    return crs;
+}
+
 static void
 build_ssdt(GArray *table_data, GArray *linker,
            AcpiCpuInfo *cpu, AcpiPmInfo *pm, AcpiMiscInfo *misc,
@@ -737,6 +879,8 @@ build_ssdt(GArray *table_data, GArray *linker,
     uint32_t nr_mem = machine->ram_slots;
     unsigned acpi_cpus = guest_info->apic_id_limit;
     Aml *ssdt, *sb_scope, *scope, *pkg, *dev, *method, *crs, *field, *ifctx;
+    PciRangeQ io_ranges = QLIST_HEAD_INITIALIZER(io_ranges);
+    PciRangeQ mem_ranges = QLIST_HEAD_INITIALIZER(mem_ranges);
     int i;
 
     ssdt = init_aml_allocator();
@@ -773,9 +917,14 @@ build_ssdt(GArray *table_data, GArray *linker,
             aml_append(dev,
                 aml_name_decl("_BBN", aml_int((uint8_t)bus_info->bus)));
             aml_append(dev, build_prt());
+            crs = build_crs(pci, bus_info, &io_ranges, &mem_ranges);
+            aml_append(dev, aml_name_decl("_CRS", crs));
             aml_append(scope, dev);
             aml_append(ssdt, scope);
         }
+
+        pci_range_list_free(&io_ranges);
+        pci_range_list_free(&mem_ranges);
         qapi_free_PciInfoList(info_list);
     }
 
-- 
2.1.0
If multiple root busses are used, root bus 0 cannot use all the pci holes ranges. Remove the IO/mem ranges used by the other primary busses.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com
---
 hw/i386/acpi-build.c | 84 ++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 72 insertions(+), 12 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index f4d8816..44819b8 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -881,6 +881,9 @@ build_ssdt(GArray *table_data, GArray *linker,
     Aml *ssdt, *sb_scope, *scope, *pkg, *dev, *method, *crs, *field, *ifctx;
     PciRangeQ io_ranges = QLIST_HEAD_INITIALIZER(io_ranges);
     PciRangeQ mem_ranges = QLIST_HEAD_INITIALIZER(mem_ranges);
+    PciMemoryRange range;
+    PciRangeEntry *entry;
+    int root_bus_limit = 0xFF;
     int i;
 
     ssdt = init_aml_allocator();
@@ -909,6 +912,10 @@ build_ssdt(GArray *table_data, GArray *linker,
                 continue;
             }
 
+            if (bus_info->bus < root_bus_limit) {
+                root_bus_limit = bus_info->bus - 1;
+            }
+
             scope = aml_scope("\_SB");
             dev = aml_device("PC%.02X", (uint8_t)bus_info->bus);
             aml_append(dev, aml_name_decl("_UID",
@@ -923,8 +930,6 @@ build_ssdt(GArray *table_data, GArray *linker,
             aml_append(ssdt, scope);
         }
 
-        pci_range_list_free(&io_ranges);
-        pci_range_list_free(&mem_ranges);
         qapi_free_PciInfoList(info_list);
     }
 
@@ -933,26 +938,78 @@ build_ssdt(GArray *table_data, GArray *linker,
     crs = aml_resource_template();
     aml_append(crs,
         aml_word_bus_number(aml_min_fixed, aml_max_fixed, aml_pos_decode,
-                            0x0000, 0x0000, 0x00FF, 0x0000, 0x0100));
+                            0x0000, 0x0, root_bus_limit,
+                            0x0000, root_bus_limit + 1));
     aml_append(crs, aml_io(aml_decode16, 0x0CF8, 0x0CF8, 0x01, 0x08));
 
     aml_append(crs,
         aml_word_io(aml_min_fixed, aml_max_fixed,
                     aml_pos_decode, aml_entire_range,
                     0x0000, 0x0000, 0x0CF7, 0x0000, 0x0CF8));
-    aml_append(crs,
-        aml_word_io(aml_min_fixed, aml_max_fixed,
-                    aml_pos_decode, aml_entire_range,
-                    0x0000, 0x0D00, 0xFFFF, 0x0000, 0xF300));
+
+    /* prepare PCI IO ranges */
+    range.base = 0x0D00;
+    range.limit = 0xFFFF;
+    if (QLIST_EMPTY(&io_ranges)) {
+        aml_append(crs,
+            aml_word_io(aml_min_fixed, aml_max_fixed,
+                        aml_pos_decode, aml_entire_range,
+                        0x0000, range.base, range.limit,
+                        0x0000, range.limit - range.base + 1));
+    } else {
+        QLIST_FOREACH(entry, &io_ranges, entry) {
+            if (range.base < entry->base) {
+                aml_append(crs,
+                    aml_word_io(aml_min_fixed, aml_max_fixed,
+                                aml_pos_decode, aml_entire_range,
+                                0x0000, range.base, entry->base - 1,
+                                0x0000, entry->base - range.base));
+            }
+            range.base = entry->limit + 1;
+            if (!QLIST_NEXT(entry, entry)) {
+                aml_append(crs,
+                    aml_word_io(aml_min_fixed, aml_max_fixed,
+                                aml_pos_decode, aml_entire_range,
+                                0x0000, range.base, range.limit,
+                                0x0000, range.limit - range.base + 1));
+            }
+        }
+    }
+
     aml_append(crs,
         aml_dword_memory(aml_pos_decode, aml_min_fixed, aml_max_fixed,
                          aml_cacheable, aml_ReadWrite,
                          0, 0x000A0000, 0x000BFFFF, 0, 0x00020000));
-    aml_append(crs,
-        aml_dword_memory(aml_pos_decode, aml_min_fixed, aml_max_fixed,
-                         aml_non_cacheable, aml_ReadWrite,
-                         0, pci->w32.begin, pci->w32.end - 1, 0,
-                         pci->w32.end - pci->w32.begin));
+
+    /* prepare PCI memory ranges */
+    range.base = pci->w32.begin;
+    range.limit = pci->w32.end - 1;
+    if (QLIST_EMPTY(&mem_ranges)) {
+        aml_append(crs,
+            aml_dword_memory(aml_pos_decode, aml_min_fixed, aml_max_fixed,
+                             aml_non_cacheable, aml_ReadWrite,
+                             0, range.base, range.limit,
+                             0, range.limit - range.base + 1));
+    } else {
+        QLIST_FOREACH(entry, &mem_ranges, entry) {
+            if (range.base < entry->base) {
+                aml_append(crs,
+                    aml_dword_memory(aml_pos_decode, aml_min_fixed, aml_max_fixed,
+                                     aml_non_cacheable, aml_ReadWrite,
+                                     0, range.base, entry->base - 1,
+                                     0, entry->base - range.base));
+            }
+            range.base = entry->limit + 1;
+            if (!QLIST_NEXT(entry, entry)) {
+                aml_append(crs,
+                    aml_dword_memory(aml_pos_decode, aml_min_fixed, aml_max_fixed,
+                                     aml_non_cacheable, aml_ReadWrite,
+                                     0, range.base, range.limit,
+                                     0, range.limit - range.base + 1));
+            }
+        }
+    }
+
     if (pci->w64.begin) {
         aml_append(crs,
             aml_qword_memory(aml_pos_decode, aml_min_fixed, aml_max_fixed,
@@ -975,6 +1032,9 @@ build_ssdt(GArray *table_data, GArray *linker,
     aml_append(dev, aml_name_decl("_CRS", crs));
     aml_append(scope, dev);
 
+    pci_range_list_free(&io_ranges);
+    pci_range_list_free(&mem_ranges);
+
     /* reserve PCIHP resources */
     if (pm->pcihp_io_len) {
         dev = aml_device("PHPR");
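The gap computation in the hunk above can be looked at in isolation. The following is a sketch only, assuming the used ranges are already sorted by base and do not overlap each other. Range and range_subtract are hypothetical names, not QEMU functions, and the early return when a used range reaches the end of the window is a guard added here rather than something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range type; both bounds are inclusive. */
typedef struct Range {
    uint64_t base;
    uint64_t limit;
} Range;

/*
 * Write the parts of [win.base, win.limit] that are not covered by any
 * entry of `used` into `out`. `used` must be sorted by ascending base.
 * Returns the number of gaps written.
 */
static size_t range_subtract(Range win, const Range *used, size_t n_used,
                             Range *out, size_t out_cap)
{
    size_t n_out = 0;
    uint64_t cursor = win.base;   /* first address not yet accounted for */
    size_t i;

    for (i = 0; i < n_used; i++) {
        if (used[i].limit < cursor) {
            continue;             /* entirely below the cursor */
        }
        if (used[i].base > win.limit) {
            break;                /* entirely above the window */
        }
        if (used[i].base > cursor) {
            assert(n_out < out_cap);
            out[n_out].base = cursor;
            out[n_out].limit = used[i].base - 1;
            n_out++;
        }
        if (used[i].limit >= win.limit) {
            return n_out;         /* window exhausted, no trailing gap */
        }
        cursor = used[i].limit + 1;
    }

    /* Trailing gap from the cursor to the end of the window. */
    assert(n_out < out_cap);
    out[n_out].base = cursor;
    out[n_out].limit = win.limit;
    return n_out + 1;
}
```

With the IO window 0x0D00..0xFFFF and two used ranges 0xC000..0xC0FF and 0xD000..0xDFFF, this yields three gaps, which corresponds to three aml_word_io descriptors for root bus 0.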
On Sun, Mar 08, 2015 at 01:16:15PM +0200, Marcel Apfelbaum wrote:
If multiple root busses are used, root bus 0 cannot use all the pci holes ranges. Remove the IO/mem ranges used by the other primary busses.
[...]
-    aml_append(crs,
-        aml_word_io(aml_min_fixed, aml_max_fixed,
-                    aml_pos_decode, aml_entire_range,
-                    0x0000, 0x0D00, 0xFFFF, 0x0000, 0xF300));
+
+    /* prepare PCI IO ranges */
+    range.base = 0x0D00;
+    range.limit = 0xFFFF;
+    if (QLIST_EMPTY(&io_ranges)) {
+        aml_append(crs,
+            aml_word_io(aml_min_fixed, aml_max_fixed,
+                        aml_pos_decode, aml_entire_range,
+                        0x0000, range.base, range.limit,
+                        0x0000, range.limit - range.base + 1));
+    } else {
+        QLIST_FOREACH(entry, &io_ranges, entry) {
+            if (range.base < entry->base) {
+                aml_append(crs,
+                    aml_word_io(aml_min_fixed, aml_max_fixed,
+                                aml_pos_decode, aml_entire_range,
+                                0x0000, range.base, entry->base - 1,
+                                0x0000, entry->base - range.base));
+            }
+            range.base = entry->limit + 1;
+            if (!QLIST_NEXT(entry, entry)) {
+                aml_append(crs,
+                    aml_word_io(aml_min_fixed, aml_max_fixed,
+                                aml_pos_decode, aml_entire_range,
+                                0x0000, range.base, range.limit,
+                                0x0000, range.limit - range.base + 1));
+            }
+        }
+    }
If I read this correctly, it looks like a machine with two root buses and 20 devices, each with one memory range and one io range, would end up with 40 CRS ranges (ie, a CRS range for every resource). It also looks like this furthers the requirement that the guest firmware assign the PCI resources prior to QEMU being able to generate the ACPI tables.
Am I correct? If so, that doesn't sound ideal.
-Kevin
On 03/08/2015 06:13 PM, Kevin O'Connor wrote:
Hi Kevin, Thank you for your review!
If I read this correctly, it looks like a machine with two root buses and 20 devices, each with one memory range and one io range, would end up with 40 CRS ranges (ie, a CRS range for every resource).
Correct.
As Michael pointed out in another thread, the firmware is considered guest code and QEMU cannot assume anything on how the resources are assigned. This is why this solution was chosen.
However we have two things that make the situation a little better. 1. The PXB implementation includes a pci-bridge and all devices are automatically attached to the secondary bus, in this way we have one IO/MEM range per extra root bus. 2. On top of this series we can add a merge algorithm that will bring together consecutive ranges. This series does not include this optimization and it focuses on the correctness.
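The merge step mentioned in point 2 is not part of this series. A minimal sketch of what it could look like, assuming the ranges are already sorted by base; Range and range_merge_sorted are hypothetical names, not QEMU functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range type; both bounds are inclusive. */
typedef struct Range {
    uint64_t base;
    uint64_t limit;
} Range;

/*
 * Merge adjacent or overlapping ranges in place.
 * `r` must be sorted by ascending base. Returns the new element count.
 */
static size_t range_merge_sorted(Range *r, size_t n)
{
    size_t out = 0;
    size_t i;

    if (n == 0) {
        return 0;
    }
    for (i = 1; i < n; i++) {
        assert(r[i].base <= r[i].limit);
        /*
         * Mergeable when the next base is at most one past the current
         * limit. The UINT64_MAX check avoids overflow in "limit + 1".
         */
        if (r[out].limit == UINT64_MAX || r[i].base <= r[out].limit + 1) {
            if (r[i].limit > r[out].limit) {
                r[out].limit = r[i].limit;
            }
        } else {
            out++;
            r[out] = r[i];
        }
    }
    return out + 1;
}
```

Running this over the sorted IO and memory lists before emitting descriptors would collapse back-to-back device ranges into one descriptor each, at the cost of one extra pass.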
It also
looks like this furthers the requirement that the guest firmware assign the PCI resources prior to QEMU being able to generate the ACPI tables.
Am I correct? If so, that doesn't sound ideal.
You are correct; however, it is not that bad because we have the following sequence:
- Early in the boot sequence the BIOS scans the PCI buses and assigns IO/MEM ranges.
- At this moment all resources needed by QEMU are present in the configuration space.
- At the end of the boot sequence the BIOS queries the ACPI tables and *only then* the tables are computed.
I think we use that implicitly for other features, anyway, it looks like an elegant solution with no real drawbacks. (Our assumptions are safe)
Thanks, Marcel
-Kevin
On Sun, Mar 08, 2015 at 07:51:42PM +0200, Marcel Apfelbaum wrote:
On 03/08/2015 06:13 PM, Kevin O'Connor wrote:
If I read this correctly, it looks like a machine with two root buses and 20 devices, each with one memory range and one io range, would end up with 40 CRS ranges (ie, a CRS range for every resource).
Correct.
As Michael pointed out in another thread, the firmware is considered guest code and QEMU cannot assume anything on how the resources are assigned. This is why this solution was chosen.
However we have two things that make the situation a little better.
- The PXB implementation includes a pci-bridge and all devices are automatically attached to the secondary bus, in this way we have one IO/MEM range per extra root bus.
Out of curiosity, does the PXB implementation add the pci-bridge just to simplify the IO/MEM range, or are there other technical reasons for it?
- On top of this series we can add a merge algorithm that will bring together consecutive ranges. This series does not include this optimization and it focuses on the correctness.
It also
looks like this furthers the requirement that the guest firmware assign the PCI resources prior to QEMU being able to generate the ACPI tables.
Am I correct? If so, that doesn't sound ideal.
You are correct; however, it is not that bad because we have the following sequence:
- Early in the boot sequence the bios scans the PCI buses and assigns IO/MEM ranges
- At this moment all resources needed by QEMU are present in the configuration space.
- At the end of the boot sequence the BIOS queries the ACPI tables and *only then* the tables are computed.
I think we use that implicitly for other features, anyway, it looks like an elegant solution with no real drawbacks. (Our assumptions are safe)
Thank you for the clarification. I understand that it works, but I've never been that comfortable with the QEMU<->firmware dance with PCI resources. I do understand that the alternatives have as many or more problems though. So, I'm not objecting to this implementation.
Cheers, -Kevin
On 03/08/2015 08:26 PM, Kevin O'Connor wrote:
On Sun, Mar 08, 2015 at 07:51:42PM +0200, Marcel Apfelbaum wrote:
On 03/08/2015 06:13 PM, Kevin O'Connor wrote:
If I read this correctly, it looks like a machine with two root buses and 20 devices, each with one memory range and one io range, would end up with 40 CRS ranges (ie, a CRS range for every resource).
Correct.
As Michael pointed out in another thread, the firmware is considered guest code and QEMU cannot assume anything on how the resources are assigned. This is why this solution was chosen.
However we have two things that make the situation a little better.
- The PXB implementation includes a pci-bridge and all devices are automatically attached to the secondary bus, in this way we have one IO/MEM range per extra root bus.
Out of curiosity, does the PXB implementation add the pci-bridge just to simplify the IO/MEM range, or are there other technical reasons for it?
We have another elephant there :) -> pci hotplug. All the "free" memory ranges are assigned to bus 0, which would leave the pxb buses without the hotplug capability. Using a PCI bridge gives us some IO/MEM ranges for hotplug: the ones created because of the minimum requirement of the PCI spec and not currently used by any devices.
On top of this series we can add a merge algorithm that will bring together consecutive ranges. This series does not include this optimization and it focuses on the correctness.
It also
looks like this furthers the requirement that the guest firmware assign the PCI resources prior to QEMU being able to generate the ACPI tables.
Am I correct? If so, that doesn't sound ideal.
You are correct; however, it is not that bad because we have the following sequence:
- Early in the boot sequence the bios scans the PCI buses and assigns IO/MEM ranges
- At this moment all resources needed by QEMU are present in the configuration space.
- At the end of the boot sequence the BIOS queries the ACPI tables and *only then* the tables are computed.
I think we use that implicitly for other features, anyway, it looks like an elegant solution with no real drawbacks. (Our assumptions are safe)
Thank you for the clarification. I understand that it works, but I've never been that comfortable with the QEMU<->firmware dance with PCI resources. I do understand that the alternatives have as many or more problems though. So, I'm not objecting to this implementation.
No problem, thank you and your review is much appreciated as always, Marcel
Cheers, -Kevin
On Sun, Mar 08, 2015 at 12:13:40PM -0400, Kevin O'Connor wrote:
If I read this correctly, it looks like a machine with two root buses and 20 devices, each with one memory range and one io range, would end up with 40 CRS ranges (ie, a CRS range for every resource).
I think that's only if you stick multiple devices directly behind the bridge. Looks like with a single pci bridge behind root, there will only be 2 ranges.
Maybe try to enforce this sane topology?
It also looks like this furthers the requirement that the guest firmware assign the PCI resources prior to QEMU being able to generate the ACPI tables.
That seems unavoidable unless we want to assign ranges from hardware/management. Which I think would be a mistake: management doesn't really know, or care.
-Kevin
On Sun, Mar 08, 2015 at 07:34:34PM +0100, Michael S. Tsirkin wrote:
On Sun, Mar 08, 2015 at 12:13:40PM -0400, Kevin O'Connor wrote:
If I read this correctly, it looks like a machine with two root buses and 20 devices, each with one memory range and one io range, would end up with 40 CRS ranges (ie, a CRS range for every resource).
I think that's only if you stick multiple devices directly behind the bridge. Looks like with a single pci bridge behind root, there will only be 2 ranges.
Yeah, that makes sense, so doesn't seem to be a problem.
Maybe try to enforce this sane topology?
It also looks like this furthers the requirement that the guest firmware assign the PCI resources prior to QEMU being able to generate the ACPI tables.
That seems unavoidable unless we want to assign ranges from hardware/management. Which I think would be a mistake: management doesn't really know, or care.
I understand. I think what would help me is if we could document somewhere that the firmware has to assign PCI resources before querying the bios tables and that it is the *only* pre-requisite for querying them. Looking now, though, I don't see any fw_cfg documentation in the repo, so I'm not sure where that could be added.
Thanks, -Kevin
On Sun, Mar 08, 2015 at 02:46:28PM -0400, Kevin O'Connor wrote:
On Sun, Mar 08, 2015 at 07:34:34PM +0100, Michael S. Tsirkin wrote:
On Sun, Mar 08, 2015 at 12:13:40PM -0400, Kevin O'Connor wrote:
If I read this correctly, it looks like a machine with two root buses and 20 devices, each with one memory range and one io range, would end up with 40 CRS ranges (ie, a CRS range for every resource).
I think that's only if you stick multiple devices directly behind the bridge. Looks like with a single pci bridge behind root, there will only be 2 ranges.
Yeah, that makes sense, so doesn't seem to be a problem.
Maybe try to enforce this sane topology?
It also looks like this furthers the requirement that the guest firmware assign the PCI resources prior to QEMU being able to generate the ACPI tables.
That seems unavoidable unless we want to assign ranges from hardware/management. Which I think would be a mistake: management doesn't really know, or care.
I understand. I think what would help me is if we could document somewhere that the firmware has to assign PCI resources before querying the bios tables and that it is the *only* pre-requisite for querying them. Looking now, though, I don't see any fw_cfg documentation in the repo, so I'm not sure where that could be added.
Thanks, -Kevin
Sigh. Might make a GSoC project?
On Mon, Mar 09, 2015 at 09:44:24AM +0100, Michael S. Tsirkin wrote:
On Sun, Mar 08, 2015 at 02:46:28PM -0400, Kevin O'Connor wrote:
On Sun, Mar 08, 2015 at 07:34:34PM +0100, Michael S. Tsirkin wrote:
On Sun, Mar 08, 2015 at 12:13:40PM -0400, Kevin O'Connor wrote:
If I read this correctly, it looks like a machine with two root buses and 20 devices, each with one memory range and one io range, would end up with 40 CRS ranges (ie, a CRS range for every resource).
I think that's only if you stick multiple devices directly behind the bridge. Looks like with a single pci bridge behind root, there will only be 2 ranges.
Yeah, that makes sense, so doesn't seem to be a problem.
Maybe try to enforce this sane topology?
It also looks like this furthers the requirement that the guest firmware assign the PCI resources prior to QEMU being able to generate the ACPI tables.
That seems unavoidable unless we want to assign ranges from hardware/management. Which I think would be a mistake: management doesn't really know, or care.
I understand. I think what would help me is if we could document somewhere that the firmware has to assign PCI resources before querying the bios tables and that it is the *only* pre-requisite for querying them. Looking now, though, I don't see any fw_cfg documentation in the repo, so I'm not sure where that could be added.
Thanks, -Kevin
Sigh. Might make a GSoC project?
Documentation projects are not possible under Google Summer of Code:
https://www.google-melange.com/gsoc/document/show/gsoc_program/google/gsoc20...
If there is a coding project you are interested in mentoring, there is a project idea template to fill out here:
http://qemu-project.org/Google_Summer_of_Code_2015#Project_idea_template
Stefan
This refactoring moves all the code needed (recursively) to register the TYPE_PCI_BUS type to a new file, hw/pci/pci_bus.c. This makes it possible to properly add new functionality to the pci bus class.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- arch_init.c | 1 + hw/alpha/typhoon.c | 1 + hw/mips/gt64xxx_pci.c | 1 + hw/pci-host/bonito.c | 1 + hw/pci-host/grackle.c | 1 + hw/pci-host/piix.c | 1 + hw/pci-host/ppce500.c | 1 + hw/pci-host/q35.c | 1 + hw/pci-host/uninorth.c | 1 + hw/pci/Makefile.objs | 2 +- hw/pci/pci.c | 472 +-------------------------------------------- hw/pci/pci_bus.c | 491 +++++++++++++++++++++++++++++++++++++++++++++++ hw/ppc/ppc4xx_pci.c | 1 + hw/sh4/r2d.c | 1 + hw/sh4/sh_pci.c | 1 + include/hw/pci/pci.h | 3 +- include/hw/pci/pci_bus.h | 8 + 17 files changed, 514 insertions(+), 474 deletions(-) create mode 100644 hw/pci/pci_bus.c
diff --git a/arch_init.c b/arch_init.c index 691b5e2..cbebf6d 100644 --- a/arch_init.c +++ b/arch_init.c @@ -37,6 +37,7 @@ #include "audio/audio.h" #include "hw/i386/pc.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" #include "hw/audio/audio.h" #include "sysemu/kvm.h" #include "migration/migration.h" diff --git a/hw/alpha/typhoon.c b/hw/alpha/typhoon.c index 62af946..d3fad5d 100644 --- a/hw/alpha/typhoon.c +++ b/hw/alpha/typhoon.c @@ -8,6 +8,7 @@
#include "cpu.h" #include "hw/hw.h" +#include "hw/pci/pci_bus.h" #include "hw/devices.h" #include "sysemu/sysemu.h" #include "alpha_sys.h" diff --git a/hw/mips/gt64xxx_pci.c b/hw/mips/gt64xxx_pci.c index 10fcca3..c4ca6f3 100644 --- a/hw/mips/gt64xxx_pci.c +++ b/hw/mips/gt64xxx_pci.c @@ -25,6 +25,7 @@ #include "hw/hw.h" #include "hw/mips/mips.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" #include "hw/pci/pci_host.h" #include "hw/i386/pc.h" #include "exec/address-spaces.h" diff --git a/hw/pci-host/bonito.c b/hw/pci-host/bonito.c index 8bdd569..49e1122 100644 --- a/hw/pci-host/bonito.c +++ b/hw/pci-host/bonito.c @@ -41,6 +41,7 @@
#include "hw/hw.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" #include "hw/i386/pc.h" #include "hw/mips/mips.h" #include "hw/pci/pci_host.h" diff --git a/hw/pci-host/grackle.c b/hw/pci-host/grackle.c index bfe707a..89745cd 100644 --- a/hw/pci-host/grackle.c +++ b/hw/pci-host/grackle.c @@ -26,6 +26,7 @@ #include "hw/pci/pci_host.h" #include "hw/ppc/mac.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h"
/* debug Grackle */ //#define DEBUG_GRACKLE diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index 723836f..292b6e9 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -25,6 +25,7 @@ #include "hw/hw.h" #include "hw/i386/pc.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" #include "hw/pci/pci_host.h" #include "hw/isa/isa.h" #include "hw/sysbus.h" diff --git a/hw/pci-host/ppce500.c b/hw/pci-host/ppce500.c index 613ba73..4363951 100644 --- a/hw/pci-host/ppce500.c +++ b/hw/pci-host/ppce500.c @@ -17,6 +17,7 @@ #include "hw/hw.h" #include "hw/ppc/e500-ccsr.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" #include "hw/pci/pci_host.h" #include "qemu/bswap.h" #include "hw/pci-host/ppce500.h" diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c index df60e61..a42d0fb 100644 --- a/hw/pci-host/q35.c +++ b/hw/pci-host/q35.c @@ -28,6 +28,7 @@ * THE SOFTWARE. */ #include "hw/hw.h" +#include "hw/pci/pci_bus.h" #include "hw/pci-host/q35.h" #include "qapi/visitor.h"
diff --git a/hw/pci-host/uninorth.c b/hw/pci-host/uninorth.c index 53f2b59..1406b42 100644 --- a/hw/pci-host/uninorth.c +++ b/hw/pci-host/uninorth.c @@ -24,6 +24,7 @@ #include "hw/hw.h" #include "hw/ppc/mac.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" #include "hw/pci/pci_host.h"
/* debug UniNorth */ diff --git a/hw/pci/Makefile.objs b/hw/pci/Makefile.objs index 9f905e6..a05cca0 100644 --- a/hw/pci/Makefile.objs +++ b/hw/pci/Makefile.objs @@ -1,4 +1,4 @@ -common-obj-$(CONFIG_PCI) += pci.o pci_bridge.o +common-obj-$(CONFIG_PCI) += pci.o pci_bridge.o pci_bus.o common-obj-$(CONFIG_PCI) += msix.o msi.o common-obj-$(CONFIG_PCI) += shpc.o common-obj-$(CONFIG_PCI) += slotid_cap.o diff --git a/hw/pci/pci.c b/hw/pci/pci.c index cc5d946..442d209 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -45,11 +45,6 @@ # define PCI_DPRINTF(format, ...) do { } while (0) #endif
-static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent); -static char *pcibus_get_dev_path(DeviceState *dev); -static char *pcibus_get_fw_dev_path(DeviceState *dev); -static void pcibus_reset(BusState *qbus); - static Property pci_props[] = { DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), DEFINE_PROP_STRING("romfile", PCIDevice, romfile), @@ -61,58 +56,11 @@ static Property pci_props[] = { DEFINE_PROP_END_OF_LIST() };
-static const VMStateDescription vmstate_pcibus = { - .name = "PCIBUS", - .version_id = 1, - .minimum_version_id = 1, - .fields = (VMStateField[]) { - VMSTATE_INT32_EQUAL(nirq, PCIBus), - VMSTATE_VARRAY_INT32(irq_count, PCIBus, - nirq, 0, vmstate_info_int32, - int32_t), - VMSTATE_END_OF_LIST() - } -}; - -static void pci_bus_realize(BusState *qbus, Error **errp) -{ - PCIBus *bus = PCI_BUS(qbus); - - vmstate_register(NULL, -1, &vmstate_pcibus, bus); -} - -static void pci_bus_unrealize(BusState *qbus, Error **errp) -{ - PCIBus *bus = PCI_BUS(qbus); - - vmstate_unregister(NULL, &vmstate_pcibus, bus); -} - -static void pci_bus_class_init(ObjectClass *klass, void *data) -{ - BusClass *k = BUS_CLASS(klass); - - k->print_dev = pcibus_dev_print; - k->get_dev_path = pcibus_get_dev_path; - k->get_fw_dev_path = pcibus_get_fw_dev_path; - k->realize = pci_bus_realize; - k->unrealize = pci_bus_unrealize; - k->reset = pcibus_reset; -} - -static const TypeInfo pci_bus_info = { - .name = TYPE_PCI_BUS, - .parent = TYPE_BUS, - .instance_size = sizeof(PCIBus), - .class_init = pci_bus_class_init, -}; - static const TypeInfo pcie_bus_info = { .name = TYPE_PCIE_BUS, .parent = TYPE_PCI_BUS, };
-static PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num); static void pci_update_mappings(PCIDevice *d); static void pci_irq_handler(void *opaque, int irq_num, int level); static void pci_add_option_rom(PCIDevice *pdev, bool is_default_rom, Error **); @@ -185,7 +133,7 @@ void pci_device_deassert_intx(PCIDevice *dev) } }
-static void pci_do_device_reset(PCIDevice *dev) +void pci_do_device_reset(PCIDevice *dev) { int r;
@@ -230,27 +178,6 @@ void pci_device_reset(PCIDevice *dev) pci_do_device_reset(dev); }
-/* - * Trigger pci bus reset under a given bus. - * Called via qbus_reset_all on RST# assert, after the devices - * have been reset qdev_reset_all-ed already. - */ -static void pcibus_reset(BusState *qbus) -{ - PCIBus *bus = DO_UPCAST(PCIBus, qbus, qbus); - int i; - - for (i = 0; i < ARRAY_SIZE(bus->devices); ++i) { - if (bus->devices[i]) { - pci_do_device_reset(bus->devices[i]); - } - } - - for (i = 0; i < bus->nirq; i++) { - assert(bus->irq_count[i] == 0); - } -} - static void pci_host_bus_register(PCIBus *bus, DeviceState *parent) { PCIHostState *host_bridge = PCI_HOST_BRIDGE(parent); @@ -1301,74 +1228,6 @@ int pci_swizzle_map_irq_fn(PCIDevice *pci_dev, int pin) return (pin + PCI_SLOT(pci_dev->devfn)) % PCI_NUM_PINS; }
-/***********************************************************/ -/* monitor info on PCI */ - -typedef struct { - uint16_t class; - const char *desc; - const char *fw_name; - uint16_t fw_ign_bits; -} pci_class_desc; - -static const pci_class_desc pci_class_descriptions[] = -{ - { 0x0001, "VGA controller", "display"}, - { 0x0100, "SCSI controller", "scsi"}, - { 0x0101, "IDE controller", "ide"}, - { 0x0102, "Floppy controller", "fdc"}, - { 0x0103, "IPI controller", "ipi"}, - { 0x0104, "RAID controller", "raid"}, - { 0x0106, "SATA controller"}, - { 0x0107, "SAS controller"}, - { 0x0180, "Storage controller"}, - { 0x0200, "Ethernet controller", "ethernet"}, - { 0x0201, "Token Ring controller", "token-ring"}, - { 0x0202, "FDDI controller", "fddi"}, - { 0x0203, "ATM controller", "atm"}, - { 0x0280, "Network controller"}, - { 0x0300, "VGA controller", "display", 0x00ff}, - { 0x0301, "XGA controller"}, - { 0x0302, "3D controller"}, - { 0x0380, "Display controller"}, - { 0x0400, "Video controller", "video"}, - { 0x0401, "Audio controller", "sound"}, - { 0x0402, "Phone"}, - { 0x0403, "Audio controller", "sound"}, - { 0x0480, "Multimedia controller"}, - { 0x0500, "RAM controller", "memory"}, - { 0x0501, "Flash controller", "flash"}, - { 0x0580, "Memory controller"}, - { 0x0600, "Host bridge", "host"}, - { 0x0601, "ISA bridge", "isa"}, - { 0x0602, "EISA bridge", "eisa"}, - { 0x0603, "MC bridge", "mca"}, - { 0x0604, "PCI bridge", "pci-bridge"}, - { 0x0605, "PCMCIA bridge", "pcmcia"}, - { 0x0606, "NUBUS bridge", "nubus"}, - { 0x0607, "CARDBUS bridge", "cardbus"}, - { 0x0608, "RACEWAY bridge"}, - { 0x0680, "Bridge"}, - { 0x0700, "Serial port", "serial"}, - { 0x0701, "Parallel port", "parallel"}, - { 0x0800, "Interrupt controller", "interrupt-controller"}, - { 0x0801, "DMA controller", "dma-controller"}, - { 0x0802, "Timer", "timer"}, - { 0x0803, "RTC", "rtc"}, - { 0x0900, "Keyboard", "keyboard"}, - { 0x0901, "Pen", "pen"}, - { 0x0902, "Mouse", "mouse"}, - { 0x0A00, "Dock station", 
"dock", 0x00ff}, - { 0x0B00, "i386 cpu", "cpu", 0x00ff}, - { 0x0c00, "Fireware contorller", "fireware"}, - { 0x0c01, "Access bus controller", "access-bus"}, - { 0x0c02, "SSA controller", "ssa"}, - { 0x0c03, "USB controller", "usb"}, - { 0x0c04, "Fibre channel controller", "fibre-channel"}, - { 0x0c05, "SMBus"}, - { 0, NULL} -}; - static void pci_for_each_device_under_bus(PCIBus *bus, void (*fn)(PCIBus *b, PCIDevice *d, void *opaque), @@ -1396,161 +1255,6 @@ void pci_for_each_device(PCIBus *bus, int bus_num, } }
-static const pci_class_desc *get_class_desc(int class) -{ - const pci_class_desc *desc; - - desc = pci_class_descriptions; - while (desc->desc && class != desc->class) { - desc++; - } - - return desc; -} - -static PciDeviceInfoList *qmp_query_pci_devices(PCIBus *bus, int bus_num); - -static PciMemoryRegionList *qmp_query_pci_regions(const PCIDevice *dev) -{ - PciMemoryRegionList *head = NULL, *cur_item = NULL; - int i; - - for (i = 0; i < PCI_NUM_REGIONS; i++) { - const PCIIORegion *r = &dev->io_regions[i]; - PciMemoryRegionList *region; - - if (!r->size) { - continue; - } - - region = g_malloc0(sizeof(*region)); - region->value = g_malloc0(sizeof(*region->value)); - - if (r->type & PCI_BASE_ADDRESS_SPACE_IO) { - region->value->type = g_strdup("io"); - } else { - region->value->type = g_strdup("memory"); - region->value->has_prefetch = true; - region->value->prefetch = !!(r->type & PCI_BASE_ADDRESS_MEM_PREFETCH); - region->value->has_mem_type_64 = true; - region->value->mem_type_64 = !!(r->type & PCI_BASE_ADDRESS_MEM_TYPE_64); - } - - region->value->bar = i; - region->value->address = r->addr; - region->value->size = r->size; - - /* XXX: waiting for the qapi to support GSList */ - if (!cur_item) { - head = cur_item = region; - } else { - cur_item->next = region; - cur_item = region; - } - } - - return head; -} - -static PciBridgeInfo *qmp_query_pci_bridge(PCIDevice *dev, PCIBus *bus, - int bus_num) -{ - PciBridgeInfo *info; - - info = g_malloc0(sizeof(*info)); - - info->bus.number = dev->config[PCI_PRIMARY_BUS]; - info->bus.secondary = dev->config[PCI_SECONDARY_BUS]; - info->bus.subordinate = dev->config[PCI_SUBORDINATE_BUS]; - - info->bus.io_range = g_malloc0(sizeof(*info->bus.io_range)); - info->bus.io_range->base = pci_bridge_get_base(dev, PCI_BASE_ADDRESS_SPACE_IO); - info->bus.io_range->limit = pci_bridge_get_limit(dev, PCI_BASE_ADDRESS_SPACE_IO); - - info->bus.memory_range = g_malloc0(sizeof(*info->bus.memory_range)); - info->bus.memory_range->base = 
pci_bridge_get_base(dev, PCI_BASE_ADDRESS_SPACE_MEMORY); - info->bus.memory_range->limit = pci_bridge_get_limit(dev, PCI_BASE_ADDRESS_SPACE_MEMORY); - - info->bus.prefetchable_range = g_malloc0(sizeof(*info->bus.prefetchable_range)); - info->bus.prefetchable_range->base = pci_bridge_get_base(dev, PCI_BASE_ADDRESS_MEM_PREFETCH); - info->bus.prefetchable_range->limit = pci_bridge_get_limit(dev, PCI_BASE_ADDRESS_MEM_PREFETCH); - - if (dev->config[PCI_SECONDARY_BUS] != 0) { - PCIBus *child_bus = pci_find_bus_nr(bus, dev->config[PCI_SECONDARY_BUS]); - if (child_bus) { - info->has_devices = true; - info->devices = qmp_query_pci_devices(child_bus, dev->config[PCI_SECONDARY_BUS]); - } - } - - return info; -} - -static PciDeviceInfo *qmp_query_pci_device(PCIDevice *dev, PCIBus *bus, - int bus_num) -{ - const pci_class_desc *desc; - PciDeviceInfo *info; - uint8_t type; - int class; - - info = g_malloc0(sizeof(*info)); - info->bus = bus_num; - info->slot = PCI_SLOT(dev->devfn); - info->function = PCI_FUNC(dev->devfn); - - class = pci_get_word(dev->config + PCI_CLASS_DEVICE); - info->class_info.q_class = class; - desc = get_class_desc(class); - if (desc->desc) { - info->class_info.has_desc = true; - info->class_info.desc = g_strdup(desc->desc); - } - - info->id.vendor = pci_get_word(dev->config + PCI_VENDOR_ID); - info->id.device = pci_get_word(dev->config + PCI_DEVICE_ID); - info->regions = qmp_query_pci_regions(dev); - info->qdev_id = g_strdup(dev->qdev.id ? 
dev->qdev.id : ""); - - if (dev->config[PCI_INTERRUPT_PIN] != 0) { - info->has_irq = true; - info->irq = dev->config[PCI_INTERRUPT_LINE]; - } - - type = dev->config[PCI_HEADER_TYPE] & ~PCI_HEADER_TYPE_MULTI_FUNCTION; - if (type == PCI_HEADER_TYPE_BRIDGE) { - info->has_pci_bridge = true; - info->pci_bridge = qmp_query_pci_bridge(dev, bus, bus_num); - } - - return info; -} - -static PciDeviceInfoList *qmp_query_pci_devices(PCIBus *bus, int bus_num) -{ - PciDeviceInfoList *info, *head = NULL, *cur_item = NULL; - PCIDevice *dev; - int devfn; - - for (devfn = 0; devfn < ARRAY_SIZE(bus->devices); devfn++) { - dev = bus->devices[devfn]; - if (dev) { - info = g_malloc0(sizeof(*info)); - info->value = qmp_query_pci_device(dev, bus, bus_num); - - /* XXX: waiting for the qapi to support GSList */ - if (!cur_item) { - head = cur_item = info; - } else { - cur_item->next = info; - cur_item = info; - } - } - } - - return head; -} - static PciInfo *qmp_query_pci_bus(PCIBus *bus, int bus_num) { PciInfo *info = NULL; @@ -1674,50 +1378,6 @@ PCIDevice *pci_vga_init(PCIBus *bus) } }
-/* Whether a given bus number is in range of the secondary - * bus of the given bridge device. */ -static bool pci_secondary_bus_in_range(PCIDevice *dev, int bus_num) -{ - return !(pci_get_word(dev->config + PCI_BRIDGE_CONTROL) & - PCI_BRIDGE_CTL_BUS_RESET) /* Don't walk the bus if it's reset. */ && - dev->config[PCI_SECONDARY_BUS] < bus_num && - bus_num <= dev->config[PCI_SUBORDINATE_BUS]; -} - -static PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num) -{ - PCIBus *sec; - - if (!bus) { - return NULL; - } - - if (pci_bus_num(bus) == bus_num) { - return bus; - } - - /* Consider all bus numbers in range for the host pci bridge. */ - if (!pci_bus_is_root(bus) && - !pci_secondary_bus_in_range(bus->parent_dev, bus_num)) { - return NULL; - } - - /* try child bus */ - for (; bus; bus = sec) { - QLIST_FOREACH(sec, &bus->child, sibling) { - assert(!pci_bus_is_root(sec)); - if (sec->parent_dev->config[PCI_SECONDARY_BUS] == bus_num) { - return sec; - } - if (pci_secondary_bus_in_range(sec->parent_dev, bus_num)) { - break; - } - } - } - - return NULL; -} - void pci_for_each_bus_depth_first(PCIBus *bus, void *(*begin)(PCIBus *bus, void *parent_state), void (*end)(PCIBus *bus, void *state), @@ -2130,135 +1790,6 @@ uint8_t pci_find_capability(PCIDevice *pdev, uint8_t cap_id) return pci_find_capability_list(pdev, cap_id, NULL); }
-static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent) -{ - PCIDevice *d = (PCIDevice *)dev; - const pci_class_desc *desc; - char ctxt[64]; - PCIIORegion *r; - int i, class; - - class = pci_get_word(d->config + PCI_CLASS_DEVICE); - desc = pci_class_descriptions; - while (desc->desc && class != desc->class) - desc++; - if (desc->desc) { - snprintf(ctxt, sizeof(ctxt), "%s", desc->desc); - } else { - snprintf(ctxt, sizeof(ctxt), "Class %04x", class); - } - - monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, " - "pci id %04x:%04x (sub %04x:%04x)\n", - indent, "", ctxt, pci_bus_num(d->bus), - PCI_SLOT(d->devfn), PCI_FUNC(d->devfn), - pci_get_word(d->config + PCI_VENDOR_ID), - pci_get_word(d->config + PCI_DEVICE_ID), - pci_get_word(d->config + PCI_SUBSYSTEM_VENDOR_ID), - pci_get_word(d->config + PCI_SUBSYSTEM_ID)); - for (i = 0; i < PCI_NUM_REGIONS; i++) { - r = &d->io_regions[i]; - if (!r->size) - continue; - monitor_printf(mon, "%*sbar %d: %s at 0x%"FMT_PCIBUS - " [0x%"FMT_PCIBUS"]\n", - indent, "", - i, r->type & PCI_BASE_ADDRESS_SPACE_IO ? 
"i/o" : "mem", - r->addr, r->addr + r->size - 1); - } -} - -static char *pci_dev_fw_name(DeviceState *dev, char *buf, int len) -{ - PCIDevice *d = (PCIDevice *)dev; - const char *name = NULL; - const pci_class_desc *desc = pci_class_descriptions; - int class = pci_get_word(d->config + PCI_CLASS_DEVICE); - - while (desc->desc && - (class & ~desc->fw_ign_bits) != - (desc->class & ~desc->fw_ign_bits)) { - desc++; - } - - if (desc->desc) { - name = desc->fw_name; - } - - if (name) { - pstrcpy(buf, len, name); - } else { - snprintf(buf, len, "pci%04x,%04x", - pci_get_word(d->config + PCI_VENDOR_ID), - pci_get_word(d->config + PCI_DEVICE_ID)); - } - - return buf; -} - -static char *pcibus_get_fw_dev_path(DeviceState *dev) -{ - PCIDevice *d = (PCIDevice *)dev; - char path[50], name[33]; - int off; - - off = snprintf(path, sizeof(path), "%s@%x", - pci_dev_fw_name(dev, name, sizeof name), - PCI_SLOT(d->devfn)); - if (PCI_FUNC(d->devfn)) - snprintf(path + off, sizeof(path) + off, ",%x", PCI_FUNC(d->devfn)); - return g_strdup(path); -} - -static char *pcibus_get_dev_path(DeviceState *dev) -{ - PCIDevice *d = container_of(dev, PCIDevice, qdev); - PCIDevice *t; - int slot_depth; - /* Path format: Domain:00:Slot.Function:Slot.Function....:Slot.Function. - * 00 is added here to make this format compatible with - * domain:Bus:Slot.Func for systems without nested PCI bridges. - * Slot.Function list specifies the slot and function numbers for all - * devices on the path from root to the specific device. */ - const char *root_bus_path; - int root_bus_len; - char slot[] = ":SS.F"; - int slot_len = sizeof slot - 1 /* For '\0' */; - int path_len; - char *path, *p; - int s; - - root_bus_path = pci_root_bus_path(d); - root_bus_len = strlen(root_bus_path); - - /* Calculate # of slots on path between device and root. 
*/; - slot_depth = 0; - for (t = d; t; t = t->bus->parent_dev) { - ++slot_depth; - } - - path_len = root_bus_len + slot_len * slot_depth; - - /* Allocate memory, fill in the terminating null byte. */ - path = g_malloc(path_len + 1 /* For '\0' */); - path[path_len] = '\0'; - - memcpy(path, root_bus_path, root_bus_len); - - /* Fill in slot numbers. We walk up from device to root, so need to print - * them in the reverse order, last to first. */ - p = path + path_len; - for (t = d; t; t = t->bus->parent_dev) { - p -= slot_len; - s = snprintf(slot, sizeof slot, ":%02x.%x", - PCI_SLOT(t->devfn), PCI_FUNC(t->devfn)); - assert(s == slot_len); - memcpy(p, slot, slot_len); - } - - return path; -} - static int pci_qdev_find_recursive(PCIBus *bus, const char *id, PCIDevice **pdev) { @@ -2404,7 +1935,6 @@ static const TypeInfo pci_device_type_info = {
static void pci_register_types(void) { - type_register_static(&pci_bus_info); type_register_static(&pcie_bus_info); type_register_static(&pci_device_type_info); } diff --git a/hw/pci/pci_bus.c b/hw/pci/pci_bus.c new file mode 100644 index 0000000..d156194 --- /dev/null +++ b/hw/pci/pci_bus.c @@ -0,0 +1,491 @@ +/* + * PCI Bus + * + * Copyright (C) 2014 Red Hat Inc + * + * Authors: + * Marcel Apfelbaum marcel.a@redhat.com (split out from pci.c) + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ +#include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci_bridge.h" +#include "monitor/monitor.h" + +typedef struct { + uint16_t class; + const char *desc; + const char *fw_name; + uint16_t fw_ign_bits; +} pci_class_desc; + +static const pci_class_desc pci_class_descriptions[] = { + { 0x0001, "VGA controller", "display"}, + { 0x0100, "SCSI controller", "scsi"}, + { 0x0101, "IDE controller", "ide"}, + { 0x0102, "Floppy controller", "fdc"}, + { 0x0103, "IPI controller", "ipi"}, + { 0x0104, "RAID controller", "raid"}, + { 0x0106, "SATA controller"}, + { 0x0107, "SAS controller"}, + { 0x0180, "Storage controller"}, + { 0x0200, "Ethernet controller", "ethernet"}, + { 0x0201, "Token Ring controller", "token-ring"}, + { 0x0202, "FDDI controller", "fddi"}, + { 0x0203, "ATM controller", "atm"}, + { 0x0280, "Network controller"}, + { 0x0300, "VGA controller", "display", 0x00ff}, + { 0x0301, "XGA controller"}, + { 0x0302, "3D controller"}, + { 0x0380, "Display controller"}, + { 0x0400, "Video controller", "video"}, + { 0x0401, "Audio controller", "sound"}, + { 0x0402, "Phone"}, + { 0x0403, "Audio controller", "sound"}, + { 0x0480, "Multimedia controller"}, + { 0x0500, "RAM controller", "memory"}, + { 0x0501, "Flash controller", "flash"}, + { 0x0580, "Memory controller"}, + { 0x0600, "Host bridge", "host"}, + { 0x0601, "ISA bridge", "isa"}, + { 0x0602, "EISA bridge", "eisa"}, + { 
0x0603, "MC bridge", "mca"}, + { 0x0604, "PCI bridge", "pci-bridge"}, + { 0x0605, "PCMCIA bridge", "pcmcia"}, + { 0x0606, "NUBUS bridge", "nubus"}, + { 0x0607, "CARDBUS bridge", "cardbus"}, + { 0x0608, "RACEWAY bridge"}, + { 0x0680, "Bridge"}, + { 0x0700, "Serial port", "serial"}, + { 0x0701, "Parallel port", "parallel"}, + { 0x0800, "Interrupt controller", "interrupt-controller"}, + { 0x0801, "DMA controller", "dma-controller"}, + { 0x0802, "Timer", "timer"}, + { 0x0803, "RTC", "rtc"}, + { 0x0900, "Keyboard", "keyboard"}, + { 0x0901, "Pen", "pen"}, + { 0x0902, "Mouse", "mouse"}, + { 0x0A00, "Dock station", "dock", 0x00ff}, + { 0x0B00, "i386 cpu", "cpu", 0x00ff}, + { 0x0c00, "Fireware contorller", "fireware"}, + { 0x0c01, "Access bus controller", "access-bus"}, + { 0x0c02, "SSA controller", "ssa"}, + { 0x0c03, "USB controller", "usb"}, + { 0x0c04, "Fibre channel controller", "fibre-channel"}, + { 0x0c05, "SMBus"}, + { 0, NULL} +}; + +/* Whether a given bus number is in range of the secondary + * bus of the given bridge device. */ +static bool pci_secondary_bus_in_range(PCIDevice *dev, int bus_num) +{ + return !(pci_get_word(dev->config + PCI_BRIDGE_CONTROL) & + PCI_BRIDGE_CTL_BUS_RESET) /* Don't walk the bus if it's reset. */ && + dev->config[PCI_SECONDARY_BUS] < bus_num && + bus_num <= dev->config[PCI_SUBORDINATE_BUS]; +} + +PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num) +{ + PCIBus *sec; + + if (!bus) { + return NULL; + } + + if (pci_bus_num(bus) == bus_num) { + return bus; + } + + /* Consider all bus numbers in range for the host pci bridge. 
*/ + if (!pci_bus_is_root(bus) && + !pci_secondary_bus_in_range(bus->parent_dev, bus_num)) { + return NULL; + } + + /* try child bus */ + for (; bus; bus = sec) { + QLIST_FOREACH(sec, &bus->child, sibling) { + assert(!pci_bus_is_root(sec)); + if (sec->parent_dev->config[PCI_SECONDARY_BUS] == bus_num) { + return sec; + } + if (pci_secondary_bus_in_range(sec->parent_dev, bus_num)) { + break; + } + } + } + + return NULL; +} + +static const pci_class_desc *get_class_desc(int class) +{ + const pci_class_desc *desc; + + desc = pci_class_descriptions; + while (desc->desc && class != desc->class) { + desc++; + } + + return desc; +} + +static PciMemoryRegionList *qmp_query_pci_regions(const PCIDevice *dev) +{ + PciMemoryRegionList *head = NULL, *cur_item = NULL; + int i; + + for (i = 0; i < PCI_NUM_REGIONS; i++) { + const PCIIORegion *r = &dev->io_regions[i]; + PciMemoryRegionList *region; + + if (!r->size) { + continue; + } + + region = g_malloc0(sizeof(*region)); + region->value = g_malloc0(sizeof(*region->value)); + + if (r->type & PCI_BASE_ADDRESS_SPACE_IO) { + region->value->type = g_strdup("io"); + } else { + region->value->type = g_strdup("memory"); + region->value->has_prefetch = true; + region->value->prefetch = !!(r->type & PCI_BASE_ADDRESS_MEM_PREFETCH); + region->value->has_mem_type_64 = true; + region->value->mem_type_64 = !!(r->type & PCI_BASE_ADDRESS_MEM_TYPE_64); + } + + region->value->bar = i; + region->value->address = r->addr; + region->value->size = r->size; + + /* XXX: waiting for the qapi to support GSList */ + if (!cur_item) { + head = cur_item = region; + } else { + cur_item->next = region; + cur_item = region; + } + } + + return head; +} + +static PciBridgeInfo *qmp_query_pci_bridge(PCIDevice *dev, PCIBus *bus, + int bus_num) +{ + PciBridgeInfo *info; + + info = g_malloc0(sizeof(*info)); + + info->bus.number = dev->config[PCI_PRIMARY_BUS]; + info->bus.secondary = dev->config[PCI_SECONDARY_BUS]; + info->bus.subordinate = 
dev->config[PCI_SUBORDINATE_BUS]; + + info->bus.io_range = g_malloc0(sizeof(*info->bus.io_range)); + info->bus.io_range->base = pci_bridge_get_base(dev, + PCI_BASE_ADDRESS_SPACE_IO); + info->bus.io_range->limit = + pci_bridge_get_limit(dev, PCI_BASE_ADDRESS_SPACE_IO); + + info->bus.memory_range = g_malloc0(sizeof(*info->bus.memory_range)); + info->bus.memory_range->base = + pci_bridge_get_base(dev, PCI_BASE_ADDRESS_SPACE_MEMORY); + info->bus.memory_range->limit = + pci_bridge_get_limit(dev, PCI_BASE_ADDRESS_SPACE_MEMORY); + + info->bus.prefetchable_range = + g_malloc0(sizeof(*info->bus.prefetchable_range)); + info->bus.prefetchable_range->base = + pci_bridge_get_base(dev, PCI_BASE_ADDRESS_MEM_PREFETCH); + info->bus.prefetchable_range->limit = + pci_bridge_get_limit(dev, PCI_BASE_ADDRESS_MEM_PREFETCH); + + if (dev->config[PCI_SECONDARY_BUS] != 0) { + PCIBus *child_bus = pci_find_bus_nr(bus, dev->config[PCI_SECONDARY_BUS]); + if (child_bus) { + info->has_devices = true; + info->devices = qmp_query_pci_devices(child_bus, + dev->config[PCI_SECONDARY_BUS]); + } + } + + return info; +} + +static PciDeviceInfo *qmp_query_pci_device(PCIDevice *dev, PCIBus *bus, + int bus_num) +{ + const pci_class_desc *desc; + PciDeviceInfo *info; + uint8_t type; + int class; + + info = g_malloc0(sizeof(*info)); + info->bus = bus_num; + info->slot = PCI_SLOT(dev->devfn); + info->function = PCI_FUNC(dev->devfn); + + class = pci_get_word(dev->config + PCI_CLASS_DEVICE); + info->class_info.q_class = class; + desc = get_class_desc(class); + if (desc->desc) { + info->class_info.has_desc = true; + info->class_info.desc = g_strdup(desc->desc); + } + + info->id.vendor = pci_get_word(dev->config + PCI_VENDOR_ID); + info->id.device = pci_get_word(dev->config + PCI_DEVICE_ID); + info->regions = qmp_query_pci_regions(dev); + info->qdev_id = g_strdup(dev->qdev.id ? 
dev->qdev.id : ""); + + if (dev->config[PCI_INTERRUPT_PIN] != 0) { + info->has_irq = true; + info->irq = dev->config[PCI_INTERRUPT_LINE]; + } + + type = dev->config[PCI_HEADER_TYPE] & ~PCI_HEADER_TYPE_MULTI_FUNCTION; + if (type == PCI_HEADER_TYPE_BRIDGE) { + info->has_pci_bridge = true; + info->pci_bridge = qmp_query_pci_bridge(dev, bus, bus_num); + } + + return info; +} + +PciDeviceInfoList *qmp_query_pci_devices(PCIBus *bus, int bus_num) +{ + PciDeviceInfoList *info, *head = NULL, *cur_item = NULL; + PCIDevice *dev; + int devfn; + + for (devfn = 0; devfn < ARRAY_SIZE(bus->devices); devfn++) { + dev = bus->devices[devfn]; + if (dev) { + info = g_malloc0(sizeof(*info)); + info->value = qmp_query_pci_device(dev, bus, bus_num); + + /* XXX: waiting for the qapi to support GSList */ + if (!cur_item) { + head = cur_item = info; + } else { + cur_item->next = info; + cur_item = info; + } + } + } + + return head; +} + + +static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent) +{ + PCIDevice *d = (PCIDevice *)dev; + const pci_class_desc *desc; + char ctxt[64]; + PCIIORegion *r; + int i, class; + + class = pci_get_word(d->config + PCI_CLASS_DEVICE); + desc = pci_class_descriptions; + while (desc->desc && class != desc->class) { + desc++; + } + if (desc->desc) { + snprintf(ctxt, sizeof(ctxt), "%s", desc->desc); + } else { + snprintf(ctxt, sizeof(ctxt), "Class %04x", class); + } + + monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, " + "pci id %04x:%04x (sub %04x:%04x)\n", + indent, "", ctxt, pci_bus_num(d->bus), + PCI_SLOT(d->devfn), PCI_FUNC(d->devfn), + pci_get_word(d->config + PCI_VENDOR_ID), + pci_get_word(d->config + PCI_DEVICE_ID), + pci_get_word(d->config + PCI_SUBSYSTEM_VENDOR_ID), + pci_get_word(d->config + PCI_SUBSYSTEM_ID)); + for (i = 0; i < PCI_NUM_REGIONS; i++) { + r = &d->io_regions[i]; + if (!r->size) { + continue; + } + monitor_printf(mon, "%*sbar %d: %s at 0x%"FMT_PCIBUS + " [0x%"FMT_PCIBUS"]\n", + indent, "", + i, r->type & 
PCI_BASE_ADDRESS_SPACE_IO ? "i/o" : "mem", + r->addr, r->addr + r->size - 1); + } +} + +static char *pcibus_get_dev_path(DeviceState *dev) +{ + PCIDevice *d = container_of(dev, PCIDevice, qdev); + PCIDevice *t; + int slot_depth; + /* Path format: Domain:00:Slot.Function:Slot.Function....:Slot.Function. + * 00 is added here to make this format compatible with + * domain:Bus:Slot.Func for systems without nested PCI bridges. + * Slot.Function list specifies the slot and function numbers for all + * devices on the path from root to the specific device. */ + const char *root_bus_path; + int root_bus_len; + char slot[] = ":SS.F"; + int slot_len = sizeof slot - 1 /* For '\0' */; + int path_len; + char *path, *p; + int s; + + root_bus_path = pci_root_bus_path(d); + root_bus_len = strlen(root_bus_path); + + /* Calculate # of slots on path between device and root. */; + slot_depth = 0; + for (t = d; t; t = t->bus->parent_dev) { + ++slot_depth; + } + + path_len = root_bus_len + slot_len * slot_depth; + + /* Allocate memory, fill in the terminating null byte. */ + path = g_malloc(path_len + 1 /* For '\0' */); + path[path_len] = '\0'; + + memcpy(path, root_bus_path, root_bus_len); + + /* Fill in slot numbers. We walk up from device to root, so need to print + * them in the reverse order, last to first. 
*/ + p = path + path_len; + for (t = d; t; t = t->bus->parent_dev) { + p -= slot_len; + s = snprintf(slot, sizeof slot, ":%02x.%x", + PCI_SLOT(t->devfn), PCI_FUNC(t->devfn)); + assert(s == slot_len); + memcpy(p, slot, slot_len); + } + + return path; +} + +static char *pci_dev_fw_name(DeviceState *dev, char *buf, int len) +{ + PCIDevice *d = (PCIDevice *)dev; + const char *name = NULL; + const pci_class_desc *desc = pci_class_descriptions; + int class = pci_get_word(d->config + PCI_CLASS_DEVICE); + + while (desc->desc && + (class & ~desc->fw_ign_bits) != + (desc->class & ~desc->fw_ign_bits)) { + desc++; + } + + if (desc->desc) { + name = desc->fw_name; + } + + if (name) { + pstrcpy(buf, len, name); + } else { + snprintf(buf, len, "pci%04x,%04x", + pci_get_word(d->config + PCI_VENDOR_ID), + pci_get_word(d->config + PCI_DEVICE_ID)); + } + + return buf; +} + +static char *pcibus_get_fw_dev_path(DeviceState *dev) +{ + PCIDevice *d = (PCIDevice *)dev; + char path[50], name[33]; + int off; + + off = snprintf(path, sizeof(path), "%s@%x", + pci_dev_fw_name(dev, name, sizeof name), + PCI_SLOT(d->devfn)); + if (PCI_FUNC(d->devfn)) { + snprintf(path + off, sizeof(path) + off, ",%x", PCI_FUNC(d->devfn)); + } + return g_strdup(path); +} + +static const VMStateDescription vmstate_pcibus = { + .name = "PCIBUS", + .version_id = 1, + .minimum_version_id = 1, + .fields = (VMStateField[]) { + VMSTATE_INT32_EQUAL(nirq, PCIBus), + VMSTATE_VARRAY_INT32(irq_count, PCIBus, + nirq, 0, vmstate_info_int32, + int32_t), + VMSTATE_END_OF_LIST() + } +}; + +static void pci_bus_realize(BusState *qbus, Error **errp) +{ + PCIBus *bus = PCI_BUS(qbus); + + vmstate_register(NULL, -1, &vmstate_pcibus, bus); +} + +static void pci_bus_unrealize(BusState *qbus, Error **errp) +{ + PCIBus *bus = PCI_BUS(qbus); + + vmstate_unregister(NULL, &vmstate_pcibus, bus); +} + +/* + * Trigger pci bus reset under a given bus. 
+ * Called via qbus_reset_all on RST# assert, after the devices + * have been reset qdev_reset_all-ed already. + */ +static void pcibus_reset(BusState *qbus) +{ + PCIBus *bus = DO_UPCAST(PCIBus, qbus, qbus); + int i; + + for (i = 0; i < ARRAY_SIZE(bus->devices); ++i) { + if (bus->devices[i]) { + pci_do_device_reset(bus->devices[i]); + } + } + + for (i = 0; i < bus->nirq; i++) { + assert(bus->irq_count[i] == 0); + } +} + +static void pci_bus_class_init(ObjectClass *klass, void *data) +{ + BusClass *k = BUS_CLASS(klass); + + k->print_dev = pcibus_dev_print; + k->get_dev_path = pcibus_get_dev_path; + k->get_fw_dev_path = pcibus_get_fw_dev_path; + k->realize = pci_bus_realize; + k->unrealize = pci_bus_unrealize; + k->reset = pcibus_reset; +} + +static const TypeInfo pci_bus_info = { + .name = TYPE_PCI_BUS, + .parent = TYPE_BUS, + .instance_size = sizeof(PCIBus), + .class_init = pci_bus_class_init, +}; + +static void pci_bus_register_types(void) +{ + type_register_static(&pci_bus_info); +} + +type_init(pci_bus_register_types) diff --git a/hw/ppc/ppc4xx_pci.c b/hw/ppc/ppc4xx_pci.c index 0bb3cdb..f5847bc 100644 --- a/hw/ppc/ppc4xx_pci.c +++ b/hw/ppc/ppc4xx_pci.c @@ -23,6 +23,7 @@ #include "hw/ppc/ppc.h" #include "hw/ppc/ppc4xx.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" #include "hw/pci/pci_host.h" #include "exec/address-spaces.h"
diff --git a/hw/sh4/r2d.c b/hw/sh4/r2d.c index d1d0847..8dd2ce3 100644 --- a/hw/sh4/r2d.c +++ b/hw/sh4/r2d.c @@ -30,6 +30,7 @@ #include "sysemu/sysemu.h" #include "hw/boards.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" #include "net/net.h" #include "sh7750_regs.h" #include "hw/ide.h" diff --git a/hw/sh4/sh_pci.c b/hw/sh4/sh_pci.c index a2f6d9e..f02e998 100644 --- a/hw/sh4/sh_pci.c +++ b/hw/sh4/sh_pci.c @@ -24,6 +24,7 @@ #include "hw/sysbus.h" #include "hw/sh4/sh.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" #include "hw/pci/pci_host.h" #include "qemu/bswap.h" #include "exec/address-spaces.h" diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index be2d9b8..47077cd 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -337,8 +337,6 @@ typedef void (*pci_set_irq_fn)(void *opaque, int irq_num, int level); typedef int (*pci_map_irq_fn)(PCIDevice *pci_dev, int irq_num); typedef PCIINTxRoute (*pci_route_irq_fn)(void *opaque, int pin);
-#define TYPE_PCI_BUS "PCI" -#define PCI_BUS(obj) OBJECT_CHECK(PCIBus, (obj), TYPE_PCI_BUS) #define TYPE_PCIE_BUS "PCIE"
bool pci_bus_is_express(PCIBus *bus); @@ -370,6 +368,7 @@ void pci_bus_fire_intx_routing_notifier(PCIBus *bus); void pci_device_set_intx_routing_notifier(PCIDevice *dev, PCIINTxRoutingNotifier notifier); void pci_device_reset(PCIDevice *dev); +void pci_do_device_reset(PCIDevice *dev);
PCIDevice *pci_nic_init_nofail(NICInfo *nd, PCIBus *rootbus, const char *default_model, diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h index fabaeee..ea427a3 100644 --- a/include/hw/pci/pci_bus.h +++ b/include/hw/pci/pci_bus.h @@ -1,6 +1,8 @@ #ifndef QEMU_PCI_BUS_H #define QEMU_PCI_BUS_H
+#include "hw/pci/pci.h" + /* * PCI Bus and Bridge datastructures. * @@ -8,6 +10,12 @@ * use accessor functions in pci.h, pci_bridge.h */
+PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num); +PciDeviceInfoList *qmp_query_pci_devices(PCIBus *bus, int bus_num); + +#define TYPE_PCI_BUS "PCI" +#define PCI_BUS(obj) OBJECT_CHECK(PCIBus, (obj), TYPE_PCI_BUS) + struct PCIBus { BusState qbus; PCIIOMMUFunc iommu_fn;
From: Marcel Apfelbaum marcel.a@redhat.com
Refactoring pci_bus_is_root as a method of PCIBusClass allows subclasses to provide their own implementations.
Remove the assumption that the root bus has no parent device, because it holds only for the default class implementation.
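For readers unfamiliar with the pattern, here is a minimal standalone sketch of the idea, not the QEMU code itself. The Sketch* types and the "extra root" class are hypothetical stand-ins, assumed only for illustration. It shows why the root-bus walk has to ask the class instead of testing parent_dev directly: a subclass may report a bus as root even though a device sits above it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the QEMU types; NOT the real definitions. */
typedef struct SketchBus SketchBus;
typedef struct SketchDev SketchDev;

typedef struct SketchBusClass {
    bool (*is_root)(SketchBus *bus);    /* overridable per bus class */
} SketchBusClass;

struct SketchDev {
    SketchBus *bus;                     /* bus the device is plugged into */
};

struct SketchBus {
    const SketchBusClass *klass;
    SketchDev *parent_dev;              /* device above this bus, if any */
};

/* Default behaviour: a bus is a root bus iff it has no parent device. */
static bool default_is_root(SketchBus *bus)
{
    return !bus->parent_dev;
}

/* Hypothetical subclass: always a root bus, even with a parent device. */
static bool extra_root_is_root(SketchBus *bus)
{
    (void)bus;
    return true;
}

static const SketchBusClass default_class = { .is_root = default_is_root };
static const SketchBusClass extra_root_class = { .is_root = extra_root_is_root };

/* Dispatch through the class, as the inline wrapper in the patch does. */
static bool sketch_bus_is_root(SketchBus *bus)
{
    return bus->klass->is_root(bus);
}

/* Walk up until the class reports a root bus. */
static SketchBus *sketch_device_root_bus(const SketchDev *d)
{
    SketchBus *bus = d->bus;

    while (!sketch_bus_is_root(bus)) {
        d = bus->parent_dev;
        assert(d != NULL);              /* a non-root bus must have a parent */
        bus = d->bus;
    }
    return bus;
}
```

With the default class the walk ends at the bus without a parent device; with the hypothetical subclass it ends at that bus itself, which is the case the removed assert in pci_root_bus_path would have rejected.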
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/pci/pci.c | 11 ++++------- hw/pci/pci_bus.c | 9 +++++++++ hw/vfio/pci.c | 1 + include/hw/pci/pci.h | 1 - include/hw/pci/pci_bus.h | 15 +++++++++++++++ 5 files changed, 29 insertions(+), 8 deletions(-)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 442d209..196989f 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -205,7 +205,10 @@ PCIBus *pci_device_root_bus(const PCIDevice *d) { PCIBus *bus = d->bus;
- while ((d = bus->parent_dev) != NULL) { + while (!pci_bus_is_root(bus)) { + d = bus->parent_dev; + assert(d != NULL); + bus = d->bus; }
@@ -218,7 +221,6 @@ const char *pci_root_bus_path(PCIDevice *dev) PCIHostState *host_bridge = PCI_HOST_BRIDGE(rootbus->qbus.parent); PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_GET_CLASS(host_bridge);
- assert(!rootbus->parent_dev); assert(host_bridge->bus == rootbus);
if (hc->root_bus_path) { @@ -250,11 +252,6 @@ bool pci_bus_is_express(PCIBus *bus) return object_dynamic_cast(OBJECT(bus), TYPE_PCIE_BUS); }
-bool pci_bus_is_root(PCIBus *bus) -{ - return !bus->parent_dev; -} - void pci_bus_new_inplace(PCIBus *bus, size_t bus_size, DeviceState *parent, const char *name, MemoryRegion *address_space_mem, diff --git a/hw/pci/pci_bus.c b/hw/pci/pci_bus.c index d156194..0922a75 100644 --- a/hw/pci/pci_bus.c +++ b/hw/pci/pci_bus.c @@ -464,9 +464,15 @@ static void pcibus_reset(BusState *qbus) } }
+static bool pcibus_is_root(PCIBus *bus) +{ + return !bus->parent_dev; +} + static void pci_bus_class_init(ObjectClass *klass, void *data) { BusClass *k = BUS_CLASS(klass); + PCIBusClass *pbc = PCI_BUS_CLASS(klass);
k->print_dev = pcibus_dev_print; k->get_dev_path = pcibus_get_dev_path; @@ -474,11 +480,14 @@ static void pci_bus_class_init(ObjectClass *klass, void *data) k->realize = pci_bus_realize; k->unrealize = pci_bus_unrealize; k->reset = pcibus_reset; + + pbc->is_root = pcibus_is_root; }
static const TypeInfo pci_bus_info = { .name = TYPE_PCI_BUS, .parent = TYPE_BUS, + .class_size = sizeof(PCIBusClass), .instance_size = sizeof(PCIBus), .class_init = pci_bus_class_init, }; diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 84e9d99..e8057db 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -32,6 +32,7 @@ #include "hw/pci/msi.h" #include "hw/pci/msix.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" #include "qemu-common.h" #include "qemu/error-report.h" #include "qemu/event_notifier.h" diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index 47077cd..ae2c4a5 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -340,7 +340,6 @@ typedef PCIINTxRoute (*pci_route_irq_fn)(void *opaque, int pin); #define TYPE_PCIE_BUS "PCIE"
bool pci_bus_is_express(PCIBus *bus); -bool pci_bus_is_root(PCIBus *bus); void pci_bus_new_inplace(PCIBus *bus, size_t bus_size, DeviceState *parent, const char *name, MemoryRegion *address_space_mem, diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h index ea427a3..306ef10 100644 --- a/include/hw/pci/pci_bus.h +++ b/include/hw/pci/pci_bus.h @@ -15,6 +15,16 @@ PciDeviceInfoList *qmp_query_pci_devices(PCIBus *bus, int bus_num);
#define TYPE_PCI_BUS "PCI" #define PCI_BUS(obj) OBJECT_CHECK(PCIBus, (obj), TYPE_PCI_BUS) +#define PCI_BUS_CLASS(klass) OBJECT_CLASS_CHECK(PCIBusClass, (klass), TYPE_PCI_BUS) +#define PCI_BUS_GET_CLASS(obj) OBJECT_GET_CLASS(PCIBusClass, (obj), TYPE_PCI_BUS) + +typedef struct PCIBusClass { + /*< private >*/ + BusClass parent_class; + /*< public >*/ + + bool (*is_root)(PCIBus *bus); +} PCIBusClass;
struct PCIBus { BusState qbus; @@ -39,6 +49,11 @@ struct PCIBus { int *irq_count; };
+static inline bool pci_bus_is_root(PCIBus *bus) +{ + return PCI_BUS_GET_CLASS(bus)->is_root(bus); +} + typedef struct PCIBridgeWindows PCIBridgeWindows;
/*
From: Marcel Apfelbaum marcel.a@redhat.com
Refactoring pci_bus_num as a method of PCIBusClass allows subclasses to provide their own implementations.
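A minimal standalone sketch of what this enables, under stated assumptions: the Num* types, the fixed_nr field and the "fixed number" class are hypothetical and exist only for this illustration. The default method mirrors the logic moved by this patch (0 for a root bus, otherwise the parent bridge's secondary bus register), while a subclass could return a number chosen some other way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the secondary bus number in a PCI-to-PCI bridge header. */
#define SKETCH_SECONDARY_BUS 0x19

typedef struct NumBus NumBus;

typedef struct NumDev {
    uint8_t config[256];            /* simplified config space */
} NumDev;

typedef struct NumBusClass {
    int (*bus_num)(NumBus *bus);    /* overridable per bus class */
} NumBusClass;

struct NumBus {
    const NumBusClass *klass;
    NumDev *parent_dev;             /* bridge above this bus, if any */
    int fixed_nr;                   /* hypothetical; used only by the subclass */
};

/* Default behaviour, mirroring pcibus_num in the patch. */
static int plain_bus_num(NumBus *bus)
{
    if (!bus->parent_dev) {
        return 0;                   /* pci host bridge */
    }
    return bus->parent_dev->config[SKETCH_SECONDARY_BUS];
}

/* Hypothetical subclass: the bus number is configured, not read from a bridge. */
static int fixed_bus_num(NumBus *bus)
{
    return bus->fixed_nr;
}

static const NumBusClass plain_class = { .bus_num = plain_bus_num };
static const NumBusClass fixed_class = { .bus_num = fixed_bus_num };

/* Dispatch through the class, as the inline wrapper in the patch does. */
static int sketch_bus_num(NumBus *bus)
{
    assert(bus != NULL);
    return bus->klass->bus_num(bus);
}
```

Callers keep using a single accessor and do not need to know which class the bus belongs to.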
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/i386/kvm/pci-assign.c | 1 + hw/pci/pci.c | 7 ------- hw/pci/pci_bus.c | 10 ++++++++++ hw/scsi/megasas.c | 1 + hw/xen/xen_pt.c | 1 + include/hw/pci/pci.h | 1 - include/hw/pci/pci_bus.h | 6 ++++++ 7 files changed, 19 insertions(+), 8 deletions(-)
diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c index 9db7c77..ad573ec 100644 --- a/hw/i386/kvm/pci-assign.c +++ b/hw/i386/kvm/pci-assign.c @@ -35,6 +35,7 @@ #include "qemu/range.h" #include "sysemu/sysemu.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" #include "hw/pci/msi.h" #include "kvm_i386.h"
diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 196989f..e386f2c 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -301,13 +301,6 @@ PCIBus *pci_register_bus(DeviceState *parent, const char *name, return bus; }
-int pci_bus_num(PCIBus *s) -{ - if (pci_bus_is_root(s)) - return 0; /* pci host bridge */ - return s->parent_dev->config[PCI_SECONDARY_BUS]; -} - static int get_pci_config_device(QEMUFile *f, void *pv, size_t size) { PCIDevice *s = container_of(pv, PCIDevice, config); diff --git a/hw/pci/pci_bus.c b/hw/pci/pci_bus.c index 0922a75..ed99208 100644 --- a/hw/pci/pci_bus.c +++ b/hw/pci/pci_bus.c @@ -469,6 +469,15 @@ static bool pcibus_is_root(PCIBus *bus) return !bus->parent_dev; }
+static int pcibus_num(PCIBus *bus) +{ + if (pcibus_is_root(bus)) { + return 0; /* pci host bridge */ + } + + return bus->parent_dev->config[PCI_SECONDARY_BUS]; +} + static void pci_bus_class_init(ObjectClass *klass, void *data) { BusClass *k = BUS_CLASS(klass); @@ -482,6 +491,7 @@ static void pci_bus_class_init(ObjectClass *klass, void *data) k->reset = pcibus_reset;
pbc->is_root = pcibus_is_root; + pbc->bus_num = pcibus_num; }
static const TypeInfo pci_bus_info = { diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c index 4852237..fa4e3d0 100644 --- a/hw/scsi/megasas.c +++ b/hw/scsi/megasas.c @@ -20,6 +20,7 @@
#include "hw/hw.h" #include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" #include "sysemu/dma.h" #include "sysemu/block-backend.h" #include "hw/pci/msi.h" diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c index f2893b2..cf56a48 100644 --- a/hw/xen/xen_pt.c +++ b/hw/xen/xen_pt.c @@ -55,6 +55,7 @@ #include <sys/ioctl.h>
#include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" #include "hw/xen/xen.h" #include "hw/xen/xen_backend.h" #include "xen_pt.h" diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index ae2c4a5..a69cf94 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -375,7 +375,6 @@ PCIDevice *pci_nic_init_nofail(NICInfo *nd, PCIBus *rootbus,
PCIDevice *pci_vga_init(PCIBus *bus);
-int pci_bus_num(PCIBus *s); void pci_for_each_device(PCIBus *bus, int bus_num, void (*fn)(PCIBus *bus, PCIDevice *d, void *opaque), void *opaque); diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h index 306ef10..553814e 100644 --- a/include/hw/pci/pci_bus.h +++ b/include/hw/pci/pci_bus.h @@ -24,6 +24,7 @@ typedef struct PCIBusClass { /*< public >*/
bool (*is_root)(PCIBus *bus); + int (*bus_num)(PCIBus *bus); } PCIBusClass;
struct PCIBus { @@ -54,6 +55,11 @@ static inline bool pci_bus_is_root(PCIBus *bus) return PCI_BUS_GET_CLASS(bus)->is_root(bus); }
+static inline int pci_bus_num(PCIBus *bus) +{ + return PCI_BUS_GET_CLASS(bus)->bus_num(bus); +} + typedef struct PCIBridgeWindows PCIBridgeWindows;
/*
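For readers unfamiliar with the pattern, the per-class bus_num method introduced above can be pictured with a small standalone sketch. It does not use QOM; the Bus and BusClass types, the two class tables and the bus_num() wrapper are hypothetical stand-ins. Only the 0x19 offset of the secondary bus number register is taken from the PCI bridge config layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the QOM types; not QEMU's layout. */
typedef struct Bus Bus;

typedef struct BusClass {
    bool (*is_root)(Bus *bus);
    int (*bus_num)(Bus *bus);
} BusClass;

struct Bus {
    const BusClass *klass;     /* per-type method table */
    const uint8_t *parent_cfg; /* bridge config space, NULL for a root bus */
    int fixed_nr;              /* only used by the expander-style subclass */
};

#define SECONDARY_BUS_OFFSET 0x19 /* secondary bus number register */

/* Default class: a root bus is bus 0, any other bus asks its bridge. */
static bool default_is_root(Bus *bus)
{
    return bus->parent_cfg == NULL;
}

static int default_bus_num(Bus *bus)
{
    if (default_is_root(bus)) {
        return 0;
    }
    return bus->parent_cfg[SECONDARY_BUS_OFFSET];
}

/* Subclass: always a root bus, numbered by a user-supplied value. */
static bool expander_is_root(Bus *bus)
{
    (void)bus;
    return true;
}

static int expander_bus_num(Bus *bus)
{
    return bus->fixed_nr;
}

const BusClass default_class = { default_is_root, default_bus_num };
const BusClass expander_class = { expander_is_root, expander_bus_num };

/* Callers dispatch through the class, like the inline wrapper in the patch. */
int bus_num(Bus *bus)
{
    assert(bus && bus->klass && bus->klass->bus_num);
    return bus->klass->bus_num(bus);
}
```

With this shape, a subclass changes the numbering rule by installing a different function in its class table, and no caller of bus_num() needs to change.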
From: Marcel Apfelbaum marcel.a@redhat.com
This is a marker interface used to differentiate the "default" host bridge on a system with multiple host bridges. This differentiation is required only for pc machines for now by the ACPI subsystem.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/i386/acpi-build.c | 9 ++++++--- hw/pci-host/piix.c | 5 +++++ hw/pci-host/q35.c | 4 ++++ hw/pci/pci_host.c | 6 ++++++ include/hw/pci/pci_host.h | 7 +++++++ 5 files changed, 28 insertions(+), 3 deletions(-)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 44819b8..7cd011d 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -251,7 +251,8 @@ static void acpi_get_pci_info(PcPciInfo *info) Object *pci_host; bool ambiguous;
- pci_host = object_resolve_path_type("", TYPE_PCI_HOST_BRIDGE, &ambiguous); + pci_host = object_resolve_path_type("", TYPE_PCI_MAIN_HOST_BRIDGE, + &ambiguous); g_assert(!ambiguous); g_assert(pci_host);
@@ -1311,7 +1312,8 @@ build_ssdt(GArray *table_data, GArray *linker, PCIBus *bus = NULL; bool ambiguous;
- pci_host = object_resolve_path_type("", TYPE_PCI_HOST_BRIDGE, &ambiguous); + pci_host = object_resolve_path_type("", TYPE_PCI_MAIN_HOST_BRIDGE, + &ambiguous); if (!ambiguous && pci_host) { bus = PCI_HOST_BRIDGE(pci_host)->bus; } @@ -1656,7 +1658,8 @@ static bool acpi_get_mcfg(AcpiMcfgInfo *mcfg) QObject *o; bool ambiguous;
- pci_host = object_resolve_path_type("", TYPE_PCI_HOST_BRIDGE, &ambiguous); + pci_host = object_resolve_path_type("", TYPE_PCI_MAIN_HOST_BRIDGE, + &ambiguous); g_assert(!ambiguous); g_assert(pci_host);
diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index 292b6e9..0033ab4 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -766,6 +766,11 @@ static const TypeInfo i440fx_pcihost_info = { .instance_size = sizeof(I440FXState), .instance_init = i440fx_pcihost_initfn, .class_init = i440fx_pcihost_class_init, + .interfaces = (InterfaceInfo[]) { + { TYPE_PCI_MAIN_HOST_BRIDGE }, + { } + } + };
static void i440fx_register_types(void) diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c index a42d0fb..70a86af 100644 --- a/hw/pci-host/q35.c +++ b/hw/pci-host/q35.c @@ -193,6 +193,10 @@ static const TypeInfo q35_host_info = { .instance_size = sizeof(Q35PCIHost), .instance_init = q35_host_initfn, .class_init = q35_host_class_init, + .interfaces = (InterfaceInfo[]) { + { TYPE_PCI_MAIN_HOST_BRIDGE }, + { } + } };
/**************************************************************************** diff --git a/hw/pci/pci_host.c b/hw/pci/pci_host.c index 3e26f92..87180c8 100644 --- a/hw/pci/pci_host.c +++ b/hw/pci/pci_host.c @@ -175,6 +175,11 @@ const MemoryRegionOps pci_host_data_be_ops = { .endianness = DEVICE_BIG_ENDIAN, };
+static const TypeInfo pci_main_host_interface_info = { + .name = TYPE_PCI_MAIN_HOST_BRIDGE, + .parent = TYPE_INTERFACE, +}; + static const TypeInfo pci_host_type_info = { .name = TYPE_PCI_HOST_BRIDGE, .parent = TYPE_SYS_BUS_DEVICE, @@ -185,6 +190,7 @@ static const TypeInfo pci_host_type_info = {
static void pci_host_register_types(void) { + type_register_static(&pci_main_host_interface_info); type_register_static(&pci_host_type_info); }
diff --git a/include/hw/pci/pci_host.h b/include/hw/pci/pci_host.h index ba31595..3c72e26 100644 --- a/include/hw/pci/pci_host.h +++ b/include/hw/pci/pci_host.h @@ -30,6 +30,13 @@
#include "hw/sysbus.h"
+/** + * Marker interface for classes whose instances can + * be main host bridges. It is intended to be used + * when the QOM tree includes multiple host bridges. + */ +#define TYPE_PCI_MAIN_HOST_BRIDGE "pci-main-host-bridge" + #define TYPE_PCI_HOST_BRIDGE "pci-host-bridge" #define PCI_HOST_BRIDGE(obj) \ OBJECT_CHECK(PCIHostState, (obj), TYPE_PCI_HOST_BRIDGE)
From: Marcel Apfelbaum marcel.a@redhat.com
Use the newer pci_bus_num to correctly get the root bus number.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/pci/pci.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c index e386f2c..53598bd 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -1266,7 +1266,8 @@ PciInfoList *qmp_query_pci(Error **errp)
QLIST_FOREACH(host_bridge, &pci_host_bridges, next) { info = g_malloc0(sizeof(*info)); - info->value = qmp_query_pci_bus(host_bridge->bus, 0); + info->value = qmp_query_pci_bus(host_bridge->bus, + pci_bus_num(host_bridge->bus));
/* XXX: waiting for the qapi to support GSList */ if (!cur_item) {
From: Marcel Apfelbaum marcel.a@redhat.com
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/pci/pci.c | 8 ++++---- include/hw/pci/pci_host.h | 4 ++++ 2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 53598bd..f0cf752 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -69,7 +69,7 @@ static void pci_del_option_rom(PCIDevice *pdev); static uint16_t pci_default_sub_vendor_id = PCI_SUBVENDOR_ID_REDHAT_QUMRANET; static uint16_t pci_default_sub_device_id = PCI_SUBDEVICE_ID_QEMU;
-static QLIST_HEAD(, PCIHostState) pci_host_bridges; +struct PCIHostQ pci_host_bridges = QLIST_HEAD_INITIALIZER(pci_host_bridges);
static int pci_bar(PCIDevice *d, int reg) { @@ -190,7 +190,7 @@ PCIBus *pci_find_primary_bus(void) PCIBus *primary_bus = NULL; PCIHostState *host;
- QLIST_FOREACH(host, &pci_host_bridges, next) { + HOST_BRIDGE_FOREACH(host) { if (primary_bus) { /* We have multiple root buses, refuse to select a primary */ return NULL; @@ -1264,7 +1264,7 @@ PciInfoList *qmp_query_pci(Error **errp) PciInfoList *info, *head = NULL, *cur_item = NULL; PCIHostState *host_bridge;
- QLIST_FOREACH(host_bridge, &pci_host_bridges, next) { + HOST_BRIDGE_FOREACH(host_bridge) { info = g_malloc0(sizeof(*info)); info->value = qmp_query_pci_bus(host_bridge->bus, pci_bus_num(host_bridge->bus)); @@ -1802,7 +1802,7 @@ int pci_qdev_find_device(const char *id, PCIDevice **pdev) PCIHostState *host_bridge; int rc = -ENODEV;
- QLIST_FOREACH(host_bridge, &pci_host_bridges, next) { + HOST_BRIDGE_FOREACH(host_bridge) { int tmp = pci_qdev_find_recursive(host_bridge->bus, id, pdev); if (!tmp) { rc = 0; diff --git a/include/hw/pci/pci_host.h b/include/hw/pci/pci_host.h index 3c72e26..ba5272f 100644 --- a/include/hw/pci/pci_host.h +++ b/include/hw/pci/pci_host.h @@ -63,6 +63,10 @@ typedef struct PCIHostBridgeClass { const char *(*root_bus_path)(PCIHostState *, PCIBus *); } PCIHostBridgeClass;
+QLIST_HEAD(PCIHostQ, PCIHostState); +extern struct PCIHostQ pci_host_bridges; +#define HOST_BRIDGE_FOREACH(host) QLIST_FOREACH(host, &pci_host_bridges, next) + /* common internal helpers for PCI/PCIe hosts, cut off overflows */ void pci_host_config_write_common(PCIDevice *pci_dev, uint32_t addr, uint32_t limit, uint32_t val, uint32_t len);
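The idea of exporting one list head plus an iteration macro can be sketched without QEMU's queue macros. Everything below (Host, host_list, HOST_FOREACH, host_register, bus_nr_in_use) is a hypothetical stand-in built on a plain singly linked list. It only illustrates why a wrapper macro keeps callers independent of the list head and link field names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal host-bridge record; PCIHostState is much richer. */
typedef struct Host {
    int root_bus_nr;
    struct Host *next;
} Host;

/* One global list head, visible to every file that includes the header. */
Host *host_list = NULL;

/* Hide the head and the link field behind a single iteration macro. */
#define HOST_FOREACH(h) for ((h) = host_list; (h) != NULL; (h) = (h)->next)

/* Push a host bridge onto the front of the global list. */
void host_register(Host *h)
{
    assert(h != NULL);
    h->next = host_list;
    host_list = h;
}

/* Returns 1 if a registered host bridge already owns root bus number nr. */
int bus_nr_in_use(int nr)
{
    Host *h;

    HOST_FOREACH(h) {
        if (h->root_bus_nr == nr) {
            return 1;
        }
    }
    return 0;
}
```

The duplicate-number check has the same structure as the loop a later patch in this series runs before creating a new expander bridge.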
From: Marcel Apfelbaum marcel.a@redhat.com
PXB is a "light-weight" host bridge whose purpose is to enable the main host bridge to support multiple PCI root buses.
As opposed to a PCI-2-PCI bridge's secondary bus, the PXB's bus is a primary bus and can be associated with a NUMA node (different from the main host bridge's), allowing the guest OS to recognize the proximity of a pass-through device to other resources such as RAM and CPUs.
The PXB is composed of:
- A primary PCI bus (can be associated with a NUMA node).
  Acts like a normal PCI bus and, from the functionality point of view, is an "expansion" of the bus behind the main host bridge.
- A pci-2-pci bridge behind the primary PCI bus, where the actual devices will be attached.
- A host-bridge PCI device.
  Situated on the bus behind the main host bridge, it allows the BIOS to configure the bus number and IO/mem resources. It does not have its own config/data register for configuration cycles; this is handled by the main host bridge.
- A host-bridge sysbus device, to comply with QEMU's current design.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/pci-bridge/Makefile.objs | 1 + hw/pci-bridge/pci_expander_bridge.c | 173 ++++++++++++++++++++++++++++++++++++ include/hw/pci/pci.h | 1 + 3 files changed, 175 insertions(+) create mode 100644 hw/pci-bridge/pci_expander_bridge.c
diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs index 968b369..632e442 100644 --- a/hw/pci-bridge/Makefile.objs +++ b/hw/pci-bridge/Makefile.objs @@ -1,4 +1,5 @@ common-obj-y += pci_bridge_dev.o +common-obj-y += pci_expander_bridge.o common-obj-y += ioh3420.o xio3130_upstream.o xio3130_downstream.o common-obj-y += i82801b11.o # NewWorld PowerMac diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c new file mode 100644 index 0000000..941f3c8 --- /dev/null +++ b/hw/pci-bridge/pci_expander_bridge.c @@ -0,0 +1,173 @@ +/* + * PCI Expander Bridge Device Emulation + * + * Copyright (C) 2014 Red Hat Inc + * + * Authors: + * Marcel Apfelbaum marcel.a@redhat.com + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "hw/pci/pci.h" +#include "hw/pci/pci_bus.h" +#include "hw/pci/pci_host.h" +#include "hw/pci/pci_bus.h" +#include "qemu/range.h" +#include "qemu/error-report.h" + +#define TYPE_PXB_BUS "pxb-bus" +#define PXB_BUS(obj) OBJECT_CHECK(PXBBus, (obj), TYPE_PXB_BUS) + +typedef struct PXBBus { + /*< private >*/ + PCIBus parent_obj; + /*< public >*/ + + char bus_path[8]; +} PXBBus; + +#define TYPE_PXB_DEVICE "pxb-device" +#define PXB_DEV(obj) OBJECT_CHECK(PXBDev, (obj), TYPE_PXB_DEVICE) + +typedef struct PXBDev { + /*< private >*/ + PCIDevice parent_obj; + /*< public >*/ + + uint8_t bus_nr; +} PXBDev; + +#define TYPE_PXB_HOST "pxb-host" + +static int pxb_bus_num(PCIBus *bus) +{ + PXBDev *pxb = PXB_DEV(bus->parent_dev); + + return pxb->bus_nr; +} + +static bool pxb_is_root(PCIBus *bus) +{ + return true; /* by definition */ +} + +static void pxb_bus_class_init(ObjectClass *class, void *data) +{ + PCIBusClass *pbc = PCI_BUS_CLASS(class); + + pbc->bus_num = pxb_bus_num; + pbc->is_root = pxb_is_root; +} + +static const TypeInfo pxb_bus_info = { + .name = TYPE_PXB_BUS, + .parent = TYPE_PCI_BUS, + .instance_size = sizeof(PXBBus), 
+ .class_init = pxb_bus_class_init, +}; + +static const char *pxb_host_root_bus_path(PCIHostState *host_bridge, + PCIBus *rootbus) +{ + PXBBus *bus = PXB_BUS(rootbus); + + snprintf(bus->bus_path, 8, "0000:%02x", pxb_bus_num(rootbus)); + return bus->bus_path; +} + +static void pxb_host_class_init(ObjectClass *class, void *data) +{ + DeviceClass *dc = DEVICE_CLASS(class); + PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(class); + + dc->fw_name = "pci"; + hc->root_bus_path = pxb_host_root_bus_path; +} + +static const TypeInfo pxb_host_info = { + .name = TYPE_PXB_HOST, + .parent = TYPE_PCI_HOST_BRIDGE, + .class_init = pxb_host_class_init, +}; + +static int pxb_dev_initfn(PCIDevice *dev) +{ + PXBDev *pxb = PXB_DEV(dev); + DeviceState *ds, *bds; + PCIHostState *phs; + PCIBus *bus; + const char *dev_name = NULL; + + HOST_BRIDGE_FOREACH(phs) { + if (pxb->bus_nr == pci_bus_num(phs->bus)) { + error_report("Bus nr %d is already used by %s.", + pxb->bus_nr, phs->bus->qbus.name); + return -EINVAL; + } + } + + if (dev->qdev.id && *dev->qdev.id) { + dev_name = dev->qdev.id; + } + + ds = qdev_create(NULL, TYPE_PXB_HOST); + bus = pci_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS); + + bus->parent_dev = dev; + bus->address_space_mem = dev->bus->address_space_mem; + bus->address_space_io = dev->bus->address_space_io; + bus->map_irq = pci_swizzle_map_irq_fn; + + bds = qdev_create(BUS(bus), "pci-bridge"); + bds->id = dev_name; + qdev_prop_set_uint8(bds, "chassis_nr", pxb->bus_nr); + + PCI_HOST_BRIDGE(ds)->bus = bus; + + qdev_init_nofail(ds); + qdev_init_nofail(bds); + + pci_word_test_and_set_mask(dev->config + PCI_STATUS, + PCI_STATUS_66MHZ | PCI_STATUS_FAST_BACK); + pci_config_set_class(dev->config, PCI_CLASS_BRIDGE_HOST); + + return 0; +} + +static Property pxb_dev_properties[] = { + /* Note: 0 is not a legal a PXB bus number. 
*/ + DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0), + DEFINE_PROP_END_OF_LIST(), +}; + +static void pxb_dev_class_init(ObjectClass *klass, void *data) +{ + DeviceClass *dc = DEVICE_CLASS(klass); + PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); + + k->init = pxb_dev_initfn; + k->vendor_id = PCI_VENDOR_ID_REDHAT; + k->device_id = PCI_DEVICE_ID_REDHAT_PXB; + k->class_id = PCI_CLASS_BRIDGE_HOST; + + dc->desc = "PCI Expander Bridge"; + dc->props = pxb_dev_properties; +} + +static const TypeInfo pxb_dev_info = { + .name = TYPE_PXB_DEVICE, + .parent = TYPE_PCI_DEVICE, + .instance_size = sizeof(PXBDev), + .class_init = pxb_dev_class_init, +}; + +static void pxb_register_types(void) +{ + type_register_static(&pxb_bus_info); + type_register_static(&pxb_host_info); + type_register_static(&pxb_dev_info); +} + +type_init(pxb_register_types) diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h index a69cf94..4325784 100644 --- a/include/hw/pci/pci.h +++ b/include/hw/pci/pci.h @@ -90,6 +90,7 @@ #define PCI_DEVICE_ID_REDHAT_TEST 0x0005 #define PCI_DEVICE_ID_REDHAT_SDHCI 0x0007 #define PCI_DEVICE_ID_REDHAT_PCIE_HOST 0x0008 +#define PCI_DEVICE_ID_REDHAT_PXB 0x0009 #define PCI_DEVICE_ID_REDHAT_QXL 0x0100
#define FMT_PCIBUS PRIx64
From: Marcel Apfelbaum marcel.a@redhat.com
The BIOS looks for 'etc/extra-pci-roots' to decide whether to scan further buses after the bus 0 tree.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/i386/pc.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c index ae3ef0a..71d2f5b 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1072,9 +1072,22 @@ typedef struct PcGuestInfoState { static void pc_guest_info_machine_done(Notifier *notifier, void *data) { + PCIHostState *host; + int hosts = 0; PcGuestInfoState *guest_info_state = container_of(notifier, PcGuestInfoState, machine_done); + HOST_BRIDGE_FOREACH(host) { + hosts++; + } + + if (hosts && guest_info_state->info.fw_cfg) { + uint64_t *val = g_malloc(sizeof(*val)); + *val = cpu_to_le64(hosts - 1); + fw_cfg_add_file(guest_info_state->info.fw_cfg, + "etc/extra-pci-roots", val, sizeof(*val)); + } + acpi_setup(&guest_info_state->info); }
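The value published in 'etc/extra-pci-roots' is the number of host bridges minus one, stored as a 64-bit little-endian integer. A minimal standalone sketch of that encoding follows. The helper names put_le64 and encode_extra_roots are hypothetical, and the fw_cfg call itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v as 8 little-endian bytes, independent of host byte order. */
void put_le64(uint8_t out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(v >> (8 * i));
    }
}

/*
 * Hypothetical helper: fill 'out' with the count of extra root buses,
 * that is, host bridges beyond the main one. Returns 0 when there is
 * no host bridge at all and nothing should be published, 1 otherwise.
 */
int encode_extra_roots(int hosts, uint8_t out[8])
{
    assert(hosts >= 0);
    if (hosts == 0) {
        return 0;
    }
    put_le64(out, (uint64_t)(hosts - 1));
    return 1;
}
```

Note that a machine with only the main host bridge still publishes the file, with the value 0.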
From: Marcel Apfelbaum marcel.a@redhat.com
Instead of assuming there is only one root bus, the data register handler enumerates all the host bridges and selects the root bus that the bus number in the config register belongs to, i.e. the root bus with the highest number that does not exceed it.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/pci-host/piix.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 56 insertions(+), 1 deletion(-)
diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index 0033ab4..3c3a192 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -255,6 +255,61 @@ static void i440fx_pcihost_get_pci_hole64_end(Object *obj, Visitor *v, visit_type_uint64(v, &w64.end, name, errp); }
+static PCIBus *i440fx_find_primary_bus(int bus_num) +{ + PCIHostState *host; + PCIBus *bus = NULL; + int current = -1; + + HOST_BRIDGE_FOREACH(host) { + int b = pci_bus_num(host->bus); + if (b <= bus_num && b > current) { + current = b; + bus = host->bus; + } + } + + return bus; +} + +static void i440fx_pcihost_data_write(void *opaque, hwaddr addr, + uint64_t val, unsigned len) +{ + uint32_t config_reg = PCI_HOST_BRIDGE(opaque)->config_reg; + + if (config_reg & (1u << 31)) { + int bus_num = (config_reg >> 16) & 0xFF; + PCIBus *bus = i440fx_find_primary_bus(bus_num); + + if (bus) { + pci_data_write(bus, config_reg | (addr & 3), val, len); + } + } +} + +static uint64_t i440fx_pcihost_data_read(void *opaque, + hwaddr addr, unsigned len) +{ + uint32_t config_reg = PCI_HOST_BRIDGE(opaque)->config_reg; + + if (config_reg & (1U << 31)) { + int bus_num = (config_reg >> 16) & 0xFF; + PCIBus *bus = i440fx_find_primary_bus(bus_num); + + if (bus) { + return pci_data_read(bus, config_reg | (addr & 3), len); + } + } + + return 0xffffffff; +} + +const MemoryRegionOps i440fx_pcihost_data_le_ops = { + .read = i440fx_pcihost_data_read, + .write = i440fx_pcihost_data_write, + .endianness = DEVICE_LITTLE_ENDIAN, +}; + static void i440fx_pcihost_initfn(Object *obj) { PCIHostState *s = PCI_HOST_BRIDGE(obj); @@ -262,7 +317,7 @@ static void i440fx_pcihost_initfn(Object *obj)
memory_region_init_io(&s->conf_mem, obj, &pci_host_conf_le_ops, s, "pci-conf-idx", 4); - memory_region_init_io(&s->data_mem, obj, &pci_host_data_le_ops, s, + memory_region_init_io(&s->data_mem, obj, &i440fx_pcihost_data_le_ops, s, "pci-conf-data", 4);
object_property_add(obj, PCI_HOST_PROP_PCI_HOLE_START, "int",
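The selection rule used by i440fx_find_primary_bus can be tried out in isolation. The sketch below is standalone and works on a plain array of root bus numbers instead of the host bridge list. The names config_reg_bus and find_root_index are hypothetical; the bit layout (enable bit 31, bus number in bits 23:16) is the one the patch decodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CONFIG_ENABLE (1u << 31)

/* Returns the bus number field (bits 23:16), or -1 if the enable bit is clear. */
int config_reg_bus(uint32_t config_reg)
{
    if (!(config_reg & CONFIG_ENABLE)) {
        return -1;
    }
    return (int)((config_reg >> 16) & 0xFF);
}

/*
 * Given the root bus numbers of all host bridges, pick the root whose
 * number is the largest one not exceeding bus_num. Returns the index
 * into roots[], or -1 if no root qualifies.
 */
int find_root_index(const int *roots, size_t n, int bus_num)
{
    int best = -1;
    int current = -1;

    assert(roots != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (roots[i] <= bus_num && roots[i] > current) {
            current = roots[i];
            best = (int)i;
        }
    }
    return best;
}
```

For example, with root buses 0, 4 and 8, an access to bus 5 (say, the secondary bus of the bridge behind root bus 4) is forwarded to the root numbered 4. This relies on each root's subordinate buses being numbered between it and the next root.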
The BIOS does not index the pxb slot number when it computes the IRQ because it resides on bus 0 and not on the current bus. However, QEMU routes the IRQ through bus 0 and adds the pxb slot to the IRQ computation.
Synchronize between the BIOS and QEMU by canceling the pxb's effect.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/pci-bridge/pci_expander_bridge.c | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c index 941f3c8..87515c1 100644 --- a/hw/pci-bridge/pci_expander_bridge.c +++ b/hw/pci-bridge/pci_expander_bridge.c @@ -92,6 +92,24 @@ static const TypeInfo pxb_host_info = { .class_init = pxb_host_class_init, };
+ +static int pxb_map_irq_fn(PCIDevice *pci_dev, int pin) +{ + PCIDevice *pxb = pci_dev->bus->parent_dev; + + /* + * The bios does not index the pxb slot number when + * it computes the IRQ because it resides on bus 0 + * and not on the current bus. + * However QEMU routes the irq through bus 0 and adds + * the pxb slot to the IRQ computation. + * + * Synchronize between bios and QEMU by canceling + * pxb's effect. + */ + return pin - PCI_SLOT(pxb->devfn); +} + static int pxb_dev_initfn(PCIDevice *dev) { PXBDev *pxb = PXB_DEV(dev); @@ -118,7 +136,7 @@ static int pxb_dev_initfn(PCIDevice *dev) bus->parent_dev = dev; bus->address_space_mem = dev->bus->address_space_mem; bus->address_space_io = dev->bus->address_space_io; - bus->map_irq = pci_swizzle_map_irq_fn; + bus->map_irq = pxb_map_irq_fn;
bds = qdev_create(BUS(bus), "pci-bridge"); bds->id = dev_name;
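The arithmetic behind the cancellation can be checked with a few lines of standalone C. The parent bus mapping below is an assumption made for illustration (a plain slot-based rotation modulo 4), not the exact function the main host bridge uses. All names are hypothetical.

```c
#include <assert.h>

#define NUM_PINS 4

/* Reduce x into the range 0..3, also for negative x. */
static int mod_pins(int x)
{
    return ((x % NUM_PINS) + NUM_PINS) % NUM_PINS;
}

/* Assumed stand-in for the parent bus mapping: rotate the pin by the slot. */
int parent_map(int slot, int pin)
{
    return mod_pins(pin + slot);
}

/* Same idea as pxb_map_irq_fn: subtract the expander's slot up front. */
int expander_map(int expander_slot, int pin)
{
    return pin - expander_slot;
}

/* Full route: expander bus first, then the bus the expander sits on. */
int route(int expander_slot, int pin)
{
    int r = parent_map(expander_slot, expander_map(expander_slot, pin));

    assert(r >= 0 && r < NUM_PINS);
    return r;
}
```

The intermediate value returned by expander_map may be negative. That is harmless in this sketch because the next stage adds the same slot back before reducing modulo 4, so the slot where the expander is plugged has no influence on the final pin.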
PCI root buses can be attached to a specific NUMA node. By default, PCI buses are not attached to any NUMA node.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/pci/pci_bus.c | 7 +++++++ include/hw/pci/pci_bus.h | 6 ++++++ include/sysemu/sysemu.h | 1 + 3 files changed, 14 insertions(+)
diff --git a/hw/pci/pci_bus.c b/hw/pci/pci_bus.c index ed99208..15882a7 100644 --- a/hw/pci/pci_bus.c +++ b/hw/pci/pci_bus.c @@ -13,6 +13,7 @@ #include "hw/pci/pci_bus.h" #include "hw/pci/pci_bridge.h" #include "monitor/monitor.h" +#include "sysemu/sysemu.h"
typedef struct { uint16_t class; @@ -478,6 +479,11 @@ static int pcibus_num(PCIBus *bus) return bus->parent_dev->config[PCI_SECONDARY_BUS]; }
+static uint16_t pcibus_numa_node(PCIBus *bus) +{ + return NUMA_NODE_UNASSIGNED; +} + static void pci_bus_class_init(ObjectClass *klass, void *data) { BusClass *k = BUS_CLASS(klass); @@ -492,6 +498,7 @@ static void pci_bus_class_init(ObjectClass *klass, void *data)
pbc->is_root = pcibus_is_root; pbc->bus_num = pcibus_num; + pbc->numa_node = pcibus_numa_node; }
static const TypeInfo pci_bus_info = { diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h index 553814e..75cd1fa 100644 --- a/include/hw/pci/pci_bus.h +++ b/include/hw/pci/pci_bus.h @@ -25,6 +25,7 @@ typedef struct PCIBusClass {
bool (*is_root)(PCIBus *bus); int (*bus_num)(PCIBus *bus); + uint16_t (*numa_node)(PCIBus *bus); } PCIBusClass;
struct PCIBus { @@ -60,6 +61,11 @@ static inline int pci_bus_num(PCIBus *bus) return PCI_BUS_GET_CLASS(bus)->bus_num(bus); }
+static inline int pci_bus_numa_node(PCIBus *bus) +{ + return PCI_BUS_GET_CLASS(bus)->numa_node(bus); +} + typedef struct PCIBridgeWindows PCIBridgeWindows;
/* diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h index e7135e1..934eb5d 100644 --- a/include/sysemu/sysemu.h +++ b/include/sysemu/sysemu.h @@ -136,6 +136,7 @@ extern const char *mem_path; extern int mem_prealloc;
#define MAX_NODES 128 +#define NUMA_NODE_UNASSIGNED MAX_NODES
/* The following shall be true for all CPUs: * cpu->cpu_index < max_cpus <= MAX_CPUMASK_BITS
The pxb can be attached to an existing NUMA node by setting the numa_node option to the desired NUMA node id.
Signed-off-by: Marcel Apfelbaum marcel@redhat.com --- hw/i386/acpi-build.c | 12 ++++++++++++ hw/pci-bridge/pci_expander_bridge.c | 17 +++++++++++++++++ 2 files changed, 29 insertions(+)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 7cd011d..b5c0e86 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -908,6 +908,7 @@ build_ssdt(GArray *table_data, GArray *linker,
for (info = info_list; info; info = info->next) { PciInfo *bus_info = info->value; + PCIHostState *host;
if (bus_info->bus == 0) { continue; @@ -924,6 +925,17 @@ build_ssdt(GArray *table_data, GArray *linker, aml_append(dev, aml_name_decl("_HID", aml_string("PNP0A03"))); aml_append(dev, aml_name_decl("_BBN", aml_int((uint8_t)bus_info->bus))); + + HOST_BRIDGE_FOREACH(host) { + if (pci_bus_num(host->bus) == bus_info->bus) { + int numa_node = pci_bus_numa_node(host->bus); + if (numa_node != NUMA_NODE_UNASSIGNED) { + aml_append(dev, + aml_name_decl("_PXM", aml_int(numa_node))); + } + } + } + aml_append(dev, build_prt()); crs = build_crs(pci, bus_info, &io_ranges, &mem_ranges); aml_append(dev, aml_name_decl("_CRS", crs)); diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c index 87515c1..9329aab 100644 --- a/hw/pci-bridge/pci_expander_bridge.c +++ b/hw/pci-bridge/pci_expander_bridge.c @@ -16,6 +16,7 @@ #include "hw/pci/pci_bus.h" #include "qemu/range.h" #include "qemu/error-report.h" +#include "sysemu/sysemu.h"
#define TYPE_PXB_BUS "pxb-bus" #define PXB_BUS(obj) OBJECT_CHECK(PXBBus, (obj), TYPE_PXB_BUS) @@ -37,6 +38,7 @@ typedef struct PXBDev { /*< public >*/
uint8_t bus_nr; + uint16_t numa_node; } PXBDev;
#define TYPE_PXB_HOST "pxb-host" @@ -53,12 +55,20 @@ static bool pxb_is_root(PCIBus *bus) return true; /* by definition */ }
+static uint16_t pxb_bus_numa_node(PCIBus *bus) +{ + PXBDev *pxb = PXB_DEV(bus->parent_dev); + + return pxb->numa_node; +} + static void pxb_bus_class_init(ObjectClass *class, void *data) { PCIBusClass *pbc = PCI_BUS_CLASS(class);
pbc->bus_num = pxb_bus_num; pbc->is_root = pxb_is_root; + pbc->numa_node = pxb_bus_numa_node; }
static const TypeInfo pxb_bus_info = { @@ -126,6 +136,12 @@ static int pxb_dev_initfn(PCIDevice *dev) } }
+ if (pxb->numa_node != NUMA_NODE_UNASSIGNED && + pxb->numa_node >= nb_numa_nodes) { + error_report("Illegal numa node %d.", pxb->numa_node); + return -EINVAL; + } + if (dev->qdev.id && *dev->qdev.id) { dev_name = dev->qdev.id; } @@ -157,6 +173,7 @@ static int pxb_dev_initfn(PCIDevice *dev) static Property pxb_dev_properties[] = { /* Note: 0 is not a legal a PXB bus number. */ DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0), + DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED), DEFINE_PROP_END_OF_LIST(), };
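The two decisions this patch makes about a numa_node value (accept or reject it at init time, and emit _PXM or not) reduce to comparisons against a sentinel. A standalone sketch follows, with hypothetical names. NODE_UNASSIGNED mirrors the NUMA_NODE_UNASSIGNED definition from the previous patch, and nb_nodes stands in for the number of configured guest NUMA nodes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 128
#define NODE_UNASSIGNED MAX_NODES /* sentinel: one past the largest valid id */

/* A node id is acceptable if it is the sentinel or names a configured node. */
bool numa_node_valid(uint16_t node, int nb_nodes)
{
    assert(nb_nodes >= 0 && nb_nodes <= MAX_NODES);
    return node == NODE_UNASSIGNED || node < nb_nodes;
}

/* A proximity domain is only advertised for buses with an assigned node. */
bool should_emit_pxm(uint16_t node)
{
    return node != NODE_UNASSIGNED;
}
```

Using MAX_NODES as the sentinel keeps the check to a single range comparison, since every valid node id is strictly smaller than it.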
On So, 2015-03-08 at 13:16 +0200, Marcel Apfelbaum wrote:
Notes:
- Sorry for the late submission, I was waiting for dynamic ACPI series to get merged in order to submit - my bad.
- The prev version (v2) was wrongfully tagged by me as RFC, it was actually ready but not rebased. V3 only rebases with no actual functionality changed.
- This series is not that big, patches 1-8 are really small and can be submitted separately, however I preferred to keep them here to get the whole picture.
Which tree does this apply against? Do you have a git tree with this somewhere?
cheers, Gerd
On 03/09/2015 09:43 AM, Gerd Hoffmann wrote:
On So, 2015-03-08 at 13:16 +0200, Marcel Apfelbaum wrote:
Notes:
- Sorry for the late submission, I was waiting for dynamic ACPI series to get merged in order to submit - my bad.
- The prev version (v2) was wrongfully tagged by me as RFC, it was actually ready but not rebased. V3 only rebases with no actual functionality changed.
- This series is not that big, patches 1-8 are really small and can be submitted separately, however I preferred to keep them here to get the whole picture.
Which tree does this apply against? Do you have a git tree with this somewhere?
Michael's pci branch of: git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git
Thanks, Marcel
cheers, Gerd
On Mo, 2015-03-09 at 11:20 +0200, Marcel Apfelbaum wrote:
On 03/09/2015 09:43 AM, Gerd Hoffmann wrote:
On So, 2015-03-08 at 13:16 +0200, Marcel Apfelbaum wrote:
Notes:
- Sorry for the late submission, I was waiting for dynamic ACPI series to get merged in order to submit - my bad.
- The prev version (v2) was wrongfully tagged by me as RFC, it was actually ready but not rebased. V3 only rebases with no actual functionality changed.
- This series is not that big, patches 1-8 are really small and can be submitted separately, however I preferred to keep them here to get the whole picture.
Which tree does this apply against? Do you have a git tree with this somewhere?
Michael's pci branch of: git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git
Hmm, not working for me ...
nilsson kraxel ~/projects/qemu ((a3b66ab...))# git checkout -b testing mst/pci
M roms/seabios
Branch testing set up to track remote branch pci from mst.
Switched to a new branch 'testing'
nilsson kraxel ~/projects/qemu (testing)# git am -s ~/Downloads/patches/testing/[Qemu-devel]_[PATCH_v4_for-2.3_01_25]_acpi:_fix_aml_equal_term_implementation.mbox
Applying: acpi: fix aml_equal term implementation
error: patch failed: hw/acpi/aml-build.c:542
error: hw/acpi/aml-build.c: patch does not apply
Patch failed at 0001 acpi: fix aml_equal term implementation
The copy of the patch that failed is found in:
   /home/kraxel/projects/qemu/.git/rebase-apply/patch
When you have resolved this problem, run "git am --resolved".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
nilsson kraxel ~/projects/qemu (testing|AM 1/25)#
cheers, Gerd
On 03/09/2015 12:18 PM, Gerd Hoffmann wrote:
On Mo, 2015-03-09 at 11:20 +0200, Marcel Apfelbaum wrote:
On 03/09/2015 09:43 AM, Gerd Hoffmann wrote:
On So, 2015-03-08 at 13:16 +0200, Marcel Apfelbaum wrote:
Notes:
- Sorry for the late submission, I was waiting for dynamic ACPI series to get merged in order to submit - my bad.
- The prev version (v2) was wrongfully tagged by me as RFC, it was actually ready but not rebased. V3 only rebases with no actual functionality changed.
- This series is not that big, patches 1-8 are really small and can be submitted separately, however I preferred to keep them here to get the whole picture.
Which tree does this apply against? Do you have a git tree with this somewhere?
Michael's pci branch of: git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git
Hmm, not working for me ...
nilsson kraxel ~/projects/qemu ((a3b66ab...))# git checkout -b testing mst/pci
M roms/seabios
Branch testing set up to track remote branch pci from mst.
Switched to a new branch 'testing'
nilsson kraxel ~/projects/qemu (testing)# git am -s ~/Downloads/patches/testing/[Qemu-devel]_[PATCH_v4_for-2.3_01_25]_acpi:_fix_aml_equal_term_implementation.mbox
Applying: acpi: fix aml_equal term implementation
error: patch failed: hw/acpi/aml-build.c:542
error: hw/acpi/aml-build.c: patch does not apply
Patch failed at 0001 acpi: fix aml_equal term implementation
The copy of the patch that failed is found in:
   /home/kraxel/projects/qemu/.git/rebase-apply/patch
When you have resolved this problem, run "git am --resolved".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
nilsson kraxel ~/projects/qemu (testing|AM 1/25)#
My series is based on commit 09d219a. Please try it on top of this commit. I'll rebase it to HEAD for v5 (today or tomorrow).
Thanks, Marcel
cheers, Gerd
Hi,
My series is based on commit 09d219a. Try please on top of this commit.
Ok, that works. Going to play with that now ;)
thanks, Gerd
On 03/09/2015 04:19 PM, Gerd Hoffmann wrote:
Hi,
My series is based on commit 09d219a. Try please on top of this commit.
Ok, that works. Going to play with that now ;)
Good luck! ... and tell me what you think :) If you need any help with the command line of the pxb device, let me know.
Thanks, Marcel
thanks, Gerd
On Mo, 2015-03-09 at 18:26 +0200, Marcel Apfelbaum wrote:
On 03/09/2015 04:19 PM, Gerd Hoffmann wrote:
Hi,
My series is based on commit 09d219a. Try please on top of this commit.
Ok, that works. Going to play with that now ;)
Good luck! ... and tell me what you think :) If you need any help with the command line of the pxb device, let me know.
First thing I've noticed: You need to define a numa node so you can pass a valid numa node to the pxb-device. Guess that is ok as the whole point of this is to assign pci devices to numa nodes. More complete test instructions would be nice though.
Second thing: Booting with an unpatched seabios has bad effects:
[root@localhost ~]# cat /proc/iomem
00000000-000fffff : PCI Bus 0000:10
  00000000-00000fff : reserved
  00001000-0009fbff : System RAM
  0009fc00-0009ffff : reserved
  000c0000-000c91ff : Video ROM
  000c9800-000ca1ff : Adapter ROM
  000ca800-000ccbff : Adapter ROM
  000f0000-000fffff : reserved
    000f0000-000fffff : System ROM
00100000-3ffdffff : System RAM
  01000000-0174bde4 : Kernel code
  0174bde5-01d30cff : Kernel data
  01eaa000-0202afff : Kernel bss
3ffe0000-3fffffff : reserved
fd000000-fdffffff : 0000:00:02.0
  fd000000-fdffffff : bochs-drm
febc0000-febdffff : 0000:00:03.0
  febc0000-febdffff : e1000
febf0000-febf0fff : 0000:00:02.0
  febf0000-febf0fff : bochs-drm
fec00000-fec003ff : IOAPIC 0
fed00000-fed003ff : HPET 0
  fed00000-fed003ff : PNP0103:00
fee00000-fee00fff : Local APIC
feffc000-feffffff : reserved
fffc0000-ffffffff : reserved
"PCI Bus 0000:10" is bogus and "PCI Bus 0000:00" isn't there at all.
cheers, Gerd
On 03/09/2015 06:55 PM, Gerd Hoffmann wrote:
On Mo, 2015-03-09 at 18:26 +0200, Marcel Apfelbaum wrote:
On 03/09/2015 04:19 PM, Gerd Hoffmann wrote:
Hi,
My series is based on commit 09d219a. Try please on top of this commit.
Ok, that works. Going to play with that now ;)
Good luck! ... and tell me what you think :) If you need any help with the command line of the pxb device, let me know.
First thing I've noticed: You need to define a numa node so you can pass a valid numa node to the pxb-device. Guess that is ok as the whole point of this is to assign pci devices to numa nodes. More complete test instructions would be nice though.
Exactly, this is by design. But you can also use it without specifying the NUMA node...
A detailed command line would be:
[qemu-bin + storage options] \
    -bios [seabios-dir]/out/bios.bin -L [seabios-dir]/out/ \
    -m 2G \
    -object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 \
    -numa node,nodeid=0,cpus=0,memdev=ram-node0 \
    -object memory-backend-ram,size=1024M,policy=interleave,host-nodes=0,id=ram-node1 \
    -numa node,nodeid=1,cpus=1,memdev=ram-node1 \
    -device pxb-device,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4 \
    -netdev user,id=nd -device e1000,bus=bridge1,addr=0x4,netdev=nd \
    -device pxb-device,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8 \
    -device e1000,bus=bridge2,addr=0x3 \
    -device pxb-device,id=bridge3,bus=pci.0,bus_nr=40 \
    -drive if=none,id=drive0,file=[img] \
    -device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1
Here you have:
- 2 NUMA nodes for the guest, 0 and 1 (both mapped to the same NUMA node in the host, but you can and should put them on different host NUMA nodes)
- a pxb host bridge attached to NUMA node 1 with an e1000 behind it
- a pxb host bridge attached to NUMA node 0 with an e1000 behind it
- a pxb host bridge not attached to any NUMA node with a hard drive behind it
As you can see, since you already "decide" NUMA mapping at command line, it is "natural" also to attach the pxbs to the NUMA nodes.
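To check from inside a Linux guest that the devices really ended up on the intended nodes, the sysfs numa_node attribute of each PCI function can be read. This is a minimal sketch under the assumption of a Linux guest; print_pci_numa_nodes is a hypothetical helper, and its optional argument exists only so the function can be pointed at a copy of the sysfs tree. The bus numbers the devices actually get are not assumed here.

```shell
# Sketch: print "<pci-address> <numa_node>" for every PCI function.
# print_pci_numa_nodes is a hypothetical helper name.
print_pci_numa_nodes() {
    root=${1:-/sys/bus/pci/devices}
    for dev in "$root"/*; do
        # Skip entries without the attribute (or an unmatched glob).
        [ -r "$dev/numa_node" ] || continue
        printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/numa_node")"
    done
}

# Usage inside the guest:
#   print_pci_numa_nodes
```

A value of -1 means the kernel has no node associated with that device, which is what one would expect for devices behind the pxb that was created without numa_node.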
Second thing: Booting with an unpatched seabios has bad effects:
[root@localhost ~]# cat /proc/iomem
00000000-000fffff : PCI Bus 0000:10
00000000-00000fff : reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000c0000-000c91ff : Video ROM
000c9800-000ca1ff : Adapter ROM
000ca800-000ccbff : Adapter ROM
000f0000-000fffff : reserved
000f0000-000fffff : System ROM
00100000-3ffdffff : System RAM
01000000-0174bde4 : Kernel code
0174bde5-01d30cff : Kernel data
01eaa000-0202afff : Kernel bss
3ffe0000-3fffffff : reserved
fd000000-fdffffff : 0000:00:02.0
fd000000-fdffffff : bochs-drm
febc0000-febdffff : 0000:00:03.0
febc0000-febdffff : e1000
febf0000-febf0fff : 0000:00:02.0
febf0000-febf0fff : bochs-drm
fec00000-fec003ff : IOAPIC 0
fed00000-fed003ff : HPET 0
fed00000-fed003ff : PNP0103:00
fee00000-fee00fff : Local APIC
feffc000-feffffff : reserved
fffc0000-ffffffff : reserved
"PCI Bus 0000:10" is bogus and "PCI Bus 0000:00" isn't there at all.
Yes, you shouldn't use pxb if you are not using the corresponding SeaBIOS. However, as I understand it, we always ship a SeaBIOS binary with a QEMU release, so we should be OK with this.
And this is the reason I wanted BIOS support *before* the PXB device implementation. Anyway, even if we get them at the same time, as long as the release has both pxb and a BIOS with pxb support, it is OK. (I think...)
I appreciate you looking into this and if you need further assistance don't hesitate to mail me! :)
Thanks, Marcel
cheers, Gerd
On 03/10/2015 06:21 AM, Marcel Apfelbaum wrote:
A detailed command line would be:
[qemu-bin + storage options] \
    -bios [seabios-dir]/out/bios.bin -L [seabios-dir]/out/ \
    -m 2G \
    -object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 \
    -numa node,nodeid=0,cpus=0,memdev=ram-node0 \
    -object memory-backend-ram,size=1024M,policy=interleave,host-nodes=0,id=ram-node1 \
    -numa node,nodeid=1,cpus=1,memdev=ram-node1 \
    -device pxb-device,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4 \
    -netdev user,id=nd -device e1000,bus=bridge1,addr=0x4,netdev=nd \
    -device pxb-device,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8 \
    -device e1000,bus=bridge2,addr=0x3 \
    -device pxb-device,id=bridge3,bus=pci.0,bus_nr=40 \
    -drive if=none,id=drive0,file=[img] \
    -device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1
I replayed this patchset on top of 09d219a "acpi: update generated files" and got this:
qemu-system-x86_64: -object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0: NUMA node binding are not supported by this QEMU
qemu-system-x86_64: -object memory-backend-ram,size=1024M,policy=interleave,host-nodes=0,id=ram-node1: NUMA node binding are not supported by this QEMU
This is my exact command line:
/scratch/alexey/p/qemu-build/x86_x86_64/x86_64-softmmu/qemu-system-x86_64 \
    -L /home/alexey/p/qemu/pc-bios/ \
    -hda x86/fc19_24GB_x86.qcow2 \
    -enable-kvm \
    -kernel x86/vmlinuz-3.12.11-201.fc19.x86_64 \
    -initrd x86/initramfs-3.12.11-201.fc19.x86_64.img \
    -append "root=/dev/sda3 console=ttyS0" \
    -nographic \
    -nodefaults \
    -chardev stdio,id=id2,signal=off,mux=on \
    -device isa-serial,id=id3,chardev=id2 \
    -mon id=id4,chardev=id2,mode=readline \
    -m 2G \
    -object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0 \
    -numa node,nodeid=0,cpus=0,memdev=ram-node0 \
    -object memory-backend-ram,size=1024M,policy=interleave,host-nodes=0,id=ram-node1 \
    -numa node,nodeid=1,cpus=1,memdev=ram-node1 \
    -device pxb-device,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4 \
    -netdev user,id=nd-device e1000,bus=bridge1,addr=0x4,netdev=nd \
    -device pxb-device,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8 \
    -device e1000,bus=bridge2,addr=0x3 \
    -device pxb-device,id=bridge3,bus=pci.0,bus_nr=40 \
    -drive if=none,id=drive0,file=debian_lenny_powerpc_desktop.qcow2 \
    -device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1 \
What am I missing here?
What I actually wanted to find out (instead of asking, which is what I am doing now) is this: is the PXB device a PCI device sitting on the same PCI host bus adapter (1), or is it a separate PHB (2) with its own PCI domain (a new XXXX in the XXXX:00:00.0 PCI address)? I would think it is (1), but then what exactly do you call "a primary PCI bus" here (that's my ignorance speaking, yes :) )? Thanks.
On 03/10/2015 08:23 AM, Alexey Kardashevskiy wrote:
I replayed this patchset on top of 09d219a "acpi: update generated files" and got this:
qemu-system-x86_64: -object memory-backend-ram,size=1024M,policy=bind,host-nodes=0,id=ram-node0: NUMA node binding are not supported by this QEMU
qemu-system-x86_64: -object memory-backend-ram,size=1024M,policy=interleave,host-nodes=0,id=ram-node1: NUMA node binding are not supported by this QEMU
Hi,
Please check your configuration (the summary printed when you run the ./configure script) and see whether it contains a line like this:
NUMA host support yes
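The check can be scripted against a saved copy of the configure summary. This is a minimal sketch: configure_has_numa is a hypothetical helper name, and the pattern assumes the summary line looks like the one quoted here. When the line says "no", the most likely cause is that the libnuma development headers are not installed on the build host, but that is an assumption to verify on your system.

```shell
# Sketch: check a saved ./configure summary for host NUMA support.
# configure_has_numa is a hypothetical helper name.
configure_has_numa() {
    # Matches the summary line "NUMA host support yes" read from stdin.
    grep -Eq '^[[:space:]]*NUMA host support[[:space:]]+yes[[:space:]]*$'
}

# Usage:
#   ./configure --target-list=x86_64-softmmu 2>&1 | tee configure.out
#   configure_has_numa < configure.out || echo "no host NUMA support in this build"
```

Without host NUMA support the policy= and host-nodes= options of memory-backend-ram are rejected with the error shown above.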
What am I missing here?
See above, check for NUMA host support
What I actually wanted to find out (instead of asking, which is what I am doing now) is this: is the PXB device a PCI device sitting on the same PCI host bus adapter (1), or is it a separate PHB (2) with its own PCI domain (a new XXXX in the XXXX:00:00.0 PCI address)? I would think it is (1), but then what exactly do you call "a primary PCI bus" here (that's my ignorance speaking, yes :) )? Thanks.
You are right, the PXB is a device on the piix host-bridge bus, and its bus uses the same PCI domain. However, the bus behind it is exposed through ACPI as a primary PCI bus and starts a new PCI hierarchy.
You have a similar approach on Intel 450x chipset: http://www.intel.com/design/chipsets/datashts/243771.htm Look for 82454NX PCI Expander Bridge (PXB)
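In a Linux guest, each primary (root) bus shows up as its own pciDDDD:BB directory under /sys/devices, so the extra hierarchies can be listed directly. This is a minimal sketch, assuming a Linux guest; list_pci_root_buses is a hypothetical helper, and its optional argument is only there so it can be pointed at a copy of the tree.

```shell
# Sketch: list PCI root buses as "DDDD:BB" (domain:bus).
# list_pci_root_buses is a hypothetical helper name.
list_pci_root_buses() {
    root=${1:-/sys/devices}
    for d in "$root"/pci[0-9a-f][0-9a-f][0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]; do
        # Skip the literal pattern when nothing matches.
        [ -d "$d" ] || continue
        name=${d##*/}
        printf '%s\n' "${name#pci}"
    done
}

# Usage inside the guest:
#   list_pci_root_buses
```

All entries should share the same domain (the DDDD part), in line with the explanation that the pxb bus stays in the same PCI domain while starting a new hierarchy.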
Thanks, Marcel
Hi,
Second thing: Booting with an unpatched seabios has bad effects:
[root@localhost ~]# cat /proc/iomem
00000000-000fffff : PCI Bus 0000:10
00000000-00000fff : reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000c0000-000c91ff : Video ROM
000c9800-000ca1ff : Adapter ROM
000ca800-000ccbff : Adapter ROM
000f0000-000fffff : reserved
000f0000-000fffff : System ROM
00100000-3ffdffff : System RAM
01000000-0174bde4 : Kernel code
0174bde5-01d30cff : Kernel data
01eaa000-0202afff : Kernel bss
3ffe0000-3fffffff : reserved
fd000000-fdffffff : 0000:00:02.0
fd000000-fdffffff : bochs-drm
febc0000-febdffff : 0000:00:03.0
febc0000-febdffff : e1000
febf0000-febf0fff : 0000:00:02.0
febf0000-febf0fff : bochs-drm
fec00000-fec003ff : IOAPIC 0
fed00000-fed003ff : HPET 0
fed00000-fed003ff : PNP0103:00
fee00000-fee00fff : Local APIC
feffc000-feffffff : reserved
fffc0000-ffffffff : reserved
"PCI Bus 0000:10" is bogus and "PCI Bus 0000:00" isn't there at all.
Yes, you shouldn't use pxb if you are not using the corresponding SeaBIOS. However, as I understand we always attach a SeaBIOS binary with a QEMU release, so we should be OK with this.
IMO the QEMU side should be more robust and not assume specific guest behavior. Guest firmware that simply does not configure the pxb shouldn't cause the resources for bus 0 to break that badly. It is OK for pxb not to work if you run firmware without pxb support, but everything else should continue to work as it did before.
cheers, Gerd