Remember when I said many months ago that something with PCI was fishy? That PCI devices were in the device tree twice and init functions were called multiple times? Turns out I was right and the bug is totally nasty. I finally tracked it down after tracing function calls for 6 hours in qemu.
pci_scan_bus(bus, ...) is not called with the expected bus type. Instead, it is called with a domain type from pci_domain_scan_bus(). pci_scan_bus(bus, ...) looks at bus->children and expects it to be a PCI device. Of course the children of the domain are the buses and bus->children is a PCI bus. That causes all lookups in the device tree to fail horribly.
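To make the failure mode concrete, here is a minimal sketch of the lookup (not the actual v3 code; find_dev_on_link is a hypothetical helper, but the fields match the static tree shown later in this thread):

static struct device *find_dev_on_link(struct bus *bus, unsigned int devfn)
{
	struct device *child;

	for (child = bus->children; child; child = child->sibling)
		if (child->path.type == DEVICE_PATH_PCI &&
		    child->path.pci.devfn == devfn)
			return child;
	/* If bus is the domain link, its only child is the PCI bus device
	 * (DEVICE_PATH_PCI_BUS), so no devfn ever matches and every lookup
	 * returns NULL. */
	return NULL;
}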
This patch fixes the dts lookup, BUT it introduces new undesired behaviour:
- Only the first bus of a PCI domain is scanned. That's fixable in an obvious way.
- For dynamic PCI devices on qemu, have_resources is now always 0. The reason can be seen in this error message: "read_resources: bus_0(PCI_BUS: 0000) missing phase4_read_resources". Basically, we have phase4_read_resources defined for the PCI domain, but not for PCI bus 0. We can either inherit phase4_read_resources from the domain or we define phase4_read_resources for each PCI bus. I prefer the latter; a rough sketch of the per-bus option follows below.
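For illustration, a minimal, untested sketch of the per-bus option, assuming a v3 struct device_operations with a phase4_read_resources member (pci_bus_ops and pci_bus_read_resources are made-up names, not existing symbols):

static void pci_bus_read_resources(struct device *dev)
{
	/* A plain PCI bus device has no config space of its own; the
	 * resources are read from the devices on the bus. This hook only
	 * has to exist so the generic phase 4 traversal stops complaining
	 * about a missing phase4_read_resources. */
}

struct device_operations pci_bus_ops = {
	.phase4_read_resources = pci_bus_read_resources,
};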
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
Index: corebootv3-pci_scan_bus/device/pci_device.c
===================================================================
--- corebootv3-pci_scan_bus/device/pci_device.c	(Revision 842)
+++ corebootv3-pci_scan_bus/device/pci_device.c	(Arbeitskopie)
@@ -636,6 +636,8 @@
 		vendor = mainboard_pci_subsystem_vendor;
 		if (!device)
 			device = mainboard_pci_subsystem_device;
+	} else {
+		printk(BIOS_DEBUG, "%s: Device not on_mainboard\n", dev_path(dev));
 	}
 #endif
 	/* Set the subsystem vendor and device ID for mainboard devices. */
@@ -1098,6 +1100,10 @@
 	printk(BIOS_DEBUG, "%s start bus %p, bus->dev %p\n", __func__, bus,
 	       bus->dev);
 
+	if (bus->dev->path.type != DEVICE_PATH_PCI_BUS)
+		printk(BIOS_ERR, "ERROR: pci_scan_bus called with incorrect "
+		       "bus->dev->path.type, path is %s\n", dev_path(bus->dev));
+
 #if PCI_BUS_SEGN_BITS
 	printk(BIOS_DEBUG, "PCI: pci_scan_bus for bus %04x:%02x\n",
 	       bus->secondary >> 8, bus->secondary & 0xff);
@@ -1139,6 +1145,8 @@
 		if ((PCI_FUNC(devfn) == 0x00) && (!dev
 		    || (dev->enabled && ((dev->hdr_type & 0x80) != 0x80)))) {
+			printk(BIOS_SPEW, "Not a multi function device, or the "
+			       "device is not present. Skip to next device.\n");
 			devfn += 0x07;
 		}
 	}
@@ -1186,8 +1194,9 @@
  */
 unsigned int pci_domain_scan_bus(struct device *dev, unsigned int max)
 {
+	printk(BIOS_SPEW, "pci_domain_scan_bus: calling pci_scan_bus\n");
 	/* There is only one link on this device, and it is always link 0. */
-	return pci_scan_bus(&dev->link[0], PCI_DEVFN(0, 0), 0xff, max);
+	return pci_scan_bus(&dev->link[0].children->link[0], PCI_DEVFN(0, 0), 0xff, max);
 }
 
 /**
@@ -1277,6 +1286,7 @@
  */
 unsigned int pci_scan_bridge(struct device *dev, unsigned int max)
 {
+	printk(BIOS_SPEW, "pci_scan_bridge: calling pci_scan_bus\n");
 	return do_pci_scan_bridge(dev, max, pci_scan_bus);
 }
I'm not totally sure of this analysis, but since I'll be testing them soon anyway, I don't see real harm here.
But this is NOT tested.
Acked-by: Ronald G. Minnich rminnich@gmail.com
ron minnich wrote:
I'm not totally sure of this analysis, but since I'll be testing them soon anyway, I don't see real harm here.
But this is NOT tested.
It's also not working.
Acked-by: Ronald G. Minnich rminnich@gmail.com
You serious?
Carl-Daniel Hailfinger wrote:
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
NACK !
This patch fixes the dts lookup, BUT it introduces new undesired behaviour:
- Only the first bus of a PCI domain is scanned. That's fixable in an
obvious way.
Please just send that obvious patch and it can be reviewed.
This patch is unacceptable as it is.
On 29.08.2008 08:38, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
NACK !
This patch fixes the dts lookup, BUT it introduces new undesired behaviour:
- Only the first bus of a PCI domain is scanned. That's fixable in an
obvious way.
Please just send that obvious patch and it can be reviewed.
This patch is unacceptable as it is.
I split the patch into a debug patch (sent a few minutes ago) and a bugfix (which I will send later).
Regards, Carl-Daniel
On 29.08.2008 08:38, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
NACK !
This patch fixes the dts lookup, BUT it introduces new undesired behaviour:
- Only the first bus of a PCI domain is scanned. That's fixable in an
obvious way.
The comment in the old code suggests that there can be only one PCI bus below a PCI domain ("There is only one link on this device, and it is always link 0"). Is that true? In that case, the new behaviour wouldn't be undesired; it would conform to the spec.
Please just send that obvious patch and it can be reviewed.
Sure.
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
Index: corebootv3-pci_scan_bus/device/pci_device.c
===================================================================
--- corebootv3-pci_scan_bus/device/pci_device.c	(Revision 845)
+++ corebootv3-pci_scan_bus/device/pci_device.c	(Arbeitskopie)
@@ -1196,8 +1196,16 @@
 unsigned int pci_domain_scan_bus(struct device *dev, unsigned int max)
 {
 	printk(BIOS_SPEW, "pci_domain_scan_bus: calling pci_scan_bus\n");
-	/* There is only one link on this device, and it is always link 0. */
-	return pci_scan_bus(&dev->link[0], PCI_DEVFN(0, 0), 0xff, max);
+	/* There is only one link on this device, and it is always link 0.
+	 * dev->link[0] for a PCI domain is the domain link.
+	 * The child of the domain link is the PCI bus device.
+	 * We want to scan the bus link of the PCI bus device.
+	 * dev->link[0].children->link[0] is that PCI bus link.
+	 * If there can be multiple PCI buses below a PCI domain, we have to
+	 * iterate over the PCI buses in a loop.
+	 */
+	return pci_scan_bus(&dev->link[0].children->link[0], PCI_DEVFN(0, 0),
+			    0xff, max);
 }
 
 /**
Carl-Daniel Hailfinger wrote:
* dev->link[0] for a PCI domain is the domain link.
* The child of the domain link is the PCI bus device.
"PCI bus device" - is that the host bridge?
* If there can be multiple PCI buses below a PCI domain, we have to
* iterate over the PCI buses in a loop.
Most likely more than one PCI bus is below a PCI domain. There can be an arbitrary number of bridges in a PCI domain.
On 29.08.2008 12:41, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
* dev->link[0] for a PCI domain is the domain link.
* The child of the domain link is the PCI bus device.
"PCI bus device" - is that the host bridge?
Let me illustrate this with an excerpt of our device tree.
struct device dev_domain_0 = {
	.path = {.type=DEVICE_PATH_PCI_DOMAIN,{.pci_domain={ .domain = 0x0 }}},
	.device_configuration = &domain_0,
	.ops = &i440bx_domain,
	.links = 1,
	.link = {
		[0] = {
			.dev = &dev_domain_0,
			.link = 0,
			.children = &dev_bus_0
		},
	},
	.bus = &dev_root.link[0],
	.dtsname = "domain_0",
	.enabled = 1
};
struct device dev_bus_0 = {
	.path = {.type=DEVICE_PATH_PCI_BUS,{.pci_bus={ .bus = 0x0 }}},
	.next = &dev_domain_0,
	.links = 1,
	.link = {
		[0] = {
			.dev = &dev_bus_0,
			.link = 0,
			.children = &dev_pci_0_0
		},
	},
	.bus = &dev_domain_0.link[0],
	.next = &dev_domain_0,
	.dtsname = "bus_0",
	.enabled = 1
};
struct device dev_pci_0_0 = {
	.path = {.type=DEVICE_PATH_PCI,{.pci={ .devfn = PCI_DEVFN(0x0, 0x0)}}},
	.sibling = &dev_pci_1_1,
	.next = &dev_pci_1_1,
	.bus = &dev_bus_0.link[0],
	.next = &dev_pci_1_1,
	.dtsname = "pci_0_0",
	.enabled = 1
};
struct southbridge_intel_i82371eb_ide_config pci_1_1 = {
	.ide0_enable = 0x0,
	.ide1_enable = 0x0,
};
/*pci_1_1*/
struct device dev_pci_1_1 = {
	.path = {.type=DEVICE_PATH_PCI,{.pci={ .devfn = PCI_DEVFN(0x1, 0x1)}}},
	.device_configuration = &pci_1_1,
	.ops = &i82371eb_ide,
	.subsystem_vendor = 0x15ad,
	.subsystem_device = 0x1976,
	.on_mainboard = 1,
	.next = &dev_bus_0,
	.bus = &dev_bus_0.link[0],
	.next = &dev_bus_0,
	.dtsname = "pci_1_1",
	.enabled = 1
};
We call pci_domain_scan_bus(dev_domain_0, ...) and dev_domain_0->link[0] is the PCI domain link. However, pci_scan_bus() expects to be called with dev_bus_0->link[0] which equals dev_domain_0->link[0].children->link[0]. "pci bus device" refers to dev_bus_0.
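In other words (illustration only, using the structs from the excerpt above; domain_to_bus_link is a made-up helper name, and the link array elements are assumed to be struct bus, which is what pci_scan_bus() takes):

static struct bus *domain_to_bus_link(struct device *domain)
{
	/* The only child of the domain link is the PCI bus device,
	 * e.g. &dev_bus_0 in the excerpt above. */
	struct device *bus_dev = domain->link[0].children;

	/* Its link 0 is the bus link that pci_scan_bus() expects. */
	return &bus_dev->link[0];
}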
* If there can be multiple PCI buses below a PCI domain, we have to
* iterate over the PCI buses in a loop.
Most likely more than one PCI bus is below a PCI domain. There can be an arbitrary number of bridges in a PCI domain.
Can multiple buses be directly below a domain or do these buses hang off a bridge on a bus? For the latter case, this should be handled by pci_scan_bridge.
Regards, Carl-Daniel
* Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net [080829 12:54]:
On 29.08.2008 12:41, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
* dev->link[0] for a PCI domain is the domain link.
* The child of the domain link is the PCI bus device.
"PCI bus device" - is that the host bridge?
Let me illustrate this with an excerpt of our device tree.
We call pci_domain_scan_bus(dev_domain_0, ...) and dev_domain_0->link[0] is the PCI domain link. However, pci_scan_bus() expects to be called with dev_bus_0->link[0] which equals dev_domain_0->link[0].children->link[0]. "pci bus device" refers to dev_bus_0.
Ok, so why do we have
a) a pci domain
b) a pci bus "device", whatever that is supposed to depict
c) a pci bridge (host bridge)
all describing the same physical entity.
We had weird stuff in v2, but now it's not fixed, it's just different weird.
Can multiple buses be directly below a domain or do these buses hang off a bridge on a bus? For the latter case, this should be handled by pci_scan_bridge.
A PCI domain is not a physical device. It doesn't even virtually exist on x86, at least not on pre-PCIe systems.
Stefan
so let's do this. Use the new v2 device tree visualizer on serengeti and see how it looks.
The evolution of the v2 device tree turned into a mess when the acpi and domain stuff came in. I did the best I could with moving the code over and trying to untangle it, but obviously I did not do enough. At the same time, I was trying very hard not to break v2 behaviour, since v2 behaviour is "tricky", but tricky for very good reasons -- it is working with lots of weird hardware.
"There is only one link on this device, and it is always link 0."-- that comment may have been introduced by me as I tried to puzzle through the v2 device tree code. I am not sure. It is not there in v2.
Given the fragility of the device tree code I'd say be very careful about big changes. Lots of people have broken the device tree code in the past with trivial changes. Stefan was right, my ACK was probably a mistake.
This fragility is a reflection of how difficult device enumeration on PCs is. There are lots of bugs and corner cases to cover in the hardware, and it makes it hard to get it just right. Further, as mentioned, the evolution of hardware had major impact on the code over the last 8 years.
I will review this discussion and try to understand the implications again.
I would suggest our first step is a v3 tree visualizer like the one in v2. This can be done with a mod to dtc. I think better understanding should precede any further device code patches. It is clear from this discussion that we need to know more, and that even core developers don't always know all the complications.
ron
On 29.08.2008 21:20, ron minnich wrote:
so let's do this. Use the new v2 device tree visualizer on serengeti and see how it looks.
The evolution of the v2 device tree turned into a mess when the acpi and domain stuff came in. I did the best I could with moving the code over and trying to untangle it, but obviously I did not do enough. At the same time, I was trying very hard not to break v2 behaviour, since v2 behaviour is "tricky", but tricky for very good reasons -- it is working with lots of weird hardware.
"There is only one link on this device, and it is always link 0."-- that comment may have been introduced by me as I tried to puzzle through the v2 device tree code. I am not sure. It is not there in v2.
Having read the PCI Bridge spec a few minutes ago, this looks rather clear to me.
- We want to be able to scan a bus and represent it in memory, so we have to store a struct bus for it. That would be dev_bus_0.link[0].
- We also want to be able to represent a bus in the device tree to be able to correlate it with the devices on that bus, so we have to keep a struct device around as a container for it. That would be dev_bus_0.
- We also need to represent the parent bus of a device and the parent of that bus.
That means our model (at least for bus/bridge/device) is consistent and makes sense.
However, I have yet to read up on PCI domains, so I won't comment on that.
Given the fragility of the device tree code I'd say be very careful about big changes. Lots of people have broken the device tree code in the past with trivial changes. Stefan was right, my ACK was probably a mistake.
This fragility is a reflection of how difficult device enumeration on PCs is. There are lots of bugs and corner cases to cover in the hardware, and it makes it hard to get it just right. Further, as mentioned, the evolution of hardware had major impact on the code over the last 8 years.
I will review this discussion and try to understand the implications again.
I would suggest our first step is a v3 tree visualizer like the one in v2. This can be done with a mod to dtc. I think better understanding should precede any further device code patches. It is clear from this discussion that we need to know more, and that even core developers don't always know all the complications.
By the way, I managed to break dtc while working on this: dtc only uses dev_fn as identifier for a PCI device. That gets us a name collision if we have the same dev_fn combination on multiple buses. Either we add a random unique ID to the struct name or we integrate the number of the parent bus as well.
Regards, Carl-Daniel
On Fri, Aug 29, 2008 at 1:36 PM, Carl-Daniel Hailfinger
Either we add a random unique ID to the struct name or we integrate the number of the parent bus as well.
yeah I know. I was aware of this and had it in my queue. dtc should have a 'unique-ifier' (a la U numbers on chips on schematics) which is appended or prepended.
Fix is welcome.
ron
ron minnich wrote:
On Fri, Aug 29, 2008 at 1:36 PM, Carl-Daniel Hailfinger
Either we add a random unique ID to the struct name or we integrate the number of the parent bus as well.
yeah I know. I was aware of this and had it in my queue. dtc should have a 'unique-ifier' (a la U numbers on chips on schematics) which is appended or prepended.
Maybe we should have a look at how UEFI does it. They have a complete system of unique identifiers for their devices.
Stefan
On Fri, Aug 29, 2008 at 3:45 PM, Stefan Reinauer stepan@coresystems.de wrote:
ron minnich wrote:
On Fri, Aug 29, 2008 at 1:36 PM, Carl-Daniel Hailfinger
Either we add a random unique ID to the struct name or we integrate the number of the parent bus as well.
yeah I know. I was aware of this and had it in my queue. dtc should have a 'unique-ifier' (a la U numbers on chips on schematics) which is appended or prepended.
Maybe we should have a look at how UEFI does it. They have a complete system of unique identifiers for their devices.
it's really easily solved: make the name of the struct contain the path of the parents to the root (long names) or just make the name be U####_rest of name.
The #### is generated from a global counter. This is really easy, it's what I did in the v2 config tool :-)
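A hypothetical sketch of such a unique-ifier in dtc's code generator (emit_unique_name and the exact U#### format are assumptions, not existing dtc code):

#include <stdio.h>

static unsigned int unique_counter;	/* the global counter described above */

/* Prepend "U<counter>_" to the struct name dtc is about to emit, so two
 * pci@1,1 devices on different buses no longer collide. */
static void emit_unique_name(char *buf, size_t len, const char *name)
{
	snprintf(buf, len, "U%04u_%s", unique_counter++, name);
}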
ron
On 30.08.2008 00:16, ron minnich wrote:
On Fri, Aug 29, 2008 at 1:36 PM, Carl-Daniel Hailfinger
Either we add a random unique ID to the struct name or we integrate the number of the parent bus as well.
yeah I know. I was aware of this and had it in my queue. dtc should have a 'unique-ifier' (a la U numbers on chips on schematics) which is appended or prepended.
See my other mail ("[PATCH] v3: more uniqueness in the dtc") for a patch which always prepends the name of the parent device to the name of the current device.
Fix is welcome.
I hope we can work out a scheme that is nice.
Regards, Carl-Daniel
Carl-Daniel Hailfinger wrote:
By the way, I managed to break dtc while working on this: dtc only uses dev_fn as identifier for a PCI device. That gets us a name collision if we have the same dev_fn combination on multiple buses.
The bus is determined by the position of the device in the device tree. This has never been an issue in v2 or v3.
On 30.08.2008 00:55, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
By the way, I managed to break dtc while working on this: dtc only uses dev_fn as identifier for a PCI device. That gets us a name collision if we have the same dev_fn combination on multiple buses.
The bus is determined by the position of the device in the device tree. This has never been an issue in v2 or v3.
I was talking about struct names emitted by dtc. They have been broken for ages in v3 (probably forever). See my other mail with the dtc patch for an example dts which explodes.
Regards, Carl-Daniel
On Fri, Aug 29, 2008 at 3:59 PM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
On 30.08.2008 00:55, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
By the way, I managed to break dtc while working on this: dtc only uses dev_fn as identifier for a PCI device. That gets us a name collision if we have the same dev_fn combination on multiple buses.
The bus is determined by the position of the device in the device tree. This has never been an issue in v2 or v3.
I was talking about struct names emitted by dtc. They have been broken for ages in v3 (probably forever). See my other mail with the dtc patch for an example dts which explodes.
No, they have not been broken forever, fortunately. When we changed to the pci@x,y syntax from the older syntax of an arbitrary name, we broke it.
ron
ron minnich wrote:
so let's do this. Use the new v2 device tree visualizer on serengeti and see how it looks.
With a little for loop I was able to produce images for all mainboards
http://www.coresystems.de/~stepan/devicetree/dot http://www.coresystems.de/~stepan/devicetree/fdp
This fragility is a reflection of how difficult device enumeration on PCs is. There are lots of bugs and corner cases to cover in the hardware, and it makes it hard to get it just right. Further, as mentioned, the evolution of hardware had major impact on the code over the last 8 years.
And it should continue to do so. The device tree code has been mostly unchanged since 2004 or so.
2008/8/30 Stefan Reinauer stepan@coresystems.de
ron minnich wrote:
so let's do this. Use the new v2 device tree visualizer on serengeti and see how it looks.
With a little for loop I was able to produce images for all mainboards
http://www.coresystems.de/~stepan/devicetree/dot http://www.coresystems.de/~stepan/devicetree/fdp
Thanks Stefan,
This is really helpful. I think it would be great if there were some way to color or designate devices per chip. I was looking at http://www.coresystems.de/~stepan/devicetree/dot/static-tyan_s2892.png and I couldn't see the second Opteron. Am I missing something? If each chip were colored or enclosed in a box it seems like it would make it easier.
Looking at http://www.coresystems.de/~stepan/devicetree/dot/static-tyan_s2895.png I see the second Opteron's northbridge_amd_amdk8 and its related 6 PCI functions, but I don't see two cpu_amd_socket940 apic or chip constructs.
Thanks, Myles
This patch creates static.dot, visualizing the coreboot device tree (v2). I created it to debug problems with the device/resource allocator.
Create PNGs with
$ fdp -o static.png -Tpng static.dot
or
$ dot -o static.png -Tpng static.dot
TODO/ideas:
- re-introduce colored links again (dropped for now)
- node colors according to device type?
- add an agenda
On Fri, Aug 29, 2008 at 9:27 AM, Stefan Reinauer stepan@coresystems.de wrote:
a) a pci domain
That's my error. Basically, I figured at the time that
- everything was going PCIe
- it was simpler to just have a domain on everything, and have it be "empty" and 0 on non-PCIe systems for what little time we had them.
I.e. rather than have #ifdef everywhere, just have a fixed set of rules as to the hierarchy.
b) a pci bus "device", whatever that is supposed to depict
c) a pci bridge (host bridge)
In many cases, we have chips that have a device function and a bridge function. Again, I may have made a mistake when I moved the code over. But there really are chips that have both functions.
Also, I was doing my best to recreate the *function* of the v2 code. I agree that as a result we have weirdness inherited from v2 and we would do well to try to fix that.
We can not fix all this stuff right away, but I am glad you are pointing out that we have a need for improvement ;-)
Thanks
ron
ron minnich wrote:
On Fri, Aug 29, 2008 at 9:27 AM, Stefan Reinauer stepan@coresystems.de wrote:
a) a pci domain
that's my error. Basically, I figured at the time that
- everything was going PCIe
- it was simpler to just have a domain on everything, and have it be
"empty" and 0 on non-PCIe systems for what little time we had them. I.e. rather than have #ifdef everywhere just have a fixed set of rules as to the hierarchy.
b) a pci bus "device", whatever that is supposed to depict
c) a pci bridge (host bridge)
In many cases, we have chips that have a device function and a bridge function. Again, I may have made a mistake when I moved the code over. But there really are chips that have both functions.
Hm, so do we need the "bus" construct at all? As far as I understand you, a bus hangs off either a bridge or a pci domain (which is kind of a top-level bridge between the FSB and PCI)?
Just wondering, the more stuff we can toss out, the easier it might/will get.
On 29.08.2008 18:27, Stefan Reinauer wrote:
* Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net [080829 12:54]:
On 29.08.2008 12:41, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
* dev->link[0] for a PCI domain is the domain link.
* The child of the domain link is the PCI bus device.
"PCI bus device" - is that the host bridge?
Let me illustrate this with an excerpt of our device tree.
We call pci_domain_scan_bus(dev_domain_0, ...) and dev_domain_0->link[0] is the PCI domain link. However, pci_scan_bus() expects to be called with dev_bus_0->link[0] which equals dev_domain_0->link[0].children->link[0]. "pci bus device" refers to dev_bus_0.
Ok, so why do we have
a) a pci domain
b) a pci bus "device", whatever that is supposed to depict
c) a pci bridge (host bridge)
all describing the same physical entity.
We had weird stuff in v2, but now it's not fixed, it's just different weird.
Can multiple buses be directly below a domain or do these buses hang off a bridge on a bus? For the latter case, this should be handled by pci_scan_bridge.
A PCI domain is not a physical device. It doesn't even virtually exist on x86, at least not on pre-PCIe systems.
OK, let me rephrase my question and answer it in part with the help of the relevant standards. Is it possible to have multiple top-level PCI Buses? Yes, but only in the sense that top-level Buses are attached to different Host Bridges.
The PCI-to-PCI Bridge Architecture 1.2 http://www.pcisig.com/members/downloads/specifications/conventional/ppb12.pdf standard says that each PCI Bridge has exactly one primary and one secondary interface, each with exactly one Bus attached. So it is not possible to have multiple PCI Buses directly below a PCI Bridge.
That means for my patch: IFF we only have one top-level PCI Bus (anything else is impossible with only one Host Bridge) my patch is completely correct. However, IFF there are multiple independent Host Bridges which do NOT share a PCI bus, we need to implement a function which iterates over these Host Bridges.
Regards, Carl-Daniel
Carl-Daniel Hailfinger wrote:
IFF we only have one top-level PCI Bus (anything else is impossible with only one Host Bridge) my patch is completely correct. However, IFF there are multiple independent Host Bridges which do NOT share a PCI bus, we need to implement a function which iterates over these Host Bridges.
Which happens on non-x86 systems and occasionally can happen on x86 PCIe.
On 30.08.2008 00:53, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
IFF we only have one top-level PCI Bus (anything else is impossible with only one Host Bridge) my patch is completely correct. However, IFF there are multiple independent Host Bridges which do NOT share a PCI bus, we need to implement a function which iterates over these Host Bridges.
Which happens on non-x86 systems and occasionally can happen on x86 PCIe.
How about this one? Tested on Qemu, fulfills your criteria. (The qemu target needs to be fixed, though).
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
Index: corebootv3-pci_scan_bus/device/pci_device.c
===================================================================
--- corebootv3-pci_scan_bus/device/pci_device.c	(Revision 846)
+++ corebootv3-pci_scan_bus/device/pci_device.c	(Arbeitskopie)
@@ -1196,8 +1196,24 @@
 unsigned int pci_domain_scan_bus(struct device *dev, unsigned int max)
 {
 	printk(BIOS_SPEW, "pci_domain_scan_bus: calling pci_scan_bus\n");
-	/* There is only one link on this device, and it is always link 0. */
-	return pci_scan_bus(&dev->link[0], PCI_DEVFN(0, 0), 0xff, max);
+	/* There is only one link on a bus, and it is always link 0.
+	 * dev->link[0] for a PCI domain is the domain link.
+	 * The child of the domain link is the PCI bus device.
+	 * We want to scan the bus link of the PCI bus device.
+	 * dev->link[0].children->link[0] is that PCI bus link.
+	 * If there can be multiple PCI buses directly below a PCI domain,
+	 * we have to iterate over the PCI buses in a loop.
+	 */
+#if 0
+	return pci_scan_bus(&dev->link[0].children->link[0],
+			    PCI_DEVFN(0, 0), 0xff, max);
+#else
+	struct device *list;
+	for (list = dev; list; list = list->sibling)
+		max = pci_scan_bus(&list->link[0].children->link[0],
+				   PCI_DEVFN(0, 0), 0xff, max);
+	return max;
+#endif
 }
 
 /**
On 30.08.2008 04:15, Carl-Daniel Hailfinger wrote:
On 30.08.2008 00:53, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
IFF we only have one top-level PCI Bus (anything else is impossible with only one Host Bridge) my patch is completely correct. However, IFF there are multiple independent Host Bridges which do NOT share a PCI bus, we need to implement a function which iterates over these Host Bridges.
Which happens on non-x86 systems and occasionally can happen on x86 PCIe.
I'd like an lspci -tn for any such system. My x86 PCIe system has 6 buses, but they all descend from one top-level bus.
How about this one? Tested on Qemu, fulfills your criteria. (The qemu target needs to be fixed, though).
Turns out that the code worked because pci_scan_bus couldn't care less about what the device tree says about the bus number. It doesn't look at bus->dev->path.pci_bus.bus, which is set in the dts, but uses bus->secondary, which is not used by the dts at all. Our PCI device core needs to be changed to accommodate multiple top-level PCI buses. However, unless such a system turns up in practice, I'm inclined to keep the code very simple and add a warning (compile-time or run-time); a rough sketch of the run-time variant follows.
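For reference, a rough and untested sketch of what such a run-time warning could look like inside pci_scan_bus (the message wording is mine, not existing code):

	/* Warn if the bus number from the dts disagrees with the bus
	 * number actually used for scanning. */
	if (bus->dev->path.type == DEVICE_PATH_PCI_BUS &&
	    bus->dev->path.pci_bus.bus != (bus->secondary & 0xff))
		printk(BIOS_WARNING, "PCI: dts says bus %02x, but we are "
		       "scanning bus %02x\n", bus->dev->path.pci_bus.bus,
		       bus->secondary & 0xff);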
Regards, Carl-Daniel
On Thu, Aug 28, 2008 at 7:22 PM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
pci_scan_bus(bus, ...) is not called with the expected bus type. Instead, it is called with a domain type from pci_domain_scan_bus(). pci_scan_bus(bus, ...) looks at bus->children and expects it to be a PCI device. Of course the children of the domain are the buses and bus->children is a PCI bus. That causes all lookups in the device tree to fail horribly.
I am going to guess as to why this might have happened. In v2 we had this:
static void root_complex_enable_dev(struct device *dev)
{
	/* Set the operations if it is a special bus type */
	if (dev->path.type == DEVICE_PATH_PCI_DOMAIN) {
		dev->ops = &pci_domain_ops;
	} else if (dev->path.type == DEVICE_PATH_APIC_CLUSTER) {
		dev->ops = &cpu_bus_ops;
	}
}
So we would swap the ops around as we walked the tree. This was very confusing to many people. I removed this confusion for v3, but maybe I missed something in the process. I'm not sure.
- For dynamic PCI devices on qemu, have_resources is now always 0. The
reason can be seen in this error message: "read_resources: bus_0(PCI_BUS: 0000) missing phase4_read_resources" Basically, we have phase4_read_resources defined for the PCI domain, but not for PCI bus 0. We can either inherit phase4_read_resources from the domain or we define phase4_read_resources for each PCI bus. I prefer the latter.
probably the latter, but we should back off a bit until we define the rules.
ron