On 24.10.2008 19:50, Tom Sylla wrote:
The HT topology is not really directly reflected in PCI config space. They are obviously linked, but there is not really a way to "map" a HT topology of Opteron nodes to a graphical view of config space; such a mapping just doesn't exist. The Opterons just are where they are.
Thanks for your detailed explanation. That clears up quite a bit of confusion. I had hoped that we could easily represent the 8-processor ladder vs. crossbar cHT topologies in a device tree/graph together with the ncHT links, but I'll leave the cHT stuff out for now.
All of the processor PCI devices are on Bus 0, and will always be on Bus 0. Devices 18-1f are the only ones that will ever exist for Opterons. The device number that each Opteron responds to is based on NodeID (0-7), which is set on each Opteron during discovery. There won't ever be "holes"; the NodeIDs must always be contiguous. NodeIDs are not set in stone based on topology: node 0 is always the BSP, but 1-7 can basically be distributed out any of the 3 links to any other CPUs in any manner.
Ah.
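To check my understanding of the numbering rule, here is a minimal Python sketch. The constant and function names are my own, purely for illustration; the only facts taken from the explanation above are bus 0, devices 18-1f, NodeIDs 0-7 and the contiguity requirement. The four functions per node are what the lspci listing further down shows for K8.

```python
OPTERON_BUS = 0           # processor devices always sit on bus 0
FIRST_NODE_DEVICE = 0x18  # node 0 (the BSP) answers as device 0x18
MAX_NODES = 8             # NodeIDs 0-7 -> devices 0x18-0x1f


def node_device(node_id):
    """Return the PCI device number the Opteron with this NodeID responds to."""
    if not 0 <= node_id < MAX_NODES:
        raise ValueError(f"NodeID out of range: {node_id}")
    return FIRST_NODE_DEVICE + node_id


def node_functions(node_id, functions=range(4)):
    """List bus:device.function strings for one node (functions 0-3 on K8)."""
    dev = node_device(node_id)
    return [f"{OPTERON_BUS:02x}:{dev:02x}.{fn}" for fn in functions]


def count_nodes(devices_on_bus0):
    """Count nodes from the device numbers seen on bus 0.

    Raises ValueError if the NodeIDs are not contiguous starting at 0,
    since "holes" are not supposed to exist.
    """
    node_ids = sorted(
        dev - FIRST_NODE_DEVICE
        for dev in devices_on_bus0
        if FIRST_NODE_DEVICE <= dev < FIRST_NODE_DEVICE + MAX_NODES
    )
    if node_ids != list(range(len(node_ids))):
        raise ValueError("NodeIDs are not contiguous from 0")
    return len(node_ids)
```

For the 5-node system below, count_nodes() applied to the device numbers on bus 0 gives 5, and node_functions(4) gives 00:1c.0 through 00:1c.3.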
AGESA has a default "discovery method" (I think breadth first, lowest link number first) but it has options to override the discovery mechanism to change the order of nodes in a system. All that matters is that the routing tables are correct and consistent for the traffic to get where it needs to and without deadlock.
Getting the routing tables right is non-trivial for MP setups, especially if we don't know how the hardware is wired. My hope was to be able to express the cHT topologies in a way which allows us to derive correct routing tables. I'm postponing that goal for now.
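To make that goal concrete, here is a rough Python sketch of what I had in mind. It is emphatically not AGESA's algorithm: it assumes the breadth-first, lowest-link-first order mentioned above, picks one shortest path per destination, and ignores deadlock avoidance and everything else a real routing table has to get right. The `links` dictionary format, the socket names and the SQUARE example are all made up.

```python
from collections import deque

# Hypothetical 4-socket square: A-B, A-C, B-D, C-D.
# links[socket] maps a link number on that socket to the neighbouring socket.
SQUARE = {
    "A": {0: "B", 1: "C"},
    "B": {0: "A", 1: "D"},
    "C": {0: "A", 1: "D"},
    "D": {0: "B", 1: "C"},
}


def discover(links, bsp):
    """Breadth-first walk from the BSP, lowest link number first.

    Returns the sockets in discovery order; the list index is the NodeID.
    """
    order = [bsp]
    seen = {bsp}
    queue = deque([bsp])
    while queue:
        socket = queue.popleft()
        for link in sorted(links[socket]):
            neighbour = links[socket][link]
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


def routing_table(links, source):
    """Map each reachable socket to the link on `source` that starts a shortest path."""
    first_link = {}
    seen = {source}
    queue = deque()
    # Direct neighbours are reached over their own link.
    for link in sorted(links[source]):
        neighbour = links[source][link]
        if neighbour not in seen:
            seen.add(neighbour)
            first_link[neighbour] = link
            queue.append(neighbour)
    # Everything further away inherits the first link of the socket it was found from.
    while queue:
        socket = queue.popleft()
        for link in sorted(links[socket]):
            neighbour = links[socket][link]
            if neighbour not in seen:
                seen.add(neighbour)
                first_link[neighbour] = first_link[socket]
                queue.append(neighbour)
    return first_link
```

With SQUARE, discover(SQUARE, "A") yields A, B, C, D as NodeIDs 0-3, and routing_table(SQUARE, "A") sends traffic for D out link 0 (via B). Whether tables derived this way are deadlock-free for ladder or crossbar wiring is exactly the part I don't know how to express yet.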
Once that is complete, the processors just show up in PCI as devices 18-1f (or fewer)
They show up on bus 0 as you wrote. Will/can any devices attached via ncHT also show up on bus 0? If we have multiple ncHT links, what decides the bus numbers for each of them?
Here is an lspci from one of our systems with 5 nodes (AMI BIOS with AGESA):
00:01.0 PCI bridge: Broadcom BCM5785 [HT1000] PCI/PCI-X Bridge
00:02.0 Host bridge: Broadcom BCM5785 [HT1000] Legacy South Bridge
00:02.1 IDE interface: Broadcom BCM5785 [HT1000] IDE
00:02.2 ISA bridge: Broadcom BCM5785 [HT1000] LPC
00:03.0 USB Controller: Broadcom BCM5785 [HT1000] USB (rev 01)
00:03.1 USB Controller: Broadcom BCM5785 [HT1000] USB (rev 01)
00:03.2 USB Controller: Broadcom BCM5785 [HT1000] USB (rev 01)
00:04.0 VGA compatible controller: XGI Technology Inc. (eXtreme Graphics Innovation) Volari Z7
00:06.0 PCI bridge: Broadcom HT2100 PCI-Express Bridge (rev a2)
00:07.0 PCI bridge: Broadcom HT2100 PCI-Express Bridge (rev a2)
00:08.0 PCI bridge: Broadcom HT2100 PCI-Express Bridge (rev a2)
00:09.0 PCI bridge: Broadcom HT2100 PCI-Express Bridge (rev a2)
00:0a.0 PCI bridge: Broadcom HT2100 PCI-Express Bridge (rev a2)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:1a.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:1a.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:1a.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:1a.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:1b.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:1b.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:1b.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:1b.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
00:1c.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:1c.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:1c.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:1c.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:0d.0 PCI bridge: Broadcom BCM5785 [HT1000] PCI/PCI-X Bridge (rev c0)
01:0e.0 IDE interface: Broadcom BCM5785 [HT1000] SATA (PATA/IDE Mode)
01:0e.1 IDE interface: Broadcom BCM5785 [HT1000] SATA (PATA/IDE Mode)
03:00.0 PCI bridge: PLX Technology, Inc. Unknown device 8518 (rev aa)
04:01.0 PCI bridge: PLX Technology, Inc. Unknown device 8518 (rev aa)
04:02.0 PCI bridge: PLX Technology, Inc. Unknown device 8518 (rev aa)
04:03.0 PCI bridge: PLX Technology, Inc. Unknown device 8518 (rev aa)
04:04.0 PCI bridge: PLX Technology, Inc. Unknown device 8518 (rev aa)
0c:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
0c:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
If we add or remove processors, nothing besides the 18-1f devices will change (SB bus numbers, device numbers, etc. do not change). When we add another *non*-coherent HT device attached to one of the Opterons, it gets a new bus number (we start at 20 with ours, but it is arbitrary). All of the routing associated with HT for both coherent and non-coherent is contained in the mapping registers and routing table registers in all of the Opterons. The mapping registers map mem/io/cfg regions to nodes, and the routing table says how to get to that node. The ncHT devices can have BARs, and take up memory mapped IO just the same as any other PCI device.
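The two-step lookup described above (mapping registers pick the destination node, routing tables say how to reach it) can be sketched roughly as follows. This is a toy Python model, not the real register layout: the ConfigMap fields, the `routes` format and every bus, node and link number in the example are assumptions made up for illustration.

```python
from typing import NamedTuple


class ConfigMap(NamedTuple):
    """One hypothetical config-space mapping entry."""
    bus_base: int   # first bus number covered
    bus_limit: int  # last bus number covered (inclusive)
    dst_node: int   # node the non-coherent chain hangs off
    dst_link: int   # link on that node leading to the chain


def lookup(maps, bus):
    """Return the mapping entry covering `bus`, or None if no entry matches."""
    for entry in maps:
        if entry.bus_base <= bus <= entry.bus_limit:
            return entry
    return None


def route_config_access(maps, routes, src_node, bus):
    """Return the (node, outgoing_link) hops for a config access to `bus`.

    routes[node][dest] is a (link, next_node) pair, i.e. a routing table
    that already knows the wiring.
    """
    target = lookup(maps, bus)
    if target is None:
        raise LookupError(f"no mapping covers bus {bus:#04x}")
    hops = []
    node = src_node
    while node != target.dst_node:
        if len(hops) >= len(routes):
            raise RuntimeError("routing tables are inconsistent (loop)")
        link, next_node = routes[node][target.dst_node]
        hops.append((node, link))
        node = next_node
    # Final hop: leave the coherent fabric over the non-coherent link.
    hops.append((node, target.dst_link))
    return hops


# Made-up two-node example: southbridge chain on node 0, extra ncHT device on node 1.
EXAMPLE_MAPS = [
    ConfigMap(bus_base=0x00, bus_limit=0x1F, dst_node=0, dst_link=2),
    ConfigMap(bus_base=0x20, bus_limit=0x2F, dst_node=1, dst_link=1),
]
EXAMPLE_ROUTES = {
    0: {1: (0, 1)},  # node 0 reaches node 1 over its link 0
    1: {0: (0, 0)},  # node 1 reaches node 0 over its link 0
}
```

In this model a config access from node 0 to bus 0x20 takes link 0 to node 1 and then node 1's link 1 to the device, while adding or removing entries in EXAMPLE_MAPS never touches where the processors themselves appear.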
If I understand you correctly, it would be easy to have 00:01.0-00:0a.0 appear as 01:01.0-01:0a.0 (bus 1) while still keeping the 18-1f devices on the hardcoded bus 0.
On Thu, Oct 23, 2008 at 10:40 PM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
I'm trying to understand how HT is modeled into PCI space so that I can propose the "right" way to handle it in the dts. Depending on whether I run lspci -t under coreboot or factory BIOS, different topologies will be displayed. That means looking at lspci is not going to tell me how the hardware really works.
I am a little bit confused by this. What are the exact differences you see between coreboot and factory? The number of Opterons should be the same. The position in config space of a particular socket may change, based on node discovery differences between the BIOSes. There is no reason for other devices to move because of the HT changes, but they may move just because of other differences in coreboot.
IIRC I saw a board which only had 18-1f on bus 0 and everything else on other buses. AFAICS having devices on the same bus as the processor devices or not is a topology difference.
Would you mind posting lspci -tvnn for that 5-processor board as well? It would help me a lot to understand this issue better.
Regards, Carl-Daniel