Hi all,
On 16.05.20 18:07, Nico Huber wrote:
> On 16.05.20 15:39, Peter Stuge wrote:
>> Even worse, I also could not find any explanation of how the new
>> algorithm is different from the old, neither in commit messages nor
>> in comments.
>
> Such an explanation would have been nice, I agree. But I also see a
> problem here: How to describe an algorithm that just doesn't make any
> sense? And if we'd make a mistake documenting the existing code,
> wouldn't that make it worse?
I'm done reading through both versions. It looks like the old code wasn't that bad and the two aren't all that different. However, the major difference is exactly what held the old code back: the way we decide where in the resource space we place everything.
Both versions run two passes:

 1. Propagate downstream requirements up through the bridges.
 2. From the top (domain) down to the devices, calculate the final
    placement of each level's resources inside the window of the level
    above.
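To make that concrete, here is a minimal sketch of the two passes. All names and structures are made up for illustration (not coreboot's actual device/resource API), alignment handling is simplified, and I/O vs. memory is ignored:

#include <stddef.h>
#include <stdint.h>

#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((uintptr_t)(a) - 1))

struct res_req {
	size_t size;	/* bytes needed at/below this node */
	size_t align;	/* required alignment (power of two) */
};

struct device {
	struct device *children, *sibling;
	int is_bridge;
	struct res_req req;
	uintptr_t assigned_base;
};

/* Pass 1: propagate downstream requirements up through the bridges. */
static size_t sum_requirements(struct device *bridge)
{
	size_t total = 0;
	for (struct device *c = bridge->children; c; c = c->sibling)
		total += c->is_bridge ? sum_requirements(c) : c->req.size;
	bridge->req.size = total;
	return total;
}

/* Pass 2: hand out concrete bases, level by level, inside the window
 * the parent was given. */
static void assign_resources(struct device *bridge, uintptr_t base)
{
	uintptr_t cursor = base;
	for (struct device *c = bridge->children; c; c = c->sibling) {
		cursor = ALIGN_UP(cursor, c->req.align ? c->req.align : 1);
		c->assigned_base = cursor;
		if (c->is_bridge)
			assign_resources(c, cursor);
		cursor += c->req.size;
	}
}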
In the first pass, the new code doesn't account for legacy 10-bit PCI I/O, and it stops propagating *before* the domain.
In the second pass, the new code doesn't use a single continuous window to place resources in, but a memrange list of all available space. For every resource, ordered from largest to smallest alignment, it simply picks the first fitting space in the memrange list. This changes things at the domain level, but not further down at the bridge level, because there the memrange will be continuous anyway. The new code also marks resources that it couldn't allocate as `assigned`, which looks like a bug / needs investigation.
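For illustration, this is roughly how I read that first-fit step; the types and names are made up, not the actual memrange API:

#include <stddef.h>
#include <stdint.h>

#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((uintptr_t)(a) - 1))

struct range {
	uintptr_t begin, end;	/* [begin, end) of free space */
	struct range *next;
};

/* Resources are handled from largest to smallest alignment; each one
 * simply takes the first range it fits into. Returns the chosen base,
 * or 0 if nothing fits. */
static uintptr_t first_fit(struct range *free_list, size_t size, size_t align)
{
	for (struct range *r = free_list; r; r = r->next) {
		const uintptr_t base = ALIGN_UP(r->begin, align ? align : 1);
		if (base + size <= r->end) {
			r->begin = base + size;	/* shrink; proper splitting omitted */
			return base;
		}
	}
	return 0;	/* didn't fit -- yet the new code still marks such a
			 * resource as `assigned`, the suspected bug above */
}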
But what did the old code do? As said above, it used a single continuous window to place resources in. As a stale comment above dev_configure() states, discovering this window differs for I/O and memory resources:
* I/O resources grow upward. MEM resources grow downward.
Even though this comment was left in place, I'm certain that the new code doesn't do it like that anymore. The old code only stopped the first pass at the domain resource. So we knew exactly how much space was needed at this level and used that information to limit the available window *at the bottom*. For memory resources, this window was moved up as far as possible, *unless* the calculated size requirement overflowed the window; then it was moved down (this is where the bugs start and why people kept telling me that the allocator would overlap things with DRAM, even though it can print a big "!! Resource didn't fit!!" error).
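In other words, roughly this (hypothetical names, not the actual old code):

#include <stdint.h>
#include <stdio.h>

/* The old scheme for memory resources, as I understand it: pin the
 * window to the top of the allowed range, i.e. move it up as far as
 * possible; MEM then grows downward inside it. If the pass-1
 * requirement is larger than the range, the base drops below
 * limit_low -- into DRAM -- and the old code only warned about it. */
static uintptr_t place_mem_window(uintptr_t limit_low, uintptr_t limit_high,
				  uintptr_t required)
{
	const uintptr_t base = limit_high - required;

	if (required > limit_high - limit_low)
		printf("!! Resource didn't fit!!\n");	/* ...but carries on */

	return base;
}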
This allowed us to have two schemes to decide where the memory I/O hole below 4GiB starts. I only learned today, when I looked at comments and code around find_pci_tolm() (including long-removed chipset code), that the second scheme was intentionally supported:
1. Define the start of the memory I/O hole early. This is what all the Intel code does: we just say (sometimes configured in the devicetree) how much I/O space we want. Then we can reserve the DRAM space and the allocator should avoid it (which the old one failed to do, it seems).
2. Don't reserve DRAM space; run the allocator, then adapt the DRAM mapping to the result. Still used by many AGESA-based ports, it seems. Only supported by the old code.
It's probably easy to maintain compatibility with scheme 2 by searching the memranges backwards. However, I doubt that these ports actually still support a moving start of the memory I/O hole, given that they have CBMEM at restore_top_of_low_cacheable(). I will have a look at the latest patches for the AMD ports; maybe we don't have to do anything here.
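For what it's worth, searching backwards could look about like this (reusing the hypothetical struct range from the first-fit sketch above): with a singly linked list we still scan forward, but keep the highest fitting base, so allocations end up as high as possible and the hole start falls out of the allocator as scheme 2 expects:

/* Returns the highest fitting base, or 0 if nothing fits. */
static uintptr_t last_fit(struct range *free_list, size_t size, size_t align)
{
	uintptr_t best = 0;
	const uintptr_t a = align ? align : 1;

	for (struct range *r = free_list; r; r = r->next) {
		if (r->end - r->begin < size)
			continue;
		const uintptr_t base = (r->end - size) & ~(a - 1);
		if (base >= r->begin && base > best)
			best = base;
	}
	return best;
}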
Nico