On Sun, May 17, 2020 at 2:01 PM Nico Huber nico.h@gmx.de wrote:
Hello together,
Furquan has landed his revert series of the new resource allocator. A little sad, but this allows us to take a deep breath and fix things without further breakage.
There is already one change series on Gerrit to bring it back [1]. This one goes the most conservative way: It moves the old allocator code aside, so we can have two versions in the tree and decide per chipset/ board which one to use.
Actually, I have kept the old resource allocator enabled only for the AMD chipsets that are known to have issues and need to be fixed.
I told people that I had something similar in mind. But actually, I don't like it very much. I fear, if we move things aside, they can be left behind and we might soon have to discuss platform deprecation again.
Having both old and new allocators gives us the option to move the chipsets/boards to the older one in case issues are identified even later on and not in the first pass of testing. So, I like the idea of having both allocators in the tree at least for a while. As mentioned above, the old allocator is currently limited to only the AMD chipsets that are known to have breakages and require more than a quick fix: https://review.coreboot.org/c/coreboot/+/41444. The other chipsets that were identified as problematic last week have all been fixed since: https://review.coreboot.org/c/coreboot/+/41374.
Another reason why I like having both in the tree is it serves as a quick A/B test without anyone having to revert series of patches just to test the old allocator.
It would also be harder to make changes in generic code like device/pci_device when one has to be sure that both allocators work fine.
For the most part, the rest of the code should not really be aware of the internals of any of the allocators. (In practice it is not always true). I understand that we had some assumptions in find_pci_tolm() and how invalid resources are handled. With the following changes: https://review.coreboot.org/c/coreboot/+/41524, those assumptions should no longer be required. We just need a set of rules saying what goes into the allocator and what is the state coming out.
So I want to propose alternatives. The obvious one, but also slowest, is to move all chipset code forward and to test and patch all the code before we merge the new allocator again. I might underestimate the pro- blems, but I still think we could get this going within a week.
My understanding is that the AMD chipsets above are going to require more work and hence more time. Other than that, I agree, I haven't really seen any complaints about other chipsets (the ones that were identified based on feedback and code inspection are already fixed now) and should be possible to move those to the new allocator.
Another option: we could identify the differences in the new allocator that had a part in the breakage and only provide a backward compatible implementation for those. Actually, I think it's only one: the resource constraining strategy for the domain. To make this compatible, we could a) drop all but the biggest memrange below 4G for the domain, and b) assign/steal resources from the end of the memrange at the domain level. It would closely mimic what the old code tried to achieve, so seems safe, somehow.
I don't think it is right to assume that the biggest window should always be chosen. Yes, we can start by allocating resources top-down by always selecting the last window and using its top. I already have patches uploaded for this (which need a small change to select the top of last memory): https://review.coreboot.org/c/coreboot/+/41418. But, I don't really want to take that change in because we will continue masking the issues for chipsets/mainboards which were not really doing the right thing. Instead, it would be better to identify such chipsets and fix them.
I can write patches for the latter, if we want to try this way.
Nico
[1] https://review.coreboot.org/c/coreboot/+/41442/
PS. Had some random thoughts I don't want to forget but didn't know where to put them down: * Debug messages. There is a lot of BIOS_SPEW (was already like this in the old code). Things that are only printed once per pass per bridge shouldn't be considered SPEW, IMO.
I already have couple of changes to improve the debug logging: https://review.coreboot.org/c/coreboot/+/41477
* The above-4G memrange should be aligned to a power of 2 for better MTRR usage (if that isn't the case already?).
All mem resource allocation already happens at 4KiB alignment.