[coreboot] Re: Resource allocator: Getting v4 back in

19 May 2020

      On Sun, May 17, 2020 at 2:01 PM Nico Huber nico.h@gmx.de wrote:
...
Hello together,
Furquan has landed his revert series of the new resource allocator. A
little sad, but this allows us to take a deep breath and fix things
without further breakage.
There is already one change series on Gerrit to bring it back [1]. This
one goes the most conservative way: It moves the old allocator code
aside, so we can have two versions in the tree and decide per chipset/
board which one to use.
Actually, I have kept the old resource allocator enabled only for the
AMD chipsets that are known to have issues and need to be fixed.
...
I told people that I had something similar in mind. But actually, I
don't like it very much. I fear, if we move things aside, they can be
left behind and we might soon have to discuss platform deprecation
again.
Having both old and new allocators gives us the option to move the
chipsets/boards to the older one in case issues are identified even
later on and not in the first pass of testing. So, I like the idea of
having both allocators in the tree at least for a while. As mentioned
above, the old allocator is currently limited to only the AMD chipsets
that are known to have breakages and require more than a quick fix:
https://review.coreboot.org/c/coreboot/+/41444. The other chipsets
that were identified as problematic last week have all been fixed
since: https://review.coreboot.org/c/coreboot/+/41374.
Another reason why I like having both in the tree is it serves as a
quick A/B test without anyone having to revert series of patches just
to test the old allocator.
...
It would also be harder to make changes in generic code like
device/pci_device when one has to be sure that both allocators work
fine.
For the most part, the rest of the code should not really be aware of
the internals of any of the allocators. (In practice it is not always
true). I understand that we had some assumptions in find_pci_tolm()
and how invalid resources are handled. With the following changes:
https://review.coreboot.org/c/coreboot/+/41524, those assumptions
should no longer be required. We just need a set of rules saying what
goes into the allocator and what is the state coming out.
...
So I want to propose alternatives. The obvious one, but also slowest,
is to move all chipset code forward and to test and patch all the code
before we merge the new allocator again. I might underestimate the pro-
blems, but I still think we could get this going within a week.
My understanding is that the AMD chipsets above are going to require
more work and hence more time. Other than that, I agree, I haven't
really seen any complaints about other chipsets (the ones that were
identified based on feedback and code inspection are already fixed
now) and should be possible to move those to the new allocator.
...
Another option: we could identify the differences in the new allocator
that had a part in the breakage and only provide a backward compatible
implementation for those. Actually, I think it's only one: the resource
constraining strategy for the domain. To make this compatible, we could
a) drop all but the biggest memrange below 4G for the domain, and
b) assign/steal resources from the end of the memrange at the domain
   level.
It would closely mimic what the old code tried to achieve, so seems
safe, somehow.
I don't think it is right to assume that the biggest window should
always be chosen. Yes, we can start by allocating resources top-down
by always selecting the last window and using its top. I already have
patches uploaded for this (which need a small change to select the top
of last memory): https://review.coreboot.org/c/coreboot/+/41418. But,
I don't really want to take that change in because we will continue
masking the issues for chipsets/mainboards which were not really doing
the right thing. Instead, it would be better to identify such chipsets
and fix them.
...
I can write patches for the latter, if we want to try this way.
Nico
[1] https://review.coreboot.org/c/coreboot/+/41442/
PS. Had some random thoughts I don't want to forget but didn't know
    where to put them down:
    * Debug messages. There is a lot of BIOS_SPEW (was already like this
      in the old code). Things that are only printed once per pass per
      bridge shouldn't be considered SPEW, IMO.
I already have couple of changes to improve the debug logging:
https://review.coreboot.org/c/coreboot/+/41477
...
* The above-4G memrange should be aligned to a power of 2 for better
  MTRR usage (if that isn't the case already?).

All mem resource allocation already happens at 4KiB alignment.

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

[coreboot] Re: Resource allocator: Getting v4 back in