Hello together,
Furquan has landed his revert series of the new resource allocator. A little sad, but this allows us to take a deep breath and fix things without further breakage.
There is already one change series on Gerrit to bring it back [1]. This one goes the most conservative way: It moves the old allocator code aside, so we can have two versions in the tree and decide per chipset/ board which one to use.
I told people that I had something similar in mind. But actually, I don't like it very much. I fear, if we move things aside, they can be left behind and we might soon have to discuss platform deprecation again. It would also be harder to make changes in generic code like device/pci_device when one has to be sure that both allocators work fine.
So I want to propose alternatives. The obvious one, but also slowest, is to move all chipset code forward and to test and patch all the code before we merge the new allocator again. I might underestimate the pro- blems, but I still think we could get this going within a week.
Another option: we could identify the differences in the new allocator that had a part in the breakage and only provide a backward compatible implementation for those. Actually, I think it's only one: the resource constraining strategy for the domain. To make this compatible, we could a) drop all but the biggest memrange below 4G for the domain, and b) assign/steal resources from the end of the memrange at the domain level. It would closely mimic what the old code tried to achieve, so seems safe, somehow.
I can write patches for the latter, if we want to try this way.
Nico
[1] https://review.coreboot.org/c/coreboot/+/41442/
PS. Had some random thoughts I don't want to forget but didn't know where to put them down: * Debug messages. There is a lot of BIOS_SPEW (was already like this in the old code). Things that are only printed once per pass per bridge shouldn't be considered SPEW, IMO. * The above-4G memrange should be aligned to a power of 2 for better MTRR usage (if that isn't the case already?).
On Sun, May 17, 2020 at 2:01 PM Nico Huber nico.h@gmx.de wrote:
Hello together,
Furquan has landed his revert series of the new resource allocator. A little sad, but this allows us to take a deep breath and fix things without further breakage.
There is already one change series on Gerrit to bring it back [1]. This one goes the most conservative way: It moves the old allocator code aside, so we can have two versions in the tree and decide per chipset/ board which one to use.
Actually, I have kept the old resource allocator enabled only for the AMD chipsets that are known to have issues and need to be fixed.
I told people that I had something similar in mind. But actually, I don't like it very much. I fear, if we move things aside, they can be left behind and we might soon have to discuss platform deprecation again.
Having both old and new allocators gives us the option to move the chipsets/boards to the older one in case issues are identified even later on and not in the first pass of testing. So, I like the idea of having both allocators in the tree at least for a while. As mentioned above, the old allocator is currently limited to only the AMD chipsets that are known to have breakages and require more than a quick fix: https://review.coreboot.org/c/coreboot/+/41444. The other chipsets that were identified as problematic last week have all been fixed since: https://review.coreboot.org/c/coreboot/+/41374.
Another reason why I like having both in the tree is it serves as a quick A/B test without anyone having to revert series of patches just to test the old allocator.
It would also be harder to make changes in generic code like device/pci_device when one has to be sure that both allocators work fine.
For the most part, the rest of the code should not really be aware of the internals of any of the allocators. (In practice it is not always true). I understand that we had some assumptions in find_pci_tolm() and how invalid resources are handled. With the following changes: https://review.coreboot.org/c/coreboot/+/41524, those assumptions should no longer be required. We just need a set of rules saying what goes into the allocator and what is the state coming out.
So I want to propose alternatives. The obvious one, but also slowest, is to move all chipset code forward and to test and patch all the code before we merge the new allocator again. I might underestimate the pro- blems, but I still think we could get this going within a week.
My understanding is that the AMD chipsets above are going to require more work and hence more time. Other than that, I agree, I haven't really seen any complaints about other chipsets (the ones that were identified based on feedback and code inspection are already fixed now) and should be possible to move those to the new allocator.
Another option: we could identify the differences in the new allocator that had a part in the breakage and only provide a backward compatible implementation for those. Actually, I think it's only one: the resource constraining strategy for the domain. To make this compatible, we could a) drop all but the biggest memrange below 4G for the domain, and b) assign/steal resources from the end of the memrange at the domain level. It would closely mimic what the old code tried to achieve, so seems safe, somehow.
I don't think it is right to assume that the biggest window should always be chosen. Yes, we can start by allocating resources top-down by always selecting the last window and using its top. I already have patches uploaded for this (which need a small change to select the top of last memory): https://review.coreboot.org/c/coreboot/+/41418. But, I don't really want to take that change in because we will continue masking the issues for chipsets/mainboards which were not really doing the right thing. Instead, it would be better to identify such chipsets and fix them.
I can write patches for the latter, if we want to try this way.
Nico
[1] https://review.coreboot.org/c/coreboot/+/41442/
PS. Had some random thoughts I don't want to forget but didn't know where to put them down: * Debug messages. There is a lot of BIOS_SPEW (was already like this in the old code). Things that are only printed once per pass per bridge shouldn't be considered SPEW, IMO.
I already have couple of changes to improve the debug logging: https://review.coreboot.org/c/coreboot/+/41477
* The above-4G memrange should be aligned to a power of 2 for better MTRR usage (if that isn't the case already?).
All mem resource allocation already happens at 4KiB alignment.
On 19.05.20 02:07, Furquan Shaikh wrote:
On Sun, May 17, 2020 at 2:01 PM Nico Huber nico.h@gmx.de wrote:
I told people that I had something similar in mind. But actually, I don't like it very much. I fear, if we move things aside, they can be left behind and we might soon have to discuss platform deprecation again.
Having both old and new allocators gives us the option to move the chipsets/boards to the older one in case issues are identified even later on and not in the first pass of testing. So, I like the idea of having both allocators in the tree at least for a while.
ok, let's do this.
As mentioned above, the old allocator is currently limited to only the AMD chipsets that are known to have breakages and require more than a quick fix: https://review.coreboot.org/c/coreboot/+/41444. The other chipsets that were identified as problematic last week have all been fixed since: https://review.coreboot.org/c/coreboot/+/41374.
Another reason why I like having both in the tree is it serves as a quick A/B test without anyone having to revert series of patches just to test the old allocator.
That's a great idea. I will likely pick up the opportunity to add more, optional things to test.
It would also be harder to make changes in generic code like device/pci_device when one has to be sure that both allocators work fine.
For the most part, the rest of the code should not really be aware of the internals of any of the allocators. (In practice it is not always true). I understand that we had some assumptions in find_pci_tolm() and how invalid resources are handled. With the following changes: https://review.coreboot.org/c/coreboot/+/41524, those assumptions should no longer be required. We just need a set of rules saying what goes into the allocator and what is the state coming out.
Let's try it. We can put this in Documentation/ once we agree on the rules.
INPUT -----
The allocator processes, per domain sub-tree, all resources with IORESOURCE_IO or IORESOURCE_MEM set. From here on, these are simply referred to as *resources*.
Resources with the IORESOURCE_FIXED bit set are referred to as *fixed resources*, resources with the IORESOURCE_BRIDGE bit set as *bridge resources*, and resources with the IORESOURCE_ASSIGNED bit set as *assigned resources*.
Resources with IORESOURCE_IO set must not have IORESOURCE_MEM set.
Resources that should not be changed by the allocator must be fixed resources.
Fixed resources must be assigned resources and have a valid `base` address pre-assigned.
A fixed resource must not overlap with any other fixed resource within the same address space (IO / MEM), unless either of them is a bridge resource.
A fixed bridge resource must not overlap with any other fixed resource within the same address space and on the same bus (i.e. the same device or its siblings), unless exactly one of them has IORESOURCE_SUBTRACTIVE set.
OUTPUT ------
All device attributes beside non-fixed resources are left unaltered.
The structure of resource lists (`next` pointers) is left unaltered.
The `index` and `gran` attributes of all resources are left unaltered.
The `size`, `limit` and `align` attributes of all non-bridge resources are left unaltered.
With exception of IORESOURCE_ASSIGNED, all `flags` of all resources are left unaltered.
All non-fixed assigned resources downstream of a bridge must be covered by an assigned bridge resource of that bridge.
For all assigned resources, the INPUT rules must apply, where all assigned resources would be treated as fixed resources.
Another option: we could identify the differences in the new allocator that had a part in the breakage and only provide a backward compatible implementation for those. Actually, I think it's only one: the resource constraining strategy for the domain. To make this compatible, we could a) drop all but the biggest memrange below 4G for the domain, and b) assign/steal resources from the end of the memrange at the domain level. It would closely mimic what the old code tried to achieve, so seems safe, somehow.
I don't think it is right to assume that the biggest window should always be chosen.
To be clear. I meant to add an optional mode with these features, selectable in Kconfig. The thought was, such a "compatibility mode" should choose the biggest window, because that's what the old allocator did.
Yes, we can start by allocating resources top-down by always selecting the last window and using its top. I already have patches uploaded for this (which need a small change to select the top of last memory): https://review.coreboot.org/c/coreboot/+/41418. But, I don't really want to take that change in because we will continue masking the issues for chipsets/mainboards which were not really doing the right thing. Instead, it would be better to identify such chipsets and fix them.
Right. My current idea is to add a bunch of optional allocator features. On one hand, to allow testing of the existing code. For instance, if a) and b) can be tested individually, that can give us a hint what might be wrong with the chipset code. And on the other hand, to test some new features. I'll have to see how invasive that will be and then either push patches for v4 or start a v4.5 just for experiments ;)
* The above-4G memrange should be aligned to a power of 2 for better MTRR usage (if that isn't the case already?).
All mem resource allocation already happens at 4KiB alignment.
Right, my bad, wrong words, bear with me. I meant, should be rounded up to a power of 2.
Nico
On Fri, May 22, 2020 at 10:01 AM Nico Huber nico.h@gmx.de wrote:
On 19.05.20 02:07, Furquan Shaikh wrote:
On Sun, May 17, 2020 at 2:01 PM Nico Huber nico.h@gmx.de wrote:
I told people that I had something similar in mind. But actually, I don't like it very much. I fear, if we move things aside, they can be left behind and we might soon have to discuss platform deprecation again.
Having both old and new allocators gives us the option to move the chipsets/boards to the older one in case issues are identified even later on and not in the first pass of testing. So, I like the idea of having both allocators in the tree at least for a while.
ok, let's do this.
As mentioned above, the old allocator is currently limited to only the AMD chipsets that are known to have breakages and require more than a quick fix: https://review.coreboot.org/c/coreboot/+/41444. The other chipsets that were identified as problematic last week have all been fixed since: https://review.coreboot.org/c/coreboot/+/41374.
Another reason why I like having both in the tree is it serves as a quick A/B test without anyone having to revert series of patches just to test the old allocator.
That's a great idea. I will likely pick up the opportunity to add more, optional things to test.
It would also be harder to make changes in generic code like device/pci_device when one has to be sure that both allocators work fine.
For the most part, the rest of the code should not really be aware of the internals of any of the allocators. (In practice it is not always true). I understand that we had some assumptions in find_pci_tolm() and how invalid resources are handled. With the following changes: https://review.coreboot.org/c/coreboot/+/41524, those assumptions should no longer be required. We just need a set of rules saying what goes into the allocator and what is the state coming out.
Let's try it. We can put this in Documentation/ once we agree on the rules.
INPUT
The allocator processes, per domain sub-tree, all resources with IORESOURCE_IO or IORESOURCE_MEM set. From here on, these are simply referred to as *resources*.
Resources with the IORESOURCE_FIXED bit set are referred to as *fixed resources*, resources with the IORESOURCE_BRIDGE bit set as *bridge resources*, and resources with the IORESOURCE_ASSIGNED bit set as *assigned resources*.
We should elaborate on the semantics of 'assigned' and its meaning.
Resources with IORESOURCE_IO set must not have IORESOURCE_MEM set.
Resources that should not be changed by the allocator must be fixed resources.
Fixed resources must be assigned resources and have a valid `base` address pre-assigned.
A fixed resource must not overlap with any other fixed resource within the same address space (IO / MEM), unless either of them is a bridge resource.
A fixed bridge resource must not overlap with any other fixed resource within the same address space and on the same bus (i.e. the same device or its siblings), unless exactly one of them has IORESOURCE_SUBTRACTIVE set.
What are you assuming w.r.t. decode priority with SUBTRACTIVE being set?
OUTPUT
All device attributes beside non-fixed resources are left unaltered.
What did you mean by device attributes?
-Aaron
I have pushed a CL to document the discussion here and added more details about the resource allocator as well: https://review.coreboot.org/c/coreboot/+/43603. I have moved questions from Aaron to the CL to continue discussion there. Thanks!
On Wed, May 27, 2020 at 8:58 AM Aaron Durbin adurbin@google.com wrote:
On Fri, May 22, 2020 at 10:01 AM Nico Huber nico.h@gmx.de wrote:
On 19.05.20 02:07, Furquan Shaikh wrote:
On Sun, May 17, 2020 at 2:01 PM Nico Huber nico.h@gmx.de wrote:
I told people that I had something similar in mind. But actually, I don't like it very much. I fear, if we move things aside, they can be left behind and we might soon have to discuss platform deprecation again.
Having both old and new allocators gives us the option to move the chipsets/boards to the older one in case issues are identified even later on and not in the first pass of testing. So, I like the idea of having both allocators in the tree at least for a while.
ok, let's do this.
As mentioned above, the old allocator is currently limited to only the AMD chipsets that are known to have breakages and require more than a quick fix: https://review.coreboot.org/c/coreboot/+/41444. The other chipsets that were identified as problematic last week have all been fixed since: https://review.coreboot.org/c/coreboot/+/41374.
Another reason why I like having both in the tree is it serves as a quick A/B test without anyone having to revert series of patches just to test the old allocator.
That's a great idea. I will likely pick up the opportunity to add more, optional things to test.
It would also be harder to make changes in generic code like device/pci_device when one has to be sure that both allocators work fine.
For the most part, the rest of the code should not really be aware of the internals of any of the allocators. (In practice it is not always true). I understand that we had some assumptions in find_pci_tolm() and how invalid resources are handled. With the following changes: https://review.coreboot.org/c/coreboot/+/41524, those assumptions should no longer be required. We just need a set of rules saying what goes into the allocator and what is the state coming out.
Let's try it. We can put this in Documentation/ once we agree on the rules.
INPUT
The allocator processes, per domain sub-tree, all resources with IORESOURCE_IO or IORESOURCE_MEM set. From here on, these are simply referred to as *resources*.
Resources with the IORESOURCE_FIXED bit set are referred to as *fixed resources*, resources with the IORESOURCE_BRIDGE bit set as *bridge resources*, and resources with the IORESOURCE_ASSIGNED bit set as *assigned resources*.
We should elaborate on the semantics of 'assigned' and its meaning.
Resources with IORESOURCE_IO set must not have IORESOURCE_MEM set.
Resources that should not be changed by the allocator must be fixed resources.
Fixed resources must be assigned resources and have a valid `base` address pre-assigned.
A fixed resource must not overlap with any other fixed resource within the same address space (IO / MEM), unless either of them is a bridge resource.
A fixed bridge resource must not overlap with any other fixed resource within the same address space and on the same bus (i.e. the same device or its siblings), unless exactly one of them has IORESOURCE_SUBTRACTIVE set.
What are you assuming w.r.t. decode priority with SUBTRACTIVE being set?
OUTPUT
All device attributes beside non-fixed resources are left unaltered.
What did you mean by device attributes?
-Aaron