Aaron Durbin has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34476 )
Change subject: Rampayload: Attempt to boot coreboot without ramstage
Patch Set 5:
What are the limitations of romstage/ramstage pair that this is trying to solve?
We can leave romstage out of the coreboot-lite/rampayload discussion, as romstage is going to stay for sure.
So are we packing memory training and other things into this new stage, i.e. combining romstage and ramstage?
We might not want to disturb romstage if postcar already exists; the flow would then basically be romstage -> postcar -> kernel.
From a high level, that's how I read what you are attempting, but there are lots of weird details there, so I don't think that's happening. I still think we're doing romstage -> newstage -> kernel.
You are right that postcar might be extended to pull in the functions from ramstage that are required to boot to the OS, like MP init and ASL generation.
What we are trying to achieve is to avoid loading one dedicated stage (i.e. ramstage here) by pulling just the functionality required to boot a platform into the previous stage.
So far we have seen ~240ms of time savings with this approach, with some known workarounds.
Can you quantify where the savings are coming from? We should have an idea of what the source of the savings is, because we could focus on improving that aspect in ramstage.
Yes, I have that breakdown. I will share the timing details with you over email, as I can't attach the .log files taken with RAMPAYLOAD enabled here.
I think that analysis should be documented to live on as well as serve as a basis for the approach as a whole. I'm still not convinced that the approach we're taking is the right way.
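One way such a breakdown could be recorded is as a per-phase table derived from timestamp checkpoints. The following is only a minimal sketch: `struct checkpoint`, `phase_us` and `print_breakdown` are hypothetical names made up for illustration, not coreboot APIs, and the sketch assumes the checkpoints are already available as (name, microseconds since reset) pairs in boot order.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record: a named checkpoint and the time it was reached. */
struct checkpoint {
	const char *name;
	uint64_t stamp_us; /* microseconds since reset */
};

/* Time spent reaching checkpoint i from the previous one
 * (or from reset when i == 0). Assumes stamps are non-decreasing. */
static uint64_t phase_us(const struct checkpoint *cp, size_t i)
{
	return i == 0 ? cp[0].stamp_us : cp[i].stamp_us - cp[i - 1].stamp_us;
}

/* Print one line per phase and return the total time covered by the log. */
static uint64_t print_breakdown(const struct checkpoint *cp, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t d = phase_us(cp, i);

		printf("%-28s %10llu us\n", cp[i].name, (unsigned long long)d);
		total += d;
	}
	return total;
}
```

Running `print_breakdown` once on a log from the traditional flow and once on a log with RAMPAYLOAD enabled would give two tables that can be compared phase by phase, which is the kind of evidence the question about the source of the savings asks for.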
Migrating to rampayload might be a long pole for Chrome platforms, but we might think about use cases where coreboot competes with ABL, where the bootloader has to be thinner and meet certain boot time restrictions.
As I noted before, you'll be pulling in almost all of the functionality from ramstage in order to get a system properly configured.
I'm also afraid of the same situation, but so far, looking at my POC patch for reference: from the POC CL to here, I have ported most of the required functionality, like MP init and AML generation, into this CL.
The next steps would be to think about possible solutions for dynamic BAR assignment for a limited set of PCI resources (input, output and boot device) and to see how we create runtime AML for peripheral devices (touch, TPM, etc.). In this process we might need to compile FSP calls into the previous stage (postcar here).
I think it'd be instructive to do a side-by-side comparison of our "traditional" boot flow, along w/ the actions/details each stage is performing, against your target boot flow. There are high-level semantics that need to be employed, e.g. CAR teardown for a clean program environment. Those building blocks would ideally be able to be moved around to form different boot flows. However, I think we should understand the current limitations and see if we can improve those before going down the path of a more complicated solution. e.g. do you save time by tearing down CAR in ramstage itself? i.e. the prologue that tears down CAR is in the first part of ramstage.
If I'm not wrong, we are tearing down CAR in postcar on IA platforms, and I don't see much savings if we don't tear down CAR in postcar.
That's not what I'm saying. My mental model is that you are slowly adding features/properties of ramstage into postcar as you realize you need them. As such, this comes back to where the savings in boot time are coming from. If we just bolted CAR teardown onto ramstage entry, does that reap the gains that you are seeing? If not, what other low-hanging fruit is there?
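The side-by-side comparison requested above could be written down as data, with each stage listing the building blocks it performs. This is a rough sketch only: the assignment of actions to stages is a simplified assumption pieced together from this thread, not an authoritative description of either flow, and all identifiers (`enum action`, `struct stage`, `flow_actions`, `dropped_actions`) are hypothetical.

```c
#include <stddef.h>

/* Building blocks named in this thread; the bit values are arbitrary. */
enum action {
	ACT_MEM_TRAINING = 1u << 0,
	ACT_CAR_TEARDOWN = 1u << 1,
	ACT_MP_INIT      = 1u << 2,
	ACT_PCI_ENUM     = 1u << 3,
	ACT_ACPI_GEN     = 1u << 4,
};

struct stage {
	const char *name;
	unsigned int actions; /* bitwise OR of enum action values */
};

/* Assumed, simplified "traditional" flow. */
static const struct stage traditional_flow[] = {
	{ "romstage", ACT_MEM_TRAINING },
	{ "postcar",  ACT_CAR_TEARDOWN },
	{ "ramstage", ACT_MP_INIT | ACT_PCI_ENUM | ACT_ACPI_GEN },
};

/* Assumed target flow: no ramstage, postcar absorbs what is needed. */
static const struct stage target_flow[] = {
	{ "romstage", ACT_MEM_TRAINING },
	{ "postcar",  ACT_CAR_TEARDOWN | ACT_MP_INIT | ACT_ACPI_GEN },
};

/* Union of all actions performed anywhere in a flow. */
static unsigned int flow_actions(const struct stage *flow, size_t n)
{
	unsigned int all = 0;

	for (size_t i = 0; i < n; i++)
		all |= flow[i].actions;
	return all;
}

/* Actions the base flow performs that the target flow no longer does. */
static unsigned int dropped_actions(const struct stage *base, size_t nb,
				    const struct stage *target, size_t nt)
{
	return flow_actions(base, nb) & ~flow_actions(target, nt);
}
```

With these assumed tables, `dropped_actions(traditional_flow, 3, target_flow, 2)` yields only `ACT_PCI_ENUM`. That makes the reviewer's point concrete: everything else still has to run somewhere, so the measured savings need to be attributed either to the dropped action or to the removal of a stage load.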
At a high level, I think the two places where we could definitely improve would be:
- Avoid entire PCI enumeration and setup
At what cost? Putting in lots of device-specific code to program BARs as needed? Similarly, can't this be achieved by tweaking the existing boot flow? In the end it has to be an option, because one cannot rely on the eventual kernel/payload performing the actions you are trying to defer here.
- Executing any stage has its own problem of locating it in cbfs, decompressing it and loading it into memory.
But this is a solved problem. All stages know how to load next program.
This approach broadly tries to avoid these two things in ramstage and achieve the same via postcar itself.