<- snip ->
> >
> > I think that analysis should be documented to live on as well as serve as a basis for the approach as a whole. I'm still not convinced that the approach we're taking is the right way.
>
> Sure, I will add a link to a live document to capture this.

I just meant it should be in some documentation in the tree as well as the commit description.


<- snip ->
>
> > If not, what other low hanging fruit is there?
>
> As I said, the low-hanging fruit would be limiting the time we spend on PCI enumeration and setting up resources. I would also advocate limiting the functionality in ramstage (without introducing the RAMPAYLOAD concept): skip the entire PCI tree enumeration and do it in a fixed manner instead, where the user provides, based on some Kconfig options, the list of PCI devices that the BIOS needs in order to boot to the payload/kernel. Ultimately, the BIOS only needs to apply chipset workarounds/recommended programming and set up BARs to communicate with the devices it boots from.
>
> Today the entire PCI enumeration takes ~160ms+, which we could limit and still be able to boot to the kernel.
>
> 30:device enumeration 466,492 (50,800)
> 40:device configuration 589,305 (21,695)
> 50:device enable 638,470 (258)
> 60:device initialization 665,478 (27,007)
> 70:device setup done 755,515 (90,037)
> 75:cbmem post 756,226 (711)
> 80:write tables 756,437 (210)
>
> Also, loading any additional stage takes some time. Just loading fallback/ramstage takes ~30ms.
>
> 8:starting to load ramstage 383,195 (138)
> 15:starting LZMA decompress (ignore for x86) 383,210 (14)
> 16:finished LZMA decompress (ignore for x86) 408,031 (24,820)
> 9:finished loading ramstage 415,242 (7,211)
> 550:starting to load Chrome OS VPD 415,321 (78)
> 10:start of ramstage 415,692 (370)
>
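For reference, the ~30ms figure can be checked against the absolute timestamps in the second listing. The following is a minimal sketch, assuming the values are in microseconds and that each line reads as ID, name, absolute time, then delta since the previous entry. The struct and function names are hypothetical and not part of any existing tool.

```c
#include <stddef.h>
#include <stdint.h>

/* One timestamp entry: numeric ID and absolute time (assumed microseconds). */
struct ts_entry {
	uint32_t id;
	uint64_t abs_us;
};

/* Absolute times copied from the quoted ramstage-load listing. */
static const struct ts_entry ramstage_load[] = {
	{8, 383195},   /* starting to load ramstage */
	{15, 383210},  /* starting LZMA decompress */
	{16, 408031},  /* finished LZMA decompress */
	{9, 415242},   /* finished loading ramstage */
	{550, 415321}, /* starting to load Chrome OS VPD */
	{10, 415692},  /* start of ramstage */
};

/*
 * Return the elapsed time between the first entry with from_id and the
 * first entry with to_id. Returns 0 if either ID is missing or if the
 * "to" entry is not later than the "from" entry.
 */
uint64_t elapsed_us(const struct ts_entry *t, size_t n,
		    uint32_t from_id, uint32_t to_id)
{
	const struct ts_entry *from = NULL;
	const struct ts_entry *to = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!from && t[i].id == from_id)
			from = &t[i];
		if (!to && t[i].id == to_id)
			to = &t[i];
	}
	if (!from || !to || to->abs_us <= from->abs_us)
		return 0;
	return to->abs_us - from->abs_us;
}
```

With these values, the span from entry 8 to entry 10 comes to 32,497 us, and the LZMA decompress (15 to 16) accounts for 24,821 us of it.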

<- snip ->

It seems you've identified a feature you want to add that reduces PCI enumeration and init costs (or speeds up the current approach). That's a very different direction than trying to recreate ramstage on a piecemeal basis. Given what you are indicating, that seems like the better approach.

FWIW, putting CAR teardown on the front of ramstage would get rid of the extra stage load. That was my point.


At what cost? Putting in lots of device-specific code to program BARs as needed? Similarly, can't this be achieved by tweaking the existing boot flow? In the end it has to be an option because one cannot rely on the eventual kernel/payload performing the actions you are trying to defer here.

I guess the concern is how we limit the FW footprint and make the FW do less than it does today. You are right that skipping the entire PCI enumeration in the BIOS means it will happen inside the kernel (if the kernel has provision for it), and it might add up to the same time (if we care about end-to-end boot time). The counterargument is: why bloat the BIOS with a feature the kernel already has? Users won't love seeing more time spent in FW and would rather see the OS UI as soon as possible after powering on the device. If we could save ~200ms+ using this approach, we would be closer to ~600ms of boot time (on par with the ABL standard, even with CB, without making any HW BOM change).

This is a new feature as well: providing BARs that are statically allocated and slammed in during initialization in the boot flow. I definitely don't think we should be solving such a thing manually, though, i.e. writing specific code for initializing BARs. It should be done at build time, and the instructions for which BAR/register to init should just be executed. That is a more scalable approach if one wanted to go down this path.
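To make the build-time idea concrete, here is a rough sketch of what "instructions that just get executed" could look like: a generated table of config-space writes walked by one generic loop. Everything in it is an assumption for illustration. The struct, the function names, the device address, and the BAR value are hypothetical, and the writer callback stands in for whatever config-write primitive the platform provides.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build-time generated entry: one 32-bit config-space write. */
struct static_cfg_write {
	uint8_t bus;
	uint8_t dev;
	uint8_t fn;
	uint16_t reg;    /* config-space offset, e.g. 0x10 for BAR0 */
	uint32_t value;
};

/* Callback standing in for the platform's config-space write primitive. */
typedef void (*cfg_write32_fn)(uint8_t bus, uint8_t dev, uint8_t fn,
			       uint16_t reg, uint32_t value);

/* Example table; a build tool would emit this. All values are made up. */
static const struct static_cfg_write static_bar_table[] = {
	{0, 0x1f, 5, 0x10, 0xfe010000}, /* BAR0: fixed MMIO base */
	{0, 0x1f, 5, 0x04, 0x00000006}, /* command: memory space + bus master */
};

/* Host-side stand-in for one function's 256-byte config header. */
uint32_t fake_cfg_space[64];

void fake_cfg_write32(uint8_t bus, uint8_t dev, uint8_t fn,
		      uint16_t reg, uint32_t value)
{
	(void)bus;
	(void)dev;
	(void)fn;
	fake_cfg_space[(reg & 0xff) >> 2] = value;
}

/*
 * Execute every entry in table order. No device-specific code is
 * involved; the table alone decides what gets programmed. Returns the
 * number of writes issued, or 0 if no writer was supplied.
 */
size_t apply_static_cfg(const struct static_cfg_write *tbl, size_t n,
			cfg_write32_fn wr)
{
	if (!tbl || !wr)
		return 0;
	for (size_t i = 0; i < n; i++)
		wr(tbl[i].bus, tbl[i].dev, tbl[i].fn, tbl[i].reg, tbl[i].value);
	return n;
}
```

Calling apply_static_cfg(static_bar_table, 2, fake_cfg_write32) replays both writes into the fake header. On real hardware the callback would be swapped for the actual config write, and the table order would have to respect any sequencing the device requires (the BAR is programmed before memory decode is enabled here).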


Agreed that there might be some price to pay for that, and it could have been done the existing ramstage way as well (Furquan had requested the same and I have provided him the time estimate). Based on the above data, dropping fallback/ramstage might save ~40ms of boot time and ~150kB * 3 copies = ~450kB of SPI footprint compared with doing everything on top of ramstage (without rampayload).

Again, this is another goal: reduce the size of ramstage. Great. Where's the analysis with the breakdown of code size? And why aren't we targeting ways to reduce that directly in ramstage?


Gerrit-Project: coreboot
Gerrit-Branch: master
Gerrit-Change-Id: I22994c11317cf6936828c07fcac2cf14fca8a74b
Gerrit-Change-Number: 34476
Gerrit-PatchSet: 5
Gerrit-Owner: Subrata Banik <subrata.banik@intel.com>
Gerrit-Reviewer: Aaron Durbin <adurbin@chromium.org>
Gerrit-Reviewer: Julius Werner <jwerner@chromium.org>
Gerrit-Reviewer: Martin Roth <martinroth@google.com>
Gerrit-Reviewer: Patrick Georgi <pgeorgi@google.com>
Gerrit-Reviewer: Subrata Banik <subrata.banik@intel.com>
Gerrit-Reviewer: build bot (Jenkins) <no-reply@coreboot.org>
Gerrit-Reviewer: ron minnich <rminnich@gmail.com>
Gerrit-CC: Furquan Shaikh <furquan@google.com>
Gerrit-CC: Paul Menzel <paulepanter@users.sourceforge.net>
Gerrit-Comment-Date: Thu, 25 Jul 2019 03:23:32 +0000
Gerrit-HasComments: No
Gerrit-Has-Labels: No
Gerrit-MessageType: comment