Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34791 )
Change subject: soc/intel/cannonlake: Speed up postcar loading using intermediate caching
Patch Set 12:
Agreed, such large cbmem -t dumps are confusing. Let me summarize the data here.
- #1: Baseline numbers without any code changes:
Total Time till picking kernel: 823,964 Total Time till picking payload: 651,510
- #2: With POSTCAR_STAGE=y and (CB:34805 + CB:34995) [romstage -> postcar -> ramstage]
Total Time till picking kernel: 813,549 Total Time till picking payload: 635,250
CB:34995, adding TSEG WB, is the least controversial and something that would easily get merged. It also accounts for roughly half (~7 ms) of the #1 vs #2 improvement (reaching payload) and needs to stand alone as the first commit of your patch train.
Aug 19 13:27 1100:finished vboot kernel verification 828,339 (137,616) Total Time: 818,805
Aug 20 09:55 1100:finished vboot kernel verification 823,294 (147,272) Total Time: 813,549
The reference 813,549 came from a boot whose 1100: step was 10 ms slower. I cannot see how 1100 would be affected by any MTRR changes made in late romstage, but maybe I am mistaken about it.
I'm guessing caching might play some role here, as those numbers (1100:finished vboot kernel verification) consistently remain different across the 3 approaches (baseline, postcar + those CB CLs, ramstage + those CB CLs).
Could be something about SSD and how you power-cycle/reboot that triggers a seemingly constant 10ms difference for 1100. I just don't know.
I am speculating some boots of #2 could have been 10ms better for 1100, giving us: 'Total Time till picking kernel: ~803,000'.
I would also expect the same, but I do see consistent numbers in long cycling tests as well with the #2 and #3 approaches.
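For reference, the adjustment speculated above can be checked with a small sketch. It assumes the cbmem -t values are in microseconds and that the 1100 step is the only thing that differs between the two logged boots; the names below are illustrative and not part of any coreboot tool.

```python
# Values copied from the two cbmem -t log lines quoted above
# (assumed to be microseconds).
STEP_1100_AUG19_US = 137_616  # 1100:finished vboot kernel verification, Aug 19
STEP_1100_AUG20_US = 147_272  # same step, Aug 20
TOTAL_AUG20_US = 813_549      # "Total Time" of the Aug 20 boot (the #2 reference)


def adjusted_total(total_us, slow_step_us, fast_step_us):
    """Total time if the slow step had instead taken as long as the fast one."""
    return total_us - (slow_step_us - fast_step_us)


step_delta = STEP_1100_AUG20_US - STEP_1100_AUG19_US
adjusted = adjusted_total(TOTAL_AUG20_US, STEP_1100_AUG20_US, STEP_1100_AUG19_US)

print(f"1100 step difference: {step_delta} us")
print(f"#2 total with the faster 1100 step: {adjusted} us")
```

This only removes the roughly 10 ms step difference from the total; it says nothing about why the 1100 step differs between boots.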
- #3: With POSTCAR_STAGE=n and (CB:34995 + CB:34791 + CB:34752) [romstage -> ramstage]
Total Time till picking kernel: 810,332 Total Time till picking payload: 642,655
Now if you compare the end-to-end time between #2 and #3, we consistently see savings of ~3 ms+ in the #3 approach (as you rightly pointed out: "Notice how your fastest entry to kernel had the slowest 1100:finished vboot kernel verification."). All the platform performance scripts we run to meet product compliance rely on "Total Time till picking kernel". So if we compare as below:
If I am correct in my comment about 1100: vboot kernel verification step, #2 POSTCAR_STAGE=y beats #3 POSTCAR_STAGE=n by 7 ms in both "Total Time till picking kernel" and "Total Time till picking payload".
Yes, as per the current data and the analysis done so far, #2 POSTCAR_STAGE=y beats #3 POSTCAR_STAGE=n by 7 ms in "Total Time till picking payload" for sure.
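To make the comparison easier to follow, here is a minimal sketch that recomputes the deltas from the totals posted above. It assumes the values are microseconds as printed by cbmem -t; the dictionary layout and the helper name are only for illustration.

```python
# Totals copied from the summary above (assumed microseconds).
RUNS = {
    "#1": {"kernel": 823_964, "payload": 651_510},  # baseline
    "#2": {"kernel": 813_549, "payload": 635_250},  # POSTCAR_STAGE=y
    "#3": {"kernel": 810_332, "payload": 642_655},  # POSTCAR_STAGE=n
}


def delta_us(slower, faster, metric):
    """Microseconds by which run `faster` beats run `slower` on `metric`."""
    return RUNS[slower][metric] - RUNS[faster][metric]


for slower, faster, metric in [
    ("#1", "#2", "payload"),
    ("#1", "#2", "kernel"),
    ("#3", "#2", "payload"),
    ("#2", "#3", "kernel"),
]:
    gain_ms = delta_us(slower, faster, metric) / 1000
    print(f"{faster} beats {slower} on {metric}: {gain_ms:.1f} ms")
```

With the posted numbers, #2 is ahead of #3 by about 7.4 ms at payload, while #3 is ahead of #2 by about 3.2 ms at kernel, which matches the two figures being discussed.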