Subrata Banik has uploaded this change for review. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_ramcache() API ......................................................................
arch/x86: Add postcar_frame_add_ramcache() API
This patch adds new API for intermediate caching top_of_ram and setting up required MTRR for next stage.
Change-Id: Iddafb573afb4799de64754a94816d7f3f2f4982f Signed-off-by: Subrata Banik subrata.banik@intel.com --- M src/arch/x86/include/arch/cpu.h M src/arch/x86/postcar_loader.c 2 files changed, 30 insertions(+), 0 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/05/34805/1
diff --git a/src/arch/x86/include/arch/cpu.h b/src/arch/x86/include/arch/cpu.h index 293ca02..7dc4049 100644 --- a/src/arch/x86/include/arch/cpu.h +++ b/src/arch/x86/include/arch/cpu.h @@ -329,6 +329,13 @@ void postcar_frame_add_romcache(struct postcar_frame *pcf, int type);
/* + * Add variable MTRR covering the Top of RAM with given MTRR type. + */ +void postcar_frame_add_ramcache(struct postcar_frame *pcf, + uintptr_t addr, size_t size, int type); + + +/* * Push used MTRR and Max MTRRs on to the stack * and return pointer to stack top. */ diff --git a/src/arch/x86/postcar_loader.c b/src/arch/x86/postcar_loader.c index 35e139f..8a1f6cb 100644 --- a/src/arch/x86/postcar_loader.c +++ b/src/arch/x86/postcar_loader.c @@ -120,6 +120,29 @@ postcar_frame_add_mtrr(pcf, CACHE_ROM_BASE, CACHE_ROM_SIZE, type); }
+/* + * Make sure we are enabling intermediate caching to speed up next stage + * (postcar/romstage) loading and decompression as we are still in romstage + * and CAR tear down will be handled by next stage at its entry. + */ +static void enable_top_of_ram_cache(uintptr_t base, size_t size) +{ + int mtrr = get_free_var_mtrr(); + + if (mtrr == -1) + return; + + set_var_mtrr(mtrr, base, size, MTRR_TYPE_WRPROT); +} + +void postcar_frame_add_ramcache(struct postcar_frame *pcf, + uintptr_t addr, size_t size, int type) +{ + /* enable intermediate caching for Top of RAM */ + enable_top_of_ram_cache(addr, size); + postcar_frame_add_mtrr(pcf, addr, size, type); +} + void *postcar_commit_mtrrs(struct postcar_frame *pcf) { /*
Patrick Rudolph has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_ramcache() API ......................................................................
Patch Set 1:
Please find my comments here: https://review.coreboot.org/c/coreboot/+/34791/5/src/soc/intel/cannonlake/ro...
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_ramcache() API ......................................................................
Patch Set 1:
Patch Set 1:
Please find my comments here: https://review.coreboot.org/c/coreboot/+/34791/5/src/soc/intel/cannonlake/ro...
oh yes, i have replied there as well
Arthur Heymans has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_ramcache() API ......................................................................
Patch Set 1:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... PS1, Line 135: MTRR_TYPE_WRPROT I'm confused. You are writing (copying postcarstage) to that region?
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_ramcache() API ......................................................................
Patch Set 1:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... PS1, Line 135: MTRR_TYPE_WRPROT
I'm confused. […]
sorry, i didn't get you
Arthur Heymans has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_ramcache() API ......................................................................
Patch Set 1:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... PS1, Line 135: MTRR_TYPE_WRPROT
sorry, i didn't get you
You set the MTRR to WRPROT to cache a region that you want to write to?
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_ramcache() API ......................................................................
Patch Set 1:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... PS1, Line 135: MTRR_TYPE_WRPROT
You set the MTRR to WRPROT to cache a region that you want to write to?
yeah, WB can't be satisfied option here due to caching issue. (Data region already marked as WB)
This leads into only 2 option WT or WP.
Do you think WT might be better option than WP?
Write protected (WP) — Reads come from cache lines when possible, and read misses cause cache fills. Writes are propagated to the system bus and cause corresponding cache lines on all processors on the bus to be invalidated. Speculative reads are allowed.
SDM vol 3a, table 11.2
Aaron Durbin has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_ramcache() API ......................................................................
Patch Set 1:
(2 comments)
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... PS1, Line 135: MTRR_TYPE_WRPROT
yeah, WB can't be satisfied option here due to caching issue. (Data region already marked as WB) […]
It's not marked as WB yet. It should be the same type as passed in below. Remember, this function is only for speeding up *loading* of the next stage. Therefore, all it's doing is adding to the MTRR solution for that specific region. Likewise, it's a part of the MTRR solution for postcar once it tears down and re-enables MTRRs with the newly requested set.
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... PS1, Line 138: postcar_frame_add_ramcache This name is weird. I think it should reflect that it's top of 32-bit DRAM usage.
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_ramcache() API ......................................................................
Patch Set 1:
(2 comments)
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... PS1, Line 135: MTRR_TYPE_WRPROT
It's not marked as WB yet. It should be the same type as passed in below.
0x00000000fef00006: PHYBASE0: Address = 0x00000000fef00000, WB
DCACHE_RAM_BASE is already marked as WB, setting enable_top_of_ram_cache() region again WB, i'm seeing issue. (causes system hang)
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... PS1, Line 138: postcar_frame_add_ramcache
This name is weird. I think it should reflect that it's top of 32-bit DRAM usage.
name i will fix it
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_ramcache() API ......................................................................
Patch Set 1:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... PS1, Line 135: MTRR_TYPE_WRPROT
It's not marked as WB yet. It should be the same type as passed in below. […] I'm seeing issue. (causes system hang)
its actually unable to jump into postcar start (_start)
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_ramcache() API ......................................................................
Patch Set 1:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... PS1, Line 135: MTRR_TYPE_WRPROT
I'm seeing issue. (causes system hang) […]
looks like the hang was due to the size (16MB) when i'm marking this region as WB
set_var_mtrr(mtrr, base, size, WB);
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_ramcache() API ......................................................................
Patch Set 1:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... PS1, Line 135: MTRR_TYPE_WRPROT
looks like the hang was due to the size (16MB) when i'm marking this region as WB […]
anything >12MB size is causing
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_ramcache() API ......................................................................
Patch Set 1:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... PS1, Line 135: MTRR_TYPE_WRPROT
anything >12MB size is causing
again back into same understanding that marking top_of_ram 16MB as WB is causing hang while jumping into postcar entry.
As per my understanding cache lines are getting invalided as it might be full by this time.
< 12MB is helping because my postcar is loading somewhere >12MB size in top_of_ram hence accessing those memory might not reflect into caching. if the moment i have marked the area where postcar supposed to get loaded (>12MB), i'm seeing hang.
Aaron, any idea what we should do here?
Hello Aaron Durbin, Arthur Heymans, Aamir Bohra, Duncan Laurie, V Sowmya, build bot (Jenkins), Furquan Shaikh,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/34805
to look at the new patch set (#2).
Change subject: arch/x86: Add postcar_frame_add_top_of_ram_cache() API ......................................................................
arch/x86: Add postcar_frame_add_top_of_ram_cache() API
This patch adds new API for intermediate caching top_of_ram and setting up required MTRR for next stage.
Change-Id: Iddafb573afb4799de64754a94816d7f3f2f4982f Signed-off-by: Subrata Banik subrata.banik@intel.com --- M src/arch/x86/include/arch/cpu.h M src/arch/x86/postcar_loader.c 2 files changed, 30 insertions(+), 0 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/05/34805/2
Aaron Durbin has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_top_of_ram_cache() API ......................................................................
Patch Set 2:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/1/src/arch/x86/postcar_loader... PS1, Line 135: MTRR_TYPE_WRPROT
0x00000000fef00006: PHYBASE0: Address = 0x00000000fef00000, WB
That's just CAR region. That has nothing to do with the DRAM address regions. I can't claim to understand the issue you are seeing since there should be no overlapping regions being set. I'm not sure what system you are running on or what type of CAR implementation is being used. Yes, if you set a region that will be used as cacheable and nothing is enabled to ensure no eviction of CAR data lines then things will indeed break.
If the CAR implementation is reliant upon software not over expending the cache size then one can't do this. If it's a full implementation that ensures no eviction then it wouldn't be a problem.
Aaron Durbin has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_top_of_ram_cache() API ......................................................................
Patch Set 2:
(2 comments)
https://review.coreboot.org/c/coreboot/+/34805/2/src/arch/x86/include/arch/c... File src/arch/x86/include/arch/cpu.h:
https://review.coreboot.org/c/coreboot/+/34805/2/src/arch/x86/include/arch/c... PS2, Line 334: cache This isn't a cache, though. What's your reasoning for choosing 'cache' in the name? I also think the comment could be more descriptive as it's more than what is listed.
https://review.coreboot.org/c/coreboot/+/34805/2/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/2/src/arch/x86/postcar_loader... PS2, Line 135: MTRR_TYPE_WRPROT This type is inherently wrong as you'll be writing to this memory for loading purposes. All that it does it ensure writes don't hit the cache.
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_add_top_of_ram_cache() API ......................................................................
Patch Set 2:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/2/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/2/src/arch/x86/postcar_loader... PS2, Line 135: MTRR_TYPE_WRPROT
This type is inherently wrong as you'll be writing to this memory for loading purposes. […]
yes, i will come up with more debug data, right now my observation says, if i make it WB and the moment write appears into this area, it just hangs..
Here is the log from hatch-cml with romstage->postcar->ramstage (existing model)
top_of_ram = 0x9a000000
>>
here we are setting new MTRR 0x0000000099000006: PHYBASE2: Address = 0x0000000099000000, WB 0x0000007fff000800: PHYMASK2: Length = 0x0000000001000000, Valid
>>
MTRR Range: Start=99000000 End=9a000000 (Size 1000000) MTRR Range: Start=ff000000 End=0 (Size 1000000) CBFS: 'VBOOT' located CBFS at [1410000:151f280) CBFS: Locating 'fallback/postcar' CBFS: Found @ offset f1000 size 4688 Decompressing stage fallback/postcar @ 0x99c1ffc0 (34744 bytes) Loading module at 99c20000 with entry 99c20000. filesize: 0x43d8 memsize: 0x8778 Processing 149 relocs. Offset value of 0x97c20000
>>
hang here, unable to execute postcar
>>
but with WP, i don't see such issue, this tells me when ever write operation happens into this new MTRR range, its gone.
in cache_as_ram.S we have actually created direct caching for Data region (RW), marked as WB
#1 With this we were making below region for data (marked as WB, RW enable for caching) 1. 0x00000000fef00006: PHYBASE0: Address = 0x00000000fef00000, WB 0x0000007ffffc0800: PHYMASK0: Length = 0x0000000000040000, Valid
And 1 code (RO) region marked as WP #2 0x00000000ff000005: PHYBASE1: Address = 0x00000000ff000000, WP 0x0000007fff000800: PHYMASK1: Length = 0x0000000001000000, Valid
Need to understand if there any limitation of marking more than 1 region as WB as we are in NEM enhanced mode and already cached #1 above region before enabling NEM enhanced mode ?
Hello Aaron Durbin, Arthur Heymans, Aamir Bohra, Duncan Laurie, V Sowmya, build bot (Jenkins), Furquan Shaikh,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/34805
to look at the new patch set (#3).
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
arch/x86: Add postcar_frame_setup_top_of_dram_usage() API
This patch adds new API for intermediate caching top_of_ram and setting up required MTRR for next stage.
Change-Id: Iddafb573afb4799de64754a94816d7f3f2f4982f Signed-off-by: Subrata Banik subrata.banik@intel.com --- M src/arch/x86/include/arch/cpu.h M src/arch/x86/postcar_loader.c 2 files changed, 34 insertions(+), 0 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/05/34805/3
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
Patch Set 3:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/2/src/arch/x86/include/arch/c... File src/arch/x86/include/arch/cpu.h:
https://review.coreboot.org/c/coreboot/+/34805/2/src/arch/x86/include/arch/c... PS2, Line 334: cache
This isn't a cache, though. […]
Done
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
Patch Set 3:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/2/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/2/src/arch/x86/postcar_loader... PS2, Line 135: MTRR_TYPE_WRPROT
yes, i will come up with more debug data, right now my observation says, if i make it WB and the mom […]
After talking to cache architect, i'm getting confirmation about NEM mode
here is problem statement why we have introduced NEM enhance mode:
Google Chromebook BIOS doesn’t fit the provided LLC size (can be as small as 1.5M for the product with 2 cores and 1M LLC ) Their request: Code size (RO): up to 16M Data size (RW): up to 256K
Idea: If we could make sure that data always remains in cache while code line are get replaced – we could allow unlimited code size along with the well defined limited data size
New request: now we are asking to expand data size with newly DRAM range (top of DRAM) and cache this region as well in another MTRR with WB
Yes, I think it’s not supposed to work. MTRR’s for NEM mode should be set up before enabling the NEM mode via MSR. You cannot add another WB region (DRAM) while already in NEM mode.
interestingly marking this area WB vs WP, i don't see much savings in boot time. Its ~1-1.5ms delta between two
Patrick Rudolph has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
Patch Set 4:
My test results for KBL: memsetting 1GiB of memory with different MTRR settings in NEM right after raminit:
do_fsp_post_memory_init: took 9180 msec to clear 1GiB UC do_fsp_post_memory_init: took 9179 msec to clear 1GiB WPROT do_fsp_post_memory_init: took 39 msec to clear 1GiB WRCOMB do_fsp_post_memory_init: took 76 msec to clear 1GiB WRBACK
The use of WRBACK in NEM causes postcar stage loading to fail. The use of WRCOMB seems fine, it still boots.
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
Patch Set 4:
Patch Set 4:
My test results for KBL: memsetting 1GiB of memory with different MTRR settings in NEM right after raminit:
do_fsp_post_memory_init: took 9180 msec to clear 1GiB UC do_fsp_post_memory_init: took 9179 msec to clear 1GiB WPROT do_fsp_post_memory_init: took 39 msec to clear 1GiB WRCOMB do_fsp_post_memory_init: took 76 msec to clear 1GiB WRBACK
Appreciate your help Patrick for the data using this CL and analysis.
The use of WRBACK in NEM causes postcar stage loading to fail.
yes, me and Aaron are debugging this issue with cache architect on CML (behavior should be same across), the symptoms that we are seeing is that "invd" is clearing LLC cache line completely when marked as WB but WP/WC doesn't create problem although "invd" should clear entire LLC cache line. As per our understanding this is expected behavior of WB cache vs other type. Hence we are working on alternative ways to achieve the same. We have some sight but still under debug.
The use of WRCOMB seems fine, it still boots.
yes, WC and WP will work for sure. verified that part.
Hello Aaron Durbin, Arthur Heymans, Aamir Bohra, Duncan Laurie, V Sowmya, build bot (Jenkins), Furquan Shaikh,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/34805
to look at the new patch set (#5).
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
arch/x86: Add postcar_frame_setup_top_of_dram_usage() API
This patch adds new API for intermediate caching top_of_ram and setting up required MTRR for next stage.
Change-Id: Iddafb573afb4799de64754a94816d7f3f2f4982f Signed-off-by: Subrata Banik subrata.banik@intel.com --- M src/arch/x86/include/arch/cpu.h M src/arch/x86/postcar_loader.c 2 files changed, 41 insertions(+), 0 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/05/34805/5
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
Patch Set 5:
Patch Set 4:
My test results for KBL: memsetting 1GiB of memory with different MTRR settings in NEM right after raminit:
do_fsp_post_memory_init: took 9180 msec to clear 1GiB UC do_fsp_post_memory_init: took 9179 msec to clear 1GiB WPROT do_fsp_post_memory_init: took 39 msec to clear 1GiB WRCOMB do_fsp_post_memory_init: took 76 msec to clear 1GiB WRBACK
The use of WRBACK in NEM causes postcar stage loading to fail. The use of WRCOMB seems fine, it still boots.
HI Patrick, are you really sure that WC took least time between all MTRR types. As per SDM, WC is not for read/write cacheable. btw, which address range you tried ?
Patrick Rudolph has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
Patch Set 5:
Patch Set 5:
Patch Set 4:
My test results for KBL: memsetting 1GiB of memory with different MTRR settings in NEM right after raminit:
do_fsp_post_memory_init: took 9180 msec to clear 1GiB UC do_fsp_post_memory_init: took 9179 msec to clear 1GiB WPROT do_fsp_post_memory_init: took 39 msec to clear 1GiB WRCOMB do_fsp_post_memory_init: took 76 msec to clear 1GiB WRBACK
The use of WRBACK in NEM causes postcar stage loading to fail. The use of WRCOMB seems fine, it still boots.
HI Patrick, are you really sure that WC took least time between all MTRR types. As per SDM, WC is not for read/write cacheable. btw, which address range you tried ?
Yes WC was the fastest, it's measured using timer_monotonic_get(). I memset memory from 32MiB to 1GiB+32MiB. I guess WC is best for writing large chunks of memory. For postcar caching it might not be the best choice.
I'll do more tests on older platforms as I remember that those behave in a different way.
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
Patch Set 5:
Patch Set 5:
Patch Set 5:
Patch Set 4:
My test results for KBL: memsetting 1GiB of memory with different MTRR settings in NEM right after raminit:
do_fsp_post_memory_init: took 9180 msec to clear 1GiB UC do_fsp_post_memory_init: took 9179 msec to clear 1GiB WPROT do_fsp_post_memory_init: took 39 msec to clear 1GiB WRCOMB do_fsp_post_memory_init: took 76 msec to clear 1GiB WRBACK
The use of WRBACK in NEM causes postcar stage loading to fail. The use of WRCOMB seems fine, it still boots.
HI Patrick, are you really sure that WC took least time between all MTRR types. As per SDM, WC is not for read/write cacheable. btw, which address range you tried ?
Yes WC was the fastest, it's measured using timer_monotonic_get(). I memset memory from 32MiB to 1GiB+32MiB. I guess WC is best for writing large chunks of memory. For postcar caching it might not be the best choice.
I'll do more tests on older platforms as I remember that those behave in a different way.
Thanks for prompt reply.
For postcar caching it might not be the best choice.
This is my understanding as well. i could see WP is more applicable in terms of savings than WC. Best optimization is with WB.
boot time numbers are
romstage -> ramstage [with intermediate caching as WB] = ~810ms romstage -> ramstage [with intermediate caching as WP] = ~820ms romstage -> ramstage [with intermediate caching as WC] = ~880ms
We are able to make WB working as intermediate caching with some code changes and assumptions. (without those hang as you might have seen earlier with WB). But we (me and Aaron) are not 100% sure if we should make those changes by the assumptions to make WB work.
Hello Aaron Durbin, Arthur Heymans, Aamir Bohra, Duncan Laurie, V Sowmya, build bot (Jenkins), Furquan Shaikh,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/34805
to look at the new patch set (#6).
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
arch/x86: Add postcar_frame_setup_top_of_dram_usage() API
This patch adds new API for intermediate caching top_of_ram and setting up required MTRR for next stage.
Change-Id: Iddafb573afb4799de64754a94816d7f3f2f4982f Signed-off-by: Subrata Banik subrata.banik@intel.com --- M src/arch/x86/include/arch/cpu.h M src/arch/x86/postcar_loader.c 2 files changed, 39 insertions(+), 0 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/05/34805/6
Kyösti Mälkki has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
Patch Set 6:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/6/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/6/src/arch/x86/postcar_loader... PS6, Line 162: postcar_frame_add_mtrr(pcf, addr, size, type); Implementation of postcar_frame_add_mtrr() is capable of splitting the region to multiple MTRRs. Implementation of enable_top_of_dram_cache() is not. The approach proposed here adds new alignment requirements, so you cannot simply change calls from postcar_frame_add_mtrr() to postcar_frame_setup_top_of_dram_usage(), like your followup work seems to propose.
I believe that as we discussed dropping CPU_ADDR_BITS and cpu_phys_address_size() I recommended not to use set_var_mtrr() but to replay the values from the generated frame instead.
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
Patch Set 6:
(1 comment)
https://review.coreboot.org/c/coreboot/+/34805/6/src/arch/x86/postcar_loader... File src/arch/x86/postcar_loader.c:
https://review.coreboot.org/c/coreboot/+/34805/6/src/arch/x86/postcar_loader... PS6, Line 162: postcar_frame_add_mtrr(pcf, addr, size, type);
Implementation of postcar_frame_add_mtrr() is capable of splitting the region to multiple MTRRs. Implementation of enable_top_of_dram_cache() is not.
enable_top_of_dram_cache() is to just enable intermediate caching, anyway the range we are passing into enable_top_of_dram_cache() always existed into postcar frame so we shouldn't be worried for next stage as post car frame will pass the number of MTRR to reset and set while setting up next stage stack top/setting up MTRR range.
The approach proposed here adds new alignment requirements, so you cannot simply change calls from postcar_frame_add_mtrr() to postcar_frame_setup_top_of_dram_usage(), like your followup work seems to propose.
postcar_frame_setup_top_of_dram_usage() functionally will additionally enable caching for top of DRAM along with existing postcar_frame_add_mtrr() work what it was doing previously.
I believe that as we discussed dropping CPU_ADDR_BITS and cpu_phys_address_size() I recommended not to use set_var_mtrr() but to replay the values from the generated frame instead.
set_var_mtrr() been used here to enable immediate caching of give range. if we had to relying on "generated frame", it would be taken care during next stage entry point while setting up new MTRR range. In that way, we won't be getting any benefit as mentioned here.
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
Patch Set 6:
Patch Set 5:
Patch Set 5:
Patch Set 5:
Patch Set 4:
My test results for KBL: memsetting 1GiB of memory with different MTRR settings in NEM right after raminit:
do_fsp_post_memory_init: took 9180 msec to clear 1GiB UC do_fsp_post_memory_init: took 9179 msec to clear 1GiB WPROT do_fsp_post_memory_init: took 39 msec to clear 1GiB WRCOMB do_fsp_post_memory_init: took 76 msec to clear 1GiB WRBACK
The use of WRBACK in NEM causes postcar stage loading to fail. The use of WRCOMB seems fine, it still boots.
HI Patrick, are you really sure that WC took least time between all MTRR types. As per SDM, WC is not for read/write cacheable. btw, which address range you tried ?
Yes WC was the fastest, it's measured using timer_monotonic_get(). I memset memory from 32MiB to 1GiB+32MiB. I guess WC is best for writing large chunks of memory. For postcar caching it might not be the best choice.
I'll do more tests on older platforms as I remember that those behave in a different way.
Thanks for prompt reply.
For postcar caching it might not be the best choice.
This is my understanding as well. i could see WP is more applicable in terms of savings than WC. Best optimization is with WB.
boot time numbers are
romstage -> ramstage [with intermediate caching as WB] = ~810ms romstage -> ramstage [with intermediate caching as WP] = ~820ms romstage -> ramstage [with intermediate caching as WC] = ~880ms
We are able to make WB working as intermediate caching with some code changes and assumptions. (without those hang as you might have seen earlier with WB). But we (me and Aaron) are not 100% sure if we should make those changes by the assumptions to make WB work.
I have made intermediate DRAM caching range as WP based on our experiments. Although i will still submit the require changes to enable DRAM ranges WB as well.
Subrata Banik has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/34805 )
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
Patch Set 6:
Patch Set 6:
Patch Set 5:
Patch Set 5:
Patch Set 5:
Patch Set 4:
My test results for KBL: memsetting 1GiB of memory with different MTRR settings in NEM right after raminit:
do_fsp_post_memory_init: took 9180 msec to clear 1GiB UC do_fsp_post_memory_init: took 9179 msec to clear 1GiB WPROT do_fsp_post_memory_init: took 39 msec to clear 1GiB WRCOMB do_fsp_post_memory_init: took 76 msec to clear 1GiB WRBACK
The use of WRBACK in NEM causes postcar stage loading to fail. The use of WRCOMB seems fine, it still boots.
HI Patrick, are you really sure that WC took least time between all MTRR types. As per SDM, WC is not for read/write cacheable. btw, which address range you tried ?
Yes WC was the fastest, it's measured using timer_monotonic_get(). I memset memory from 32MiB to 1GiB+32MiB. I guess WC is best for writing large chunks of memory. For postcar caching it might not be the best choice.
I'll do more tests on older platforms as I remember that those behave in a different way.
Thanks for prompt reply.
For postcar caching it might not be the best choice.
This is my understanding as well. i could see WP is more applicable in terms of savings than WC. Best optimization is with WB.
boot time numbers are
romstage -> ramstage [with intermediate caching as WB] = ~810ms romstage -> ramstage [with intermediate caching as WP] = ~820ms romstage -> ramstage [with intermediate caching as WC] = ~880ms
We are able to make WB working as intermediate caching with some code changes and assumptions. (without those hang as you might have seen earlier with WB). But we (me and Aaron) are not 100% sure if we should make those changes by the assumptions to make WB work.
I have made intermediate DRAM caching range as WP based on our experiments. Although i will still submit the require changes to enable DRAM ranges WB as well.
https://review.coreboot.org/q/topic:%22dram_cache_wb%22+(status:open%20OR%20...) approach to make WB work
Hello Aaron Durbin, Arthur Heymans, Aamir Bohra, Duncan Laurie, V Sowmya, build bot (Jenkins), Furquan Shaikh,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/34805
to look at the new patch set (#7).
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
arch/x86: Add postcar_frame_setup_top_of_dram_usage() API
This patch adds new API for intermediate caching top_of_ram and setting up required MTRR for next stage.
Change-Id: Iddafb573afb4799de64754a94816d7f3f2f4982f Signed-off-by: Subrata Banik subrata.banik@intel.com --- M src/arch/x86/include/arch/cpu.h M src/arch/x86/postcar_loader.c 2 files changed, 39 insertions(+), 0 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/05/34805/7
Hello Aaron Durbin, Arthur Heymans, Aamir Bohra, Duncan Laurie, V Sowmya, build bot (Jenkins), Furquan Shaikh,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/34805
to look at the new patch set (#9).
Change subject: arch/x86: Enable intermediate caching top_of_ram ......................................................................
arch/x86: Enable intermediate caching top_of_ram
This patch enables caching for the (top_of_dram - size) to top_of_dram as WP.
Also helps to save additional ~2ms of booting time in normal boot on CML-hatch.
Change-Id: Iddafb573afb4799de64754a94816d7f3f2f4982f Signed-off-by: Subrata Banik subrata.banik@intel.com --- M src/arch/x86/include/arch/cpu.h M src/arch/x86/postcar_loader.c 2 files changed, 26 insertions(+), 0 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/05/34805/9
Hello Aaron Durbin, Arthur Heymans, Aamir Bohra, Duncan Laurie, V Sowmya, build bot (Jenkins), Furquan Shaikh,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/34805
to look at the new patch set (#10).
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
arch/x86: Add postcar_frame_setup_top_of_dram_usage() API
This patch adds new API for intermediate caching top_of_ram and setting up required MTRR for next stage.
Change-Id: Iddafb573afb4799de64754a94816d7f3f2f4982f Signed-off-by: Subrata Banik subrata.banik@intel.com --- M src/arch/x86/include/arch/romstage.h M src/arch/x86/postcar_loader.c 2 files changed, 40 insertions(+), 0 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/05/34805/10
Stefan Reinauer has abandoned this change. ( https://review.coreboot.org/c/coreboot/+/34805?usp=email )
Change subject: arch/x86: Add postcar_frame_setup_top_of_dram_usage() API ......................................................................
Abandoned