Nico Huber has uploaded this change for review. ( https://review.coreboot.org/c/coreboot/+/37007 )
Change subject: [TEST]cpu/x86/mp_init: Park APs in C7 ......................................................................
[TEST]cpu/x86/mp_init: Park APs in C7
Change-Id: I960871450407ade6e8f18e69921ce74aa7fbca24 Signed-off-by: Nico Huber nico.huber@secunet.com --- M src/cpu/x86/mp_init.c 1 file changed, 16 insertions(+), 1 deletion(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/07/37007/1
diff --git a/src/cpu/x86/mp_init.c b/src/cpu/x86/mp_init.c index 29ae3de..16435fc 100644 --- a/src/cpu/x86/mp_init.c +++ b/src/cpu/x86/mp_init.c @@ -184,7 +184,22 @@
static void park_this_cpu(void *unused) { - stop_this_cpu(); + /* need some address of write-back memory, use AP's stack for now */ + const struct cpu_info *const info = cpu_info(); + const uintptr_t stack_bottom = ALIGN_DOWN((uintptr_t)&_estack, CONFIG_STACK_SIZE) + - (info->index + 1) * CONFIG_STACK_SIZE; + asm volatile( + "xor %%ecx, %%ecx\n\t" + "xor %%edx, %%edx\n" + "__monitor:\n\t" + "mov %0, %%eax\n\t" + "monitor\n\t" + "mov $0x60, %%eax\n\t" + "mwait\n\t" + "jmp __monitor\n\t" + : + : "r" (stack_bottom) + : "eax", "ecx", "edx"); }
/* By the time APs call ap_init() caching has been setup, and microcode has
Werner Zeh has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/37007 )
Change subject: [TEST]cpu/x86/mp_init: Park APs in C7 ......................................................................
Patch Set 1:
(1 comment)
https://review.coreboot.org/c/coreboot/+/37007/1/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/37007/1/src/cpu/x86/mp_init.c@195 PS1, Line 195: mov %0, %%eax\n\t could be xor %%eax, %%eax as done for ecx and edx above.
Nico Huber has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/37007 )
Change subject: [TEST]cpu/x86/mp_init: Park APs in C7 ......................................................................
Patch Set 1:
(1 comment)
https://review.coreboot.org/c/coreboot/+/37007/1/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/37007/1/src/cpu/x86/mp_init.c@195 PS1, Line 195: mov %0, %%eax\n\t
could be […]
heh, this is %0 (aka. stack_bottom) not $0 ;)
Nico Huber has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/37007 )
Change subject: [TEST]cpu/x86/mp_init: Park APs in C7 ......................................................................
Patch Set 1:
This is really just a POC to see if there are measurable effects. There are, btw: BSP can go to much higher turbo states and idling in FILO's UI takes much more power therefore xD
Aaron Durbin has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/37007 )
Change subject: [TEST]cpu/x86/mp_init: Park APs in C7 ......................................................................
Patch Set 1:
(2 comments)
https://review.coreboot.org/c/coreboot/+/37007/1//COMMIT_MSG Commit Message:
https://review.coreboot.org/c/coreboot/+/37007/1//COMMIT_MSG@7 PS1, Line 7: C7 Why not the deepest C-state?
https://review.coreboot.org/c/coreboot/+/37007/1/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/37007/1/src/cpu/x86/mp_init.c@187 PS1, Line 187: /* need some address of write-back memory, use AP's stack for now */ if we enable PARALLEL_MP_AP_WORK we can use the callback object as the monitor/mwait address to wait.
We should also do the hinting C-state calculation based on msr info, right?
Tim Wawrzynczak has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/37007 )
Change subject: [TEST]cpu/x86/mp_init: Park APs in C7 ......................................................................
Patch Set 1:
(2 comments)
https://review.coreboot.org/c/coreboot/+/37007/1//COMMIT_MSG Commit Message:
https://review.coreboot.org/c/coreboot/+/37007/1//COMMIT_MSG@7 PS1, Line 7: C7
Why not the deepest C-state?
For CML, using 0x60 as the mwait hint is supposed to drop to C10 (substate 1), not C7. These mwait hints seem to be architecture-dependent (although I'd have to look back at older SoCs to see if the hints actually changed meaning or not) see page 169 of Intel #550049.
https://review.coreboot.org/c/coreboot/+/37007/1/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/37007/1/src/cpu/x86/mp_init.c@187 PS1, Line 187: /* need some address of write-back memory, use AP's stack for now */
if we enable PARALLEL_MP_AP_WORK we can use the callback object as the monitor/mwait address to wait […]
Yeah for a "real" implementation, you'd have to check if it supports mwait hints, and what the deepest state it supports, and then what the mwait hint is for that arch. I was literally testing this exact thing yesterday on a CML board and it saves about 10ms of boot time as opposed to the current "hlt" :)
Nico Huber has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/37007 )
Change subject: [TEST]cpu/x86/mp_init: Park APs in C7 ......................................................................
Patch Set 1:
(2 comments)
https://review.coreboot.org/c/coreboot/+/37007/1//COMMIT_MSG Commit Message:
https://review.coreboot.org/c/coreboot/+/37007/1//COMMIT_MSG@7 PS1, Line 7: C7
For CML, using 0x60 as the mwait hint is supposed to drop to C10 (substate 1), not C7. These mwait hints seem to be architecture-dependent (although I'd have to look back at older SoCs to see if the hints actually changed meaning or not) see page 169 of Intel #550049.
Indeed, they make it confusing.
Why not the deepest C-state?
C7 is suggested by SKL+ BIOS spec
https://review.coreboot.org/c/coreboot/+/37007/1/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/37007/1/src/cpu/x86/mp_init.c@187 PS1, Line 187: /* need some address of write-back memory, use AP's stack for now */
if we enable PARALLEL_MP_AP_WORK we can use the callback object as the monitor/mwait address to wait.
Thought about that, yes :)
We should also do the hinting C-state calculation based on msr info, right?
Yep. I would start with a Kconfig to opt-in for `mwait` (or maybe make it the default if AMD supports it the same way?). The hint could come from a platform specific hook.
Yeah for a "real" implementation, you'd have to check if it supports mwait hints, and what the deepest state it supports, and then what the mwait hint is for that arch. I was literally testing this exact thing yesterday on a CML board and it saves about 10ms of boot time as opposed to the current "hlt" :)
We could probably gain more if there is something compute-intensive. Currently, the code (cannonlake/cpu.c) tries to set the maximum (turbo) frequency. But that doesn't work because EIST is usually disabled (if you try to enable it, there is a bug in the same file cpu_set_eist() needs to be called one line later). There seems nothing wrong with enabling both EIST and HWP btw. If HWP gets configured later, it should take over.
I'm not sure, though, if the turbo makes much sense. It mostly eats more power ;) And there is also Subrata's case (CB:36865) where an even lower frequency seems to be beneficial.
Stefan Reinauer has abandoned this change. ( https://review.coreboot.org/c/coreboot/+/37007?usp=email )
Change subject: [TEST]cpu/x86/mp_init: Park APs in C7 ......................................................................
Abandoned