Jonathan Zhang has uploaded this change for review. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: Make MP init timeout configurable ......................................................................
Make MP init timeout configurable
The current MP init timeout is hardcoded as 1s. To support Intel Xeon scalable processor, the timeout needs to be adjusted.
This patch makes MP init timeout configurable.
Signed-off-by: Jonathan Zhang jonzhang@fb.com Signed-off-by: Reddy Chagam anjaneya.chagam@intel.com Change-Id: Ibc079fc6aa8641d4ac8d8e726899b6c8d055052e --- M src/cpu/x86/Kconfig M src/cpu/x86/mp_init.c 2 files changed, 7 insertions(+), 1 deletion(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/46/38546/1
diff --git a/src/cpu/x86/Kconfig b/src/cpu/x86/Kconfig index 76446a0..eedd2e6 100644 --- a/src/cpu/x86/Kconfig +++ b/src/cpu/x86/Kconfig @@ -184,3 +184,9 @@ help The SoC requires different access methods for reading and writing the MSRs. Use SoC specific routines to handle the MSR access. + +config FLIGHT_PLAN_TIMEOUT_US + int + default 1000000 + help + Time-out (in micro seconds) to wait for APs to check-in diff --git a/src/cpu/x86/mp_init.c b/src/cpu/x86/mp_init.c index 45776f8..e874931 100644 --- a/src/cpu/x86/mp_init.c +++ b/src/cpu/x86/mp_init.c @@ -522,7 +522,7 @@ * could take a longer time for APs to check-in as the number of APs * increases (contention for resources like UART also increases). */ - const int timeout_us = 1000000; + const int timeout_us = CONFIG_FLIGHT_PLAN_TIMEOUT_US; const int step_us = 100; int num_aps = mp_params->num_cpus - 1; struct stopwatch sw;
Nico Huber has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: Make MP init timeout configurable ......................................................................
Patch Set 1:
(2 comments)
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c@523 PS1, Line 523: * increases (contention for resources like UART also increases). Comment needs update.
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c@525 PS1, Line 525: const int timeout_us = CONFIG_FLIGHT_PLAN_TIMEOUT_US; Why not simply increase the timeout for all targets instead of adding a Kconfig. We should never expect an AP to stall, should we?
Aaron Durbin has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: Make MP init timeout configurable ......................................................................
Patch Set 1:
(1 comment)
https://review.coreboot.org/c/coreboot/+/38546/1//COMMIT_MSG Commit Message:
https://review.coreboot.org/c/coreboot/+/38546/1//COMMIT_MSG@10 PS1, Line 10: Xeon scalable processor, the timeout needs to be adjusted. What sort of timing are you seeing and why? It could certainly be made to be larger. I find 1s timeout to be pretty large so it'd be good to know the root of the issue.
Paul Menzel has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: Make MP init timeout configurable ......................................................................
Patch Set 1:
(3 comments)
No Kconfig option is needed, just increase the time-out.
https://review.coreboot.org/c/coreboot/+/38546/1//COMMIT_MSG Commit Message:
https://review.coreboot.org/c/coreboot/+/38546/1//COMMIT_MSG@7 PS1, Line 7: Make MP init timeout configurable Please add a prefix:
cpu/x86: …
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/Kconfig File src/cpu/x86/Kconfig:
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/Kconfig@192 PS1, Line 192: Time-out (in micro seconds) to wait for APs to check-in 1. No second tab. 2. Dot/period at the end please.
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c@525 PS1, Line 525: const int timeout_us = CONFIG_FLIGHT_PLAN_TIMEOUT_US;
Why not simply increase the timeout for all targets instead of adding […]
Agreed.
Jonathan Zhang has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: Make MP init timeout configurable ......................................................................
Patch Set 1:
(4 comments)
https://review.coreboot.org/c/coreboot/+/38546/1//COMMIT_MSG Commit Message:
https://review.coreboot.org/c/coreboot/+/38546/1//COMMIT_MSG@7 PS1, Line 7: Make MP init timeout configurable
Please add a prefix: […]
Ack
https://review.coreboot.org/c/coreboot/+/38546/1//COMMIT_MSG@10 PS1, Line 10: Xeon scalable processor, the timeout needs to be adjusted.
What sort of timing are you seeing and why? It could certainly be made to be larger. […]
We are working on Xeon-SP based server platform. The multi-socket bring-up and communication is quite different and more complex than other SoC families. Hence we need the MP init timeout to be much larger.
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/Kconfig File src/cpu/x86/Kconfig:
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/Kconfig@192 PS1, Line 192: Time-out (in micro seconds) to wait for APs to check-in
- No second tab. […]
Done
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c@525 PS1, Line 525: const int timeout_us = CONFIG_FLIGHT_PLAN_TIMEOUT_US;
Agreed.
It is oaky for me to simply increase the timeout for all targets, if this is fine for the community. The other targets may want to cap the timeout to a much lower value than Xeon-SP server platform, this is their choice.
Hello Aaron Durbin, David Hendricks, build bot (Jenkins), Anjaneya "Reddy" Chagam,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/38546
to look at the new patch set (#2).
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
cpu/x86: Make MP init timeout configurable
The current MP init timeout is hardcoded as 1s. To support Intel Xeon scalable processor, the timeout needs to be adjusted.
This patch makes MP init timeout configurable.
Signed-off-by: Jonathan Zhang jonzhang@fb.com Signed-off-by: Reddy Chagam anjaneya.chagam@intel.com Change-Id: Ibc079fc6aa8641d4ac8d8e726899b6c8d055052e --- M src/cpu/x86/Kconfig M src/cpu/x86/mp_init.c 2 files changed, 8 insertions(+), 2 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/46/38546/2
Hello Aaron Durbin, David Hendricks, build bot (Jenkins), Anjaneya "Reddy" Chagam,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/38546
to look at the new patch set (#3).
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
cpu/x86: Make MP init timeout configurable
The current MP init timeout is hardcoded as 1s. To support Intel Xeon scalable processor, the timeout needs to be adjusted.
This patch makes MP init timeout configurable.
Signed-off-by: Jonathan Zhang jonzhang@fb.com Signed-off-by: Reddy Chagam anjaneya.chagam@intel.com Change-Id: Ibc079fc6aa8641d4ac8d8e726899b6c8d055052e --- M src/cpu/x86/Kconfig M src/cpu/x86/mp_init.c 2 files changed, 8 insertions(+), 2 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/46/38546/3
Hello Aaron Durbin, David Hendricks, build bot (Jenkins), Anjaneya "Reddy" Chagam,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/38546
to look at the new patch set (#4).
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
cpu/x86: Make MP init timeout configurable
The current MP init timeout is hardcoded as 1s. To support Intel Xeon scalable processor, the timeout needs to be adjusted.
This patch makes MP init timeout configurable.
Signed-off-by: Jonathan Zhang jonzhang@fb.com Signed-off-by: Reddy Chagam anjaneya.chagam@intel.com Change-Id: Ibc079fc6aa8641d4ac8d8e726899b6c8d055052e --- M src/cpu/x86/Kconfig M src/cpu/x86/mp_init.c 2 files changed, 8 insertions(+), 2 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/46/38546/4
Hello Aaron Durbin, David Hendricks, build bot (Jenkins), Anjaneya "Reddy" Chagam,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/38546
to look at the new patch set (#5).
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
cpu/x86: Make MP init timeout configurable
The current MP init timeout is hardcoded as 1s. To support Intel Xeon scalable processor, the timeout needs to be adjusted.
This patch makes MP init timeout configurable.
Signed-off-by: Jonathan Zhang jonzhang@fb.com Signed-off-by: Reddy Chagam anjaneya.chagam@intel.com Change-Id: Ibc079fc6aa8641d4ac8d8e726899b6c8d055052e --- M src/cpu/x86/Kconfig M src/cpu/x86/mp_init.c 2 files changed, 8 insertions(+), 2 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/46/38546/5
Aaron Durbin has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 5:
(1 comment)
https://review.coreboot.org/c/coreboot/+/38546/1//COMMIT_MSG Commit Message:
https://review.coreboot.org/c/coreboot/+/38546/1//COMMIT_MSG@10 PS1, Line 10: Xeon scalable processor, the timeout needs to be adjusted.
We are working on Xeon-SP based server platform. […]
Are you saying that multi socket cpus coming online takes 10 secs? I don't see a reason aside from "different" and "complex" which doesn't explain the details. Isn't one of the node bsps already up and trained memory by now? Which cpu in the other socket is taking so long?
Jonathan Zhang has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 5:
(1 comment)
https://review.coreboot.org/c/coreboot/+/38546/1//COMMIT_MSG Commit Message:
https://review.coreboot.org/c/coreboot/+/38546/1//COMMIT_MSG@10 PS1, Line 10: Xeon scalable processor, the timeout needs to be adjusted.
Are you saying that multi socket cpus coming online takes 10 secs? I don't see a reason aside from " […]
10 seconds is an extreme number. However, the original 1 second is definitely not enough. Also the number depends on platform design, such as how far it is from one socket to another socket, and how many sockets there are (8 socket servers are being worked on in the industry).
Hello Aaron Durbin, David Hendricks, build bot (Jenkins), Anjaneya "Reddy" Chagam,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/38546
to look at the new patch set (#7).
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
cpu/x86: Make MP init timeout configurable
The current MP init timeout is hardcoded as 1s. To support Intel Xeon scalable processor, the timeout needs to be adjusted.
This patch makes MP init timeout configurable.
Signed-off-by: Jonathan Zhang jonzhang@fb.com Signed-off-by: Reddy Chagam anjaneya.chagam@intel.com Change-Id: Ibc079fc6aa8641d4ac8d8e726899b6c8d055052e --- M src/cpu/x86/Kconfig M src/cpu/x86/mp_init.c 2 files changed, 8 insertions(+), 2 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/46/38546/7
Hello Aaron Durbin, David Hendricks, build bot (Jenkins), Anjaneya "Reddy" Chagam,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/38546
to look at the new patch set (#8).
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
cpu/x86: Make MP init timeout configurable
The current MP init timeout is hardcoded as 1s. To support Intel Xeon scalable processor, the timeout needs to be adjusted.
This patch makes MP init timeout configurable.
Signed-off-by: Jonathan Zhang jonzhang@fb.com Signed-off-by: Reddy Chagam anjaneya.chagam@intel.com Change-Id: Ibc079fc6aa8641d4ac8d8e726899b6c8d055052e --- M src/cpu/x86/Kconfig M src/cpu/x86/mp_init.c 2 files changed, 8 insertions(+), 2 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/46/38546/8
David Hendricks has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 8:
(1 comment)
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c@525 PS1, Line 525: const int timeout_us = CONFIG_FLIGHT_PLAN_TIMEOUT_US;
It is oaky for me to simply increase the timeout for all targets, if this is fine for the community. […]
We have num_aps defined below, maybe we can use that as a multiplier? So for example, 100ms per AP with a minimum of 1s: int num_aps = mp_params->num_cpus - 1; int timeout_us = min(1000000, 100000 * num_aps);
That removes the need for yet another Kconfig variable and is also consistent with what the comment above suggests about time potentially increasing due to contention of resources such as UART.
David Hendricks has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 8:
(1 comment)
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c@525 PS1, Line 525: const int timeout_us = CONFIG_FLIGHT_PLAN_TIMEOUT_US;
We have num_aps defined below, maybe we can use that as a multiplier? So for example, 100ms per AP w […]
and of course by min() I meant max() ;-)
Jonathan Zhang has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 8:
(1 comment)
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c@525 PS1, Line 525: const int timeout_us = CONFIG_FLIGHT_PLAN_TIMEOUT_US;
and of course by min() I meant max() ;-)
Thanks for the feedback!
Hello Aaron Durbin, David Hendricks, build bot (Jenkins), Anjaneya "Reddy" Chagam,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/38546
to look at the new patch set (#9).
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
cpu/x86: Make MP init timeout configurable
The current MP init timeout is hardcoded as 1s. To support platform with many cpus, the timeout needs to be adjusted. The number of cpus is: number of sockets * number of cores per socket * number of threads per core
How long the timeout should be set to, needs to be heuristic. It needs to be set long enough to ensure reboot stability.
This patch sets timeout to be minimum as 1 second, while each cpu adds 0.1 second.
Signed-off-by: Jonathan Zhang jonzhang@fb.com Signed-off-by: Reddy Chagam anjaneya.chagam@intel.com Change-Id: Ibc079fc6aa8641d4ac8d8e726899b6c8d055052e --- M src/cpu/x86/mp_init.c 1 file changed, 10 insertions(+), 2 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/46/38546/9
David Hendricks has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 9: Code-Review-1
(3 comments)
A few more nits :-)
https://review.coreboot.org/c/coreboot/+/38546/9/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/38546/9/src/cpu/x86/mp_init.c@152 PS9, Line 152: static int max(int x, int y) Can you #include <stdlib.h> instead?
https://review.coreboot.org/c/coreboot/+/38546/9/src/cpu/x86/mp_init.c@528 PS9, Line 528: configurable value It's no longer configurable
https://review.coreboot.org/c/coreboot/+/38546/9/src/cpu/x86/mp_init.c@531 PS9, Line 531: It is set to at least 1 second, while each AP adds 0.1 second. The mp_params->num_cpus includes the NBSP. Maybe this whole comment should be re-written, I'm thinking something like:
Set time out for flight plan to a huge minimum value (>=1 second). CPUs with many APs may take longer if there is contention for resources such as UART, so scale the time out up by increments of 100ms if needed.
Nico Huber has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 9:
(1 comment)
https://review.coreboot.org/c/coreboot/+/38546/9//COMMIT_MSG Commit Message:
https://review.coreboot.org/c/coreboot/+/38546/9//COMMIT_MSG@15 PS9, Line 15: needs to be heuristic I must miss something here. On a system with only few threads, why would a smaller timeout be favorable?
I know there are places, even in firmware, for tight timeouts, e.g. when you want to probe for something optional and still boot fast or for responsiveness in a UI loop. But why here? AFAICS, if we time out here, the whole system runs into an undefined state. So it's not about boot times, is it?
Jonathan Zhang has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 9:
(4 comments)
https://review.coreboot.org/c/coreboot/+/38546/9//COMMIT_MSG Commit Message:
https://review.coreboot.org/c/coreboot/+/38546/9//COMMIT_MSG@15 PS9, Line 15: needs to be heuristic
I must miss something here. On a system with only few threads, […]
During the MP init (flight plan) process, the MSRs of each cpu (thread) need to be set up. When the core design gets more complex and when new features are added, it takes longer time. Moreover, the number of cpus increase very fast in recent years. On 72 cpus Tiogapass I am working on, most of the time 1s works well, but occasionally it is not enough (I do not have statistics as this is not current focus). If you add a single line of debug print in this process, almost every reboot times out. To guarantee reboot stability, a larger timeout is needed. Good question that why not always use a big timeout? For smaller system like notebook, 1 second is already a big number. I work on server instead of smaller system, so I cannot speak for them. I believe their automation system runs thousands of reboot tests in one hour, hence the timeout value is not desired to be too high.
https://review.coreboot.org/c/coreboot/+/38546/9/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/38546/9/src/cpu/x86/mp_init.c@152 PS9, Line 152: static int max(int x, int y)
Can you #include <stdlib. […]
stdlib.h does not have support for max. I also tried.
https://review.coreboot.org/c/coreboot/+/38546/9/src/cpu/x86/mp_init.c@528 PS9, Line 528: configurable value
It's no longer configurable
Done
https://review.coreboot.org/c/coreboot/+/38546/9/src/cpu/x86/mp_init.c@531 PS9, Line 531: It is set to at least 1 second, while each AP adds 0.1 second.
The mp_params->num_cpus includes the NBSP. […]
Done
Hello Aaron Durbin, David Hendricks, build bot (Jenkins), Anjaneya "Reddy" Chagam,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/38546
to look at the new patch set (#10).
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
cpu/x86: Make MP init timeout configurable
The current MP init timeout is hardcoded as 1s. To support platform with many cpus, the timeout needs to be adjusted. The number of cpus is calculated as: number of sockets * number of cores per socket * number of threads per core
How long the timeout should be set to, is heuristic. It needs to be set long enough to ensure reboot stability, but not unreasonable.
This patch sets timeout to be minimum as 1 second, while each cpu adds 0.1 second.
Signed-off-by: Jonathan Zhang jonzhang@fb.com Signed-off-by: Reddy Chagam anjaneya.chagam@intel.com Change-Id: Ibc079fc6aa8641d4ac8d8e726899b6c8d055052e --- M src/cpu/x86/mp_init.c 1 file changed, 12 insertions(+), 4 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/46/38546/10
Angel Pons has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 10:
(1 comment)
https://review.coreboot.org/c/coreboot/+/38546/10/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/38546/10/src/cpu/x86/mp_init.c@152 PS10, Line 152: static int max(int x, int y) Don't we have one of these already?
Jonathan Zhang has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 10:
(1 comment)
https://review.coreboot.org/c/coreboot/+/38546/10/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/38546/10/src/cpu/x86/mp_init.c@152 PS10, Line 152: static int max(int x, int y)
Don't we have one of these already?
Not in coreboot code base though, unless I miss something. In payloads/libpayload/curses/curses.priv.h, it is defined. And it is also coded in payloads/nvramcui/nvramcui.c
Andrey Petrov has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 10:
(1 comment)
https://review.coreboot.org/c/coreboot/+/38546/10/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/38546/10/src/cpu/x86/mp_init.c@152 PS10, Line 152: static int max(int x, int y)
Not in coreboot code base though, unless I miss something. […]
#include <stddef.h> just use MAX macro
David Hendricks has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 10:
(1 comment)
https://review.coreboot.org/c/coreboot/+/38546/9//COMMIT_MSG Commit Message:
https://review.coreboot.org/c/coreboot/+/38546/9//COMMIT_MSG@15 PS9, Line 15: needs to be heuristic
During the MP init (flight plan) process, the MSRs of each cpu (thread) need to be set up. […]
It can also be desirable to return an error code sooner and handle the failure quickly, assuming it's possible to handle such an error by turning off APs or whatever. Particularly for systems with fewer cores/threads since they're likely to be in embedded/mobile systems where people still care about boot time and can't easily replace parts.
I don't think that failed APs is really an undefined state, but if it were then we should just forget about timeouts and loop indefinitely.
Angel Pons has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 10:
(1 comment)
https://review.coreboot.org/c/coreboot/+/38546/9//COMMIT_MSG Commit Message:
https://review.coreboot.org/c/coreboot/+/38546/9//COMMIT_MSG@15 PS9, Line 15: needs to be heuristic
It can also be desirable to return an error code sooner and handle the failure quickly, assuming it' […]
In the Tioga Pass case, is it that MP init takes longer than one second, or does some core get stalled? If it's just that waking so many cores takes more time, then it's not a big deal. We shouldn't be hitting the timeout in regular cases anyway.
Jonathan Zhang has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 10:
(2 comments)
Thanks for the review. I have changes ready but not able to send right now. The current tip is broken for my compiler, the change identified is not committed yet.
https://review.coreboot.org/c/coreboot/+/38546/9//COMMIT_MSG Commit Message:
https://review.coreboot.org/c/coreboot/+/38546/9//COMMIT_MSG@15 PS9, Line 15: needs to be heuristic
In the Tioga Pass case, is it that MP init takes longer than one second, or does some core get stall […]
Makes sense that failure handling in this case should be improved. There could be real failures (such as due to burnt out core or due to the inter-socket physical link getting damaged), so timeout as the last defense is necessary. For TiogaPass server I have (which has 72 cpus; 80 cpus TiogaPass is most common), in most cases 1s is good enough, it does fail sometimes; also if a single debug print is added, it fails almost all the time.
https://review.coreboot.org/c/coreboot/+/38546/10/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/38546/10/src/cpu/x86/mp_init.c@152 PS10, Line 152: static int max(int x, int y)
#include <stddef.h> […]
Thank you!
Hello Aaron Durbin, David Hendricks, build bot (Jenkins), Anjaneya "Reddy" Chagam,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/38546
to look at the new patch set (#11).
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
cpu/x86: Make MP init timeout configurable
The current MP init timeout is hardcoded as 1s. To support platform with many cpus, the timeout needs to be adjusted. The number of cpus is calculated as: number of sockets * number of cores per socket * number of threads per core
How long the timeout should be set to, is heuristic. It needs to be set long enough to ensure reboot stability, but not unreasonable so that real failures can be detected soon enough, especially for smaller systems.
This patch sets timeout to be minimum as 1 second, while each cpu adds 0.1 second.
Signed-off-by: Jonathan Zhang jonzhang@fb.com Signed-off-by: Reddy Chagam anjaneya.chagam@intel.com Change-Id: Ibc079fc6aa8641d4ac8d8e726899b6c8d055052e --- M src/cpu/x86/mp_init.c 1 file changed, 6 insertions(+), 4 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/46/38546/11
David Hendricks has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 11: Code-Review+2
Looks good to me. Even though it's a small patch, please let it sit for 24 hours to let reviewers in other countries chime in one more time if needed.
Jonathan Zhang has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 11:
(1 comment)
Thanks for your review.
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c File src/cpu/x86/mp_init.c:
https://review.coreboot.org/c/coreboot/+/38546/1/src/cpu/x86/mp_init.c@523 PS1, Line 523: * increases (contention for resources like UART also increases).
Comment needs update.
Done
Patrick Georgi has submitted this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
cpu/x86: Make MP init timeout configurable
The current MP init timeout is hardcoded as 1s. To support platform with many cpus, the timeout needs to be adjusted. The number of cpus is calculated as: number of sockets * number of cores per socket * number of threads per core
How long the timeout should be set to, is heuristic. It needs to be set long enough to ensure reboot stability, but not unreasonable so that real failures can be detected soon enough, especially for smaller systems.
This patch sets timeout to be minimum as 1 second, while each cpu adds 0.1 second.
Signed-off-by: Jonathan Zhang jonzhang@fb.com Signed-off-by: Reddy Chagam anjaneya.chagam@intel.com Change-Id: Ibc079fc6aa8641d4ac8d8e726899b6c8d055052e Reviewed-on: https://review.coreboot.org/c/coreboot/+/38546 Tested-by: build bot (Jenkins) no-reply@coreboot.org Reviewed-by: David Hendricks david.hendricks@gmail.com --- M src/cpu/x86/mp_init.c 1 file changed, 6 insertions(+), 4 deletions(-)
Approvals: build bot (Jenkins): Verified David Hendricks: Looks good to me, approved
diff --git a/src/cpu/x86/mp_init.c b/src/cpu/x86/mp_init.c index 45776f8..b093be7 100644 --- a/src/cpu/x86/mp_init.c +++ b/src/cpu/x86/mp_init.c @@ -13,6 +13,7 @@ */
#include <console/console.h> +#include <stddef.h> #include <stdint.h> #include <string.h> #include <rmodule.h> @@ -518,11 +519,12 @@ int i; int ret = 0; /* - * Set time-out to wait for APs to a huge value (=1 second) since it - * could take a longer time for APs to check-in as the number of APs - * increases (contention for resources like UART also increases). + * Set time out for flight plan to a huge minimum value (>=1 second). + * CPUs with many APs may take longer if there is contention for + * resources such as UART, so scale the time out up by increments of + * 100ms if needed. */ - const int timeout_us = 1000000; + const int timeout_us = MAX(1000000, 100000 * mp_params->num_cpus); const int step_us = 100; int num_aps = mp_params->num_cpus - 1; struct stopwatch sw;
9elements QA has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/38546 )
Change subject: cpu/x86: Make MP init timeout configurable ......................................................................
Patch Set 12:
Automatic boot test returned (PASS/FAIL/TOTAL): 3/0/3 Emulation targets: EMULATION_QEMU_X86_Q35 using payload TianoCore : SUCCESS : https://lava.9esec.io/r/427 EMULATION_QEMU_X86_Q35 using payload SeaBIOS : SUCCESS : https://lava.9esec.io/r/426 EMULATION_QEMU_X86_I440FX using payload SeaBIOS : SUCCESS : https://lava.9esec.io/r/425
Please note: This test is under development and might not be accurate at all!