Werner Zeh has uploaded this change for review. ( https://review.coreboot.org/c/coreboot/+/31545
Change subject: acpi: Sort the reported APIC-IDs in the MADT table ......................................................................
acpi: Sort the reported APIC-IDs in the MADT table
coreboot performs MP-Init in a parallel way. That leads to the fact that the order, in which the CPUs are woken up, can vary from boot to boot. The creation of the MADT table just parses the devicetree and takes the CPUs reported there as it is for creating the single local APIC entries. Therefore, The OS will see different order of CPUs. There are CPUs out there (like Apollo Lake for example) which have shared caches on core-level and if the order is random this can end up in assigning cores to different tasks or even OSes (in a virtual enviroinment) which uses the same cache. This in turn will produce performance penalties across these distributed tasks/OSes.
Though there is a way on discover the core- and cache-topology it will in the end be necessary to take the APIC-ID into account. To simplify it, one can achieve the same output by sorting the APIC-IDs in an ascending order. This will lead to the fact that CPUs that share a given cache will be reported right next to each other in the MADT.
Change-Id: Ida74f9f00a4e2a03107a2124014403de60462735 Signed-off-by: Werner Zeh werner.zeh@siemens.com --- M src/arch/x86/acpi.c 1 file changed, 8 insertions(+), 4 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/45/31545/1
diff --git a/src/arch/x86/acpi.c b/src/arch/x86/acpi.c index 0c85d3a..c7b9214 100644 --- a/src/arch/x86/acpi.c +++ b/src/arch/x86/acpi.c @@ -8,7 +8,7 @@ * Copyright (C) 2005-2009 coresystems GmbH * Copyright (C) 2015 Timothy Pearson tpearson@raptorengineeringinc.com, * Raptor Engineering - * Copyright (C) 2016-2017 Siemens AG + * Copyright (C) 2016-2019 Siemens AG * * ACPI FADT, FACS, and DSDT table support added by * Nick Barker nick.barker9@btinternet.com, and those portions @@ -47,6 +47,7 @@ #include <cpu/x86/lapic_def.h> #include <cpu/cpu.h> #include <cbfs.h> +#include <sort.h>
u8 acpi_checksum(u8 *table, u32 length) { @@ -148,7 +149,7 @@ unsigned long acpi_create_madt_lapics(unsigned long current) { struct device *cpu; - int index = 0; + int index, apic_ids[CONFIG_MAX_CPUS], num_cpus = 0;
for (cpu = all_devices; cpu; cpu = cpu->next) { if ((cpu->path.type != DEVICE_PATH_APIC) || @@ -157,9 +158,12 @@ } if (!cpu->enabled) continue; + apic_ids[num_cpus++] = cpu->path.apic.apic_id; + } + bubblesort(apic_ids, num_cpus, NUM_ASCENDING); + for (index = 0; index < num_cpus; index++) { current += acpi_create_madt_lapic((acpi_madt_lapic_t *)current, - index, cpu->path.apic.apic_id); - index++; + index, apic_ids[index]); }
return current;
Hello Aaron Durbin, ron minnich, build bot (Jenkins), Patrick Georgi,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/31545
to look at the new patch set (#3).
Change subject: acpi: Sort the reported APIC-IDs in the MADT table ......................................................................
acpi: Sort the reported APIC-IDs in the MADT table
coreboot performs MP-Init in a parallel way. That leads to the fact that the order, in which the CPUs are woken up, can vary from boot to boot. The creation of the MADT table just parses the devicetree and takes the CPUs reported there as it is for creating the single local APIC entries. Therefore, The OS will see different order of CPUs. There are CPUs out there (like Apollo Lake for example) which have shared caches on core-level and if the order is random this can end up in assigning cores to different tasks or even OSes (in a virtual enviroinment) which uses the same cache. This in turn will produce performance penalties across these distributed tasks/OSes.
Though there is a way on discover the core- and cache-topology it will in the end be necessary to take the APIC-ID into account. To simplify it, one can achieve the same output by sorting the APIC-IDs in an ascending order. This will lead to the fact that CPUs that share a given cache will be reported right next to each other in the MADT.
Change-Id: Ida74f9f00a4e2a03107a2124014403de60462735 Signed-off-by: Werner Zeh werner.zeh@siemens.com --- M src/arch/x86/acpi.c 1 file changed, 8 insertions(+), 4 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/45/31545/3
Paul Menzel has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/31545 )
Change subject: acpi: Sort the reported APIC-IDs in the MADT table ......................................................................
Patch Set 3:
(1 comment)
Does this have a time penalty?
https://review.coreboot.org/#/c/31545/3//COMMIT_MSG Commit Message:
https://review.coreboot.org/#/c/31545/3//COMMIT_MSG@17 PS3, Line 17: enviroinment environment
Angel Pons has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/31545 )
Change subject: acpi: Sort the reported APIC-IDs in the MADT table ......................................................................
Patch Set 3:
Patch Set 3:
(1 comment)
Does this have a time penalty?
I am pretty sure it is the one the sorting algorithm has, and it surely depends on the number of CPUs.
Hello Aaron Durbin, ron minnich, build bot (Jenkins), Patrick Georgi,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/31545
to look at the new patch set (#4).
Change subject: acpi: Sort the reported APIC-IDs in the MADT table ......................................................................
acpi: Sort the reported APIC-IDs in the MADT table
coreboot performs MP-Init in a parallel way. That leads to the fact that the order, in which the CPUs are woken up, can vary from boot to boot. The creation of the MADT table just parses the devicetree and takes the CPUs reported there as it is for creating the single local APIC entries. Therefore, the OS will see different order of CPUs. There are CPUs out there (like Apollo Lake for example) which have shared caches on core-level and if the order is random this can end up in assigning cores to different tasks or even OSes (in a virtual environment) which uses the same cache. This in turn will produce performance penalties across these distributed tasks/OSes.
Though there is a way to discover the core- and cache-topology it will in the end be necessary to take the APIC-ID into account. To simplify it, one can achieve the same output by sorting the APIC-IDs in an ascending order. This will lead to the fact that CPUs that share a given cache will be reported right next to each other in the MADT.
Change-Id: Ida74f9f00a4e2a03107a2124014403de60462735 Signed-off-by: Werner Zeh werner.zeh@siemens.com --- M src/arch/x86/acpi.c 1 file changed, 8 insertions(+), 4 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/45/31545/4
Patrick Georgi has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/31545 )
Change subject: acpi: Sort the reported APIC-IDs in the MADT table ......................................................................
Patch Set 4: Code-Review+1
(1 comment)
Patch Set 3:
Patch Set 3:
(1 comment)
Does this have a time penalty?
I am pretty sure it is the one the sorting algorithm has, and it surely depends on the number of CPUs.
It's bound by CONFIG_MAX_CPUS which (AFAICS) is at most 48 in our tree, in serengeti_cheetah_fam10. That would mean ~2500 swaps at most in highly localized code and data.
I guess we can deal with that when it becomes an issue? I have the suspicion that using timsort (another stable sort) won't fare much better on our data set because of setup and copying costs (whereas bubble sort works in-place).
https://review.coreboot.org/#/c/31545/4/src/arch/x86/acpi.c File src/arch/x86/acpi.c:
https://review.coreboot.org/#/c/31545/4/src/arch/x86/acpi.c@162 PS4, Line 162: apic_ids add a bounds check for the array. It _should_ never happen, but...
Werner Zeh has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/31545 )
Change subject: acpi: Sort the reported APIC-IDs in the MADT table ......................................................................
Patch Set 4:
(1 comment)
Patch Set 4: Code-Review+1
(1 comment)
Patch Set 3:
Patch Set 3:
(1 comment)
Does this have a time penalty?
I am pretty sure it is the one the sorting algorithm has, and it surely depends on the number of CPUs.
It's bound by CONFIG_MAX_CPUS which (AFAICS) is at most 48 in our tree, in serengeti_cheetah_fam10. That would mean ~2500 swaps at most in highly localized code and data.
I guess we can deal with that when it becomes an issue? I have the suspicion that using timsort (another stable sort) won't fare much better on our data set because of setup and copying costs (whereas bubble sort works in-place).
Yes, it highliy depends on CONFIG_MAX_CPUS. Even if we hit the worst case (O(n²)) this would mean 2304 swaps. I gues on modern CPUs this should not be an issue and we can deal with it once it will become an issue. The benefit of having bubble sort is as you mentioned that it works in-place. No need to copy data while sorting. And with the added abort condition real-life runtime should't be that bad.
https://review.coreboot.org/#/c/31545/4/src/arch/x86/acpi.c File src/arch/x86/acpi.c:
https://review.coreboot.org/#/c/31545/4/src/arch/x86/acpi.c@162 PS4, Line 162: apic_ids
add a bounds check for the array. It _should_ never happen, but...
There is a bounds check already in the MP-Init code (the one I have looked into) when the cpu entries are generated. But there might be cases where it is missing. I will add a check.
Hello Aaron Durbin, ron minnich, build bot (Jenkins), Patrick Georgi,
I'd like you to reexamine a change. Please visit
https://review.coreboot.org/c/coreboot/+/31545
to look at the new patch set (#5).
Change subject: acpi: Sort the reported APIC-IDs in the MADT table ......................................................................
acpi: Sort the reported APIC-IDs in the MADT table
coreboot performs MP-Init in a parallel way. That leads to the fact that the order, in which the CPUs are woken up, can vary from boot to boot. The creation of the MADT table just parses the devicetree and takes the CPUs reported there as it is for creating the single local APIC entries. Therefore, the OS will see different order of CPUs. There are CPUs out there (like Apollo Lake for example) which have shared caches on core-level and if the order is random this can end up in assigning cores to different tasks or even OSes (in a virtual environment) which uses the same cache. This in turn will produce performance penalties across these distributed tasks/OSes.
Though there is a way to discover the core- and cache-topology it will in the end be necessary to take the APIC-ID into account. To simplify it, one can achieve the same output by sorting the APIC-IDs in an ascending order. This will lead to the fact that CPUs that share a given cache will be reported right next to each other in the MADT.
Change-Id: Ida74f9f00a4e2a03107a2124014403de60462735 Signed-off-by: Werner Zeh werner.zeh@siemens.com --- M src/arch/x86/acpi.c 1 file changed, 10 insertions(+), 4 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/45/31545/5
Patrick Georgi has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/31545 )
Change subject: acpi: Sort the reported APIC-IDs in the MADT table ......................................................................
Patch Set 5: Code-Review+2
Paul Menzel has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/31545 )
Change subject: acpi: Sort the reported APIC-IDs in the MADT table ......................................................................
Patch Set 5: Code-Review+1
Patch Set 4:
Does this have a time penalty?
I am pretty sure it is the one the sorting algorithm has, and it surely depends on the number of CPUs.
It's bound by CONFIG_MAX_CPUS which (AFAICS) is at most 48 in our tree, in serengeti_cheetah_fam10. That would mean ~2500 swaps at most in highly localized code and data.
I guess we can deal with that when it becomes an issue? I have the suspicion that using timsort (another stable sort) won't fare much better on our data set because of setup and copying costs (whereas bubble sort works in-place).
Yes, it highliy depends on CONFIG_MAX_CPUS. Even if we hit the worst case (O(n²)) this would mean 2304 swaps. I gues on modern CPUs this should not be an issue and we can deal with it once it will become an issue. The benefit of having bubble sort is as you mentioned that it works in-place. No need to copy data while sorting. And with the added abort condition real-life runtime should't be that bad.
Thank you for your responses, and sorry for being unclear. I just wanted to know, if it is noticeable in the output of `cbmem -t`.
Patrick Georgi has submitted this change and it was merged. ( https://review.coreboot.org/c/coreboot/+/31545 )
Change subject: acpi: Sort the reported APIC-IDs in the MADT table ......................................................................
acpi: Sort the reported APIC-IDs in the MADT table
coreboot performs MP-Init in a parallel way. That leads to the fact that the order, in which the CPUs are woken up, can vary from boot to boot. The creation of the MADT table just parses the devicetree and takes the CPUs reported there as it is for creating the single local APIC entries. Therefore, the OS will see different order of CPUs. There are CPUs out there (like Apollo Lake for example) which have shared caches on core-level and if the order is random this can end up in assigning cores to different tasks or even OSes (in a virtual environment) which uses the same cache. This in turn will produce performance penalties across these distributed tasks/OSes.
Though there is a way to discover the core- and cache-topology it will in the end be necessary to take the APIC-ID into account. To simplify it, one can achieve the same output by sorting the APIC-IDs in an ascending order. This will lead to the fact that CPUs that share a given cache will be reported right next to each other in the MADT.
Change-Id: Ida74f9f00a4e2a03107a2124014403de60462735 Signed-off-by: Werner Zeh werner.zeh@siemens.com Reviewed-on: https://review.coreboot.org/c/31545 Tested-by: build bot (Jenkins) no-reply@coreboot.org Reviewed-by: Patrick Georgi pgeorgi@google.com Reviewed-by: Paul Menzel paulepanter@users.sourceforge.net --- M src/arch/x86/acpi.c 1 file changed, 10 insertions(+), 4 deletions(-)
Approvals: build bot (Jenkins): Verified Patrick Georgi: Looks good to me, approved Paul Menzel: Looks good to me, but someone else must approve
diff --git a/src/arch/x86/acpi.c b/src/arch/x86/acpi.c index f4c7d86..e06c3ad 100644 --- a/src/arch/x86/acpi.c +++ b/src/arch/x86/acpi.c @@ -8,7 +8,7 @@ * Copyright (C) 2005-2009 coresystems GmbH * Copyright (C) 2015 Timothy Pearson tpearson@raptorengineeringinc.com, * Raptor Engineering - * Copyright (C) 2016-2017 Siemens AG + * Copyright (C) 2016-2019 Siemens AG * * ACPI FADT, FACS, and DSDT table support added by * Nick Barker nick.barker9@btinternet.com, and those portions @@ -48,6 +48,7 @@ #include <cpu/cpu.h> #include <cbfs.h> #include <version.h> +#include <commonlib/sort.h>
u8 acpi_checksum(u8 *table, u32 length) { @@ -149,7 +150,7 @@ unsigned long acpi_create_madt_lapics(unsigned long current) { struct device *cpu; - int index = 0; + int index, apic_ids[CONFIG_MAX_CPUS], num_cpus = 0;
for (cpu = all_devices; cpu; cpu = cpu->next) { if ((cpu->path.type != DEVICE_PATH_APIC) || @@ -158,9 +159,14 @@ } if (!cpu->enabled) continue; + if (num_cpus >= ARRAY_SIZE(apic_ids)) + break; + apic_ids[num_cpus++] = cpu->path.apic.apic_id; + } + bubblesort(apic_ids, num_cpus, NUM_ASCENDING); + for (index = 0; index < num_cpus; index++) { current += acpi_create_madt_lapic((acpi_madt_lapic_t *)current, - index, cpu->path.apic.apic_id); - index++; + index, apic_ids[index]); }
return current;
Arthur Heymans has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/31545 )
Change subject: acpi: Sort the reported APIC-IDs in the MADT table ......................................................................
Patch Set 7:
(1 comment)
https://review.coreboot.org/#/c/31545/7/src/arch/x86/acpi.c File src/arch/x86/acpi.c:
https://review.coreboot.org/#/c/31545/7/src/arch/x86/acpi.c@166 PS7, Line 166: num_cpus check if > 0. Some older compilers (gcc6) don't like this potentially being 0.
Werner Zeh has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/31545 )
Change subject: acpi: Sort the reported APIC-IDs in the MADT table ......................................................................
Patch Set 7:
(1 comment)
https://review.coreboot.org/#/c/31545/7/src/arch/x86/acpi.c File src/arch/x86/acpi.c:
https://review.coreboot.org/#/c/31545/7/src/arch/x86/acpi.c@166 PS7, Line 166: num_cpus
check if > 0. Some older compilers (gcc6) don't like this potentially being 0.
See https://review.coreboot.org/c/coreboot/+/31853