- Halt if number of started cpus are more then expected - Allow bios to detect if qemu has cpus status map for acpi hotplug - Take in account hot(un)plugged cpus on reboot
Related qemu patches I'll post to qemu-devel
Signed-off-by: Igor Mammedov imammedo@redhat.com
Reduce amount of consumed cpu time (i.e. don't spin forever) if number of started cpus are more then expected. And print a bug message into debug port.
Signed-off-by: Igor Mammedov imammedo@redhat.com --- src/smp.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/src/smp.c b/src/smp.c index 8c077a1..9933ac6 100644 --- a/src/smp.c +++ b/src/smp.c @@ -115,8 +115,14 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT); - while (cmos_smp_count + 1 != readl(&CountCPUs)) + while (cmos_smp_count + 1 != readl(&CountCPUs)) { + if (cmos_smp_count + 1 < readl(&CountCPUs)) { + dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n", + cmos_smp_count + 1, readl(&CountCPUs)); + hlt(); + } yield(); + } }
// Restore memory.
Signed-off-by: Igor Mammedov imammedo@redhat.com --- src/paravirt.c | 12 ++++++++++++ src/paravirt.h | 2 ++ 2 files changed, 14 insertions(+), 0 deletions(-)
diff --git a/src/paravirt.c b/src/paravirt.c index 9cf77de..c2bd0a8 100644 --- a/src/paravirt.c +++ b/src/paravirt.c @@ -305,6 +305,18 @@ u16 qemu_cfg_get_max_cpus(void) return cnt; }
+u16 qemu_cfg_have_acpi_cpus_map(void) +{ + u16 cnt; + + if (!qemu_cfg_present) + return 0; + + qemu_cfg_read_entry(&cnt, QEMU_CFG_HAVE_ACPI_CPUS_MAP, sizeof(cnt)); + + return cnt; +} + static QemuCfgFile LastFile;
static u32 diff --git a/src/paravirt.h b/src/paravirt.h index 9674089..df69f49 100644 --- a/src/paravirt.h +++ b/src/paravirt.h @@ -35,6 +35,7 @@ static inline int kvm_para_available(void) #define QEMU_CFG_BOOT_MENU 0x0e #define QEMU_CFG_MAX_CPUS 0x0f #define QEMU_CFG_FILE_DIR 0x19 +#define QEMU_CFG_HAVE_ACPI_CPUS_MAP 0x20 #define QEMU_CFG_ARCH_LOCAL 0x8000 #define QEMU_CFG_ACPI_TABLES (QEMU_CFG_ARCH_LOCAL + 0) #define QEMU_CFG_SMBIOS_ENTRIES (QEMU_CFG_ARCH_LOCAL + 1) @@ -57,6 +58,7 @@ int qemu_cfg_smbios_load_external(int type, char **p, unsigned *nr_structs, int qemu_cfg_get_numa_nodes(void); void qemu_cfg_get_numa_data(u64 *data, int n); u16 qemu_cfg_get_max_cpus(void); +u16 qemu_cfg_have_acpi_cpus_map(void);
typedef struct QemuCfgFile { u32 size; /* file size */
Initial count of active cpus is communicated to bios from qemu via CMOS_BIOS_SMP_COUNT io port. However if cpus are hotplugged after boot and then guest is rebooted without taking down qemu then bios might be stuck at smp_probe while (cmos_smp_count + 1 != readl(&CountCPUs)) yield(); where cmos_smp_count + 1 will be less that CountCPUs due to additional hotplugged cpus.
One way to fix bug is to take in account hotplugged cpus and count online cpus in cpu status bitmap that qemu provides at 0xaf00 and bios uses for ACPI cpu hotplug in acpi-dsdt.dsl.
Alternative ways to fix issue was disscussed on following thread: http://www.seabios.org/pipermail/seabios/2011-August/002147.html without any conclusion
Rationale why counting cpus in cpu_sts bitmap at 0xaf00 might be better that updating CMOS_BIOS_SMP_COUNT in qemu. feeble one: we are already relying on cpu_sts at 0xaf00 for ACPI cpu hotplug machinery and it seems that there is no standard way to pass this info (so we are now using "board extension"). 2nd: another possible use for cpu_sts bitmap is when cpus are able to be (hot)plugged in nonconsecutive order, MADT + CPON package should be build taking in account info about which cpus are (un)plugged.
v2 changes: - access cpu_sts only if qemu advertise its support - unconditionally use acpi_cpu_online_count if available to cover unplug case as well
Signed-off-by: Igor Mammedov imammedo@redhat.com --- src/smp.c | 23 +++++++++++++++++++++++ 1 files changed, 23 insertions(+), 0 deletions(-)
diff --git a/src/smp.c b/src/smp.c index 9933ac6..835b766 100644 --- a/src/smp.c +++ b/src/smp.c @@ -17,6 +17,9 @@
#define APIC_ENABLED 0x0100
+#define ACPI_CPU_STATUS_MAP 0xaf00 +#define ACPI_CPU_STATUS_MAP_SZ 32 + struct { u32 ecx, eax, edx; } smp_mtrr[32] VAR16VISIBLE; u32 smp_mtrr_count VAR16VISIBLE;
@@ -115,6 +118,26 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT); + dprintf(1, "Powered-on with %d cpu(s)\n", cmos_smp_count + 1); + + if (qemu_cfg_have_acpi_cpus_map()) { + u8 i = 0, acpi_cpu_online_count = 0; + /* count plugged in cpus in acpi PRST bitmap */ + while (i < ACPI_CPU_STATUS_MAP_SZ) { + u8 j = 0, status = inb(ACPI_CPU_STATUS_MAP + (i++)); + while (j < 8) + if ((status >> j++) & 1) + ++acpi_cpu_online_count; + } + + dprintf(1, "Counted %d present cpu(s) in ACPI cpus status map\n", + acpi_cpu_online_count); + /* if cpu(s) were hot(un)plugged then on reboot + * we should wait for an actual cpus number, so use + * acpi_cpu_online_count instead of cmos_smp_count */ + cmos_smp_count = acpi_cpu_online_count - 1; + } + while (cmos_smp_count + 1 != readl(&CountCPUs)) { if (cmos_smp_count + 1 < readl(&CountCPUs)) { dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n",
On Sat, Mar 10, 2012 at 12:47:26PM +0100, Igor Mammedov wrote:
Reduce amount of consumed cpu time (i.e. don't spin forever) if number of started cpus are more then expected. And print a bug message into debug port.
Signed-off-by: Igor Mammedov imammedo@redhat.com
src/smp.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/src/smp.c b/src/smp.c index 8c077a1..9933ac6 100644 --- a/src/smp.c +++ b/src/smp.c @@ -115,8 +115,14 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
while (cmos_smp_count + 1 != readl(&CountCPUs))
while (cmos_smp_count + 1 != readl(&CountCPUs)) {
if (cmos_smp_count + 1 < readl(&CountCPUs)) {
dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n",
cmos_smp_count + 1, readl(&CountCPUs));
Shouldn't we print it to the console too?
hlt();
} yield();
}
}
// Restore memory.
-- 1.7.7.6
-- Gleb.
On Sat, Mar 10, 2012 at 12:47:28PM +0100, Igor Mammedov wrote:
Initial count of active cpus is communicated to bios from qemu via CMOS_BIOS_SMP_COUNT io port. However if cpus are hotplugged after boot and then guest is rebooted without taking down qemu then bios might be stuck at smp_probe while (cmos_smp_count + 1 != readl(&CountCPUs)) yield(); where cmos_smp_count + 1 will be less that CountCPUs due to additional hotplugged cpus.
One way to fix bug is to take in account hotplugged cpus and count online cpus in cpu status bitmap that qemu provides at 0xaf00 and bios uses for ACPI cpu hotplug in acpi-dsdt.dsl.
Alternative ways to fix issue was disscussed on following thread: http://www.seabios.org/pipermail/seabios/2011-August/002147.html without any conclusion
Rationale why counting cpus in cpu_sts bitmap at 0xaf00 might be better that updating CMOS_BIOS_SMP_COUNT in qemu. feeble one: we are already relying on cpu_sts at 0xaf00 for ACPI cpu hotplug machinery and it seems that there is no standard way to pass this info (so we are now using "board extension"). 2nd: another possible use for cpu_sts bitmap is when cpus are able to be (hot)plugged in nonconsecutive order, MADT + CPON package should be build taking in account info about which cpus are (un)plugged.
v2 changes:
- access cpu_sts only if qemu advertise its support
- unconditionally use acpi_cpu_online_count if available to cover unplug case as well
Why not update cmos in qemu during cpu hot plug instead?
Signed-off-by: Igor Mammedov imammedo@redhat.com
src/smp.c | 23 +++++++++++++++++++++++ 1 files changed, 23 insertions(+), 0 deletions(-)
diff --git a/src/smp.c b/src/smp.c index 9933ac6..835b766 100644 --- a/src/smp.c +++ b/src/smp.c @@ -17,6 +17,9 @@
#define APIC_ENABLED 0x0100
+#define ACPI_CPU_STATUS_MAP 0xaf00 +#define ACPI_CPU_STATUS_MAP_SZ 32
struct { u32 ecx, eax, edx; } smp_mtrr[32] VAR16VISIBLE; u32 smp_mtrr_count VAR16VISIBLE;
@@ -115,6 +118,26 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
dprintf(1, "Powered-on with %d cpu(s)\n", cmos_smp_count + 1);
if (qemu_cfg_have_acpi_cpus_map()) {
u8 i = 0, acpi_cpu_online_count = 0;
/* count plugged in cpus in acpi PRST bitmap */
while (i < ACPI_CPU_STATUS_MAP_SZ) {
u8 j = 0, status = inb(ACPI_CPU_STATUS_MAP + (i++));
while (j < 8)
if ((status >> j++) & 1)
++acpi_cpu_online_count;
}
dprintf(1, "Counted %d present cpu(s) in ACPI cpus status map\n",
acpi_cpu_online_count);
/* if cpu(s) were hot(un)plugged then on reboot
* we should wait for an actual cpus number, so use
* acpi_cpu_online_count instead of cmos_smp_count */
cmos_smp_count = acpi_cpu_online_count - 1;
}
while (cmos_smp_count + 1 != readl(&CountCPUs)) { if (cmos_smp_count + 1 < readl(&CountCPUs)) { dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n",
-- 1.7.7.6
-- Gleb.
On 03/11/2012 11:36 AM, Gleb Natapov wrote:
On Sat, Mar 10, 2012 at 12:47:26PM +0100, Igor Mammedov wrote:
Reduce amount of consumed cpu time (i.e. don't spin forever) if number of started cpus are more then expected. And print a bug message into debug port.
Signed-off-by: Igor Mammedovimammedo@redhat.com
src/smp.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/src/smp.c b/src/smp.c index 8c077a1..9933ac6 100644 --- a/src/smp.c +++ b/src/smp.c @@ -115,8 +115,14 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
while (cmos_smp_count + 1 != readl(&CountCPUs))
while (cmos_smp_count + 1 != readl(&CountCPUs)) {
if (cmos_smp_count + 1< readl(&CountCPUs)) {
dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n",
cmos_smp_count + 1, readl(&CountCPUs));
Shouldn't we print it to the console too?
I'm not sure if it's possible to print at this stage, but I'll try and re-post if it works.
hlt();
} yield();
} } // Restore memory.
-- 1.7.7.6
-- Gleb.
On Mon, Mar 12, 2012 at 10:52:18AM +0100, Igor Mammedov wrote:
On 03/11/2012 11:36 AM, Gleb Natapov wrote:
On Sat, Mar 10, 2012 at 12:47:26PM +0100, Igor Mammedov wrote:
Reduce amount of consumed cpu time (i.e. don't spin forever) if number of started cpus are more then expected. And print a bug message into debug port.
Signed-off-by: Igor Mammedovimammedo@redhat.com
src/smp.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/src/smp.c b/src/smp.c index 8c077a1..9933ac6 100644 --- a/src/smp.c +++ b/src/smp.c @@ -115,8 +115,14 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
while (cmos_smp_count + 1 != readl(&CountCPUs))
while (cmos_smp_count + 1 != readl(&CountCPUs)) {
if (cmos_smp_count + 1< readl(&CountCPUs)) {
dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n",
cmos_smp_count + 1, readl(&CountCPUs));
Shouldn't we print it to the console too?
I'm not sure if it's possible to print at this stage, but I'll try and re-post if it works.
If it is not possible currently may be we should move smp detection to later stage. Or do not hlt here and report error later. The easier for a user to see this error the easier it is for us to debug user's problems.
hlt();
} yield();
}
}
// Restore memory.
-- 1.7.7.6
-- Gleb.
--
Igor
-- Gleb.
On 03/12/2012 11:14 AM, Gleb Natapov wrote:
On Mon, Mar 12, 2012 at 10:52:18AM +0100, Igor Mammedov wrote:
On 03/11/2012 11:36 AM, Gleb Natapov wrote:
On Sat, Mar 10, 2012 at 12:47:26PM +0100, Igor Mammedov wrote:
Reduce amount of consumed cpu time (i.e. don't spin forever) if number of started cpus are more then expected. And print a bug message into debug port.
Signed-off-by: Igor Mammedovimammedo@redhat.com
src/smp.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/src/smp.c b/src/smp.c index 8c077a1..9933ac6 100644 --- a/src/smp.c +++ b/src/smp.c @@ -115,8 +115,14 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
while (cmos_smp_count + 1 != readl(&CountCPUs))
while (cmos_smp_count + 1 != readl(&CountCPUs)) {
if (cmos_smp_count + 1< readl(&CountCPUs)) {
dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n",
cmos_smp_count + 1, readl(&CountCPUs));
Shouldn't we print it to the console too?
I'm not sure if it's possible to print at this stage, but I'll try and re-post if it works.
If it is not possible currently may be we should move smp detection to later stage. Or do not hlt here and report error later. The easier for a user to see this error the easier it is for us to debug user's problems.
It looks safe to move vga_setup() right after pci_setup() in post(). However I can't tell what structures various option roms might access so I'm not completely sure if it right thing to do.
Does something like this look good to you? (kvm guest boots without any issue and error prints as well if condition is met)
diff --git a/src/post.c b/src/post.c index 43523dd..9358b49 100644 --- a/src/post.c +++ b/src/post.c @@ -208,6 +208,9 @@ post(void) pci_setup(); smm_init();
+ // Run vga option rom + vga_setup(); + // Initialize internal tables boot_setup(); drive_setup(); @@ -227,9 +230,6 @@ post(void) mouse_setup(); init_bios_tables();
- // Run vga option rom - vga_setup(); - // Do hardware initialization (if running synchronously) if (!CONFIG_THREADS || !CONFIG_THREAD_OPTIONROMS) { init_hw(); diff --git a/src/smp.c b/src/smp.c index 42184a4..b11cb95 100644 --- a/src/smp.c +++ b/src/smp.c @@ -138,8 +138,9 @@ smp_probe(void)
while (cmos_smp_count + 1 != readl(&CountCPUs)) { if (cmos_smp_count + 1 < readl(&CountCPUs)) { - dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n", + printf("ERROR: Expected %d cpu(s) but %d cpus started.\n", cmos_smp_count + 1, readl(&CountCPUs)); + printf("Machine halted.\n"); hlt(); } yield();
On Mon, Mar 12, 2012 at 02:09:02PM +0100, Igor Mammedov wrote:
On 03/12/2012 11:14 AM, Gleb Natapov wrote:
On Mon, Mar 12, 2012 at 10:52:18AM +0100, Igor Mammedov wrote:
On 03/11/2012 11:36 AM, Gleb Natapov wrote:
On Sat, Mar 10, 2012 at 12:47:26PM +0100, Igor Mammedov wrote:
Reduce amount of consumed cpu time (i.e. don't spin forever) if number of started cpus are more then expected. And print a bug message into debug port.
Signed-off-by: Igor Mammedovimammedo@redhat.com
src/smp.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/src/smp.c b/src/smp.c index 8c077a1..9933ac6 100644 --- a/src/smp.c +++ b/src/smp.c @@ -115,8 +115,14 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
while (cmos_smp_count + 1 != readl(&CountCPUs))
while (cmos_smp_count + 1 != readl(&CountCPUs)) {
if (cmos_smp_count + 1< readl(&CountCPUs)) {
dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n",
cmos_smp_count + 1, readl(&CountCPUs));
Shouldn't we print it to the console too?
I'm not sure if it's possible to print at this stage, but I'll try and re-post if it works.
If it is not possible currently may be we should move smp detection to later stage. Or do not hlt here and report error later. The easier for a user to see this error the easier it is for us to debug user's problems.
It looks safe to move vga_setup() right after pci_setup() in post(). However I can't tell what structures various option roms might access so I'm not completely sure if it right thing to do.
Well you move it above "Setup interfaces that option roms may need" comment, so we can be sure it is not safe :)
I think moving smp detection to later stage should be safer. But now I am not even sure we should halt at all. May be warn the user and continue booting with cmos_smp_count = CountCPUs?
Does something like this look good to you? (kvm guest boots without any issue and error prints as well if condition is met)
diff --git a/src/post.c b/src/post.c index 43523dd..9358b49 100644 --- a/src/post.c +++ b/src/post.c @@ -208,6 +208,9 @@ post(void) pci_setup(); smm_init();
- // Run vga option rom
- vga_setup();
- // Initialize internal tables boot_setup(); drive_setup();
@@ -227,9 +230,6 @@ post(void) mouse_setup(); init_bios_tables();
- // Run vga option rom
- vga_setup();
- // Do hardware initialization (if running synchronously) if (!CONFIG_THREADS || !CONFIG_THREAD_OPTIONROMS) { init_hw();
diff --git a/src/smp.c b/src/smp.c index 42184a4..b11cb95 100644 --- a/src/smp.c +++ b/src/smp.c @@ -138,8 +138,9 @@ smp_probe(void)
while (cmos_smp_count + 1 != readl(&CountCPUs)) { if (cmos_smp_count + 1 < readl(&CountCPUs)) {
dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n",
printf("ERROR: Expected %d cpu(s) but %d cpus started.\n", cmos_smp_count + 1, readl(&CountCPUs));
printf("Machine halted.\n"); hlt(); } yield();
--
Igor
-- Gleb.
On 03/12/2012 02:27 PM, Gleb Natapov wrote:
On Mon, Mar 12, 2012 at 02:09:02PM +0100, Igor Mammedov wrote:
On 03/12/2012 11:14 AM, Gleb Natapov wrote:
On Mon, Mar 12, 2012 at 10:52:18AM +0100, Igor Mammedov wrote:
On 03/11/2012 11:36 AM, Gleb Natapov wrote:
On Sat, Mar 10, 2012 at 12:47:26PM +0100, Igor Mammedov wrote:
Reduce amount of consumed cpu time (i.e. don't spin forever) if number of started cpus are more then expected. And print a bug message into debug port.
Signed-off-by: Igor Mammedovimammedo@redhat.com
src/smp.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/src/smp.c b/src/smp.c index 8c077a1..9933ac6 100644 --- a/src/smp.c +++ b/src/smp.c @@ -115,8 +115,14 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
while (cmos_smp_count + 1 != readl(&CountCPUs))
while (cmos_smp_count + 1 != readl(&CountCPUs)) {
if (cmos_smp_count + 1< readl(&CountCPUs)) {
dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n",
cmos_smp_count + 1, readl(&CountCPUs));
Shouldn't we print it to the console too?
I'm not sure if it's possible to print at this stage, but I'll try and re-post if it works.
If it is not possible currently may be we should move smp detection to later stage. Or do not hlt here and report error later. The easier for a user to see this error the easier it is for us to debug user's problems.
It looks safe to move vga_setup() right after pci_setup() in post(). However I can't tell what structures various option roms might access so I'm not completely sure if it right thing to do.
Well you move it above "Setup interfaces that option roms may need" comment, so we can be sure it is not safe :)
Yep, sorry my bad.
I think moving smp detection to later stage should be safer. But now I am
we can't move it past init_bios_tables() and this function is is under comment "// Setup interfaces that option roms may need" so I'm not sure we can move vga_setup before it.
not even sure we should halt at all. May be warn the user and continue
We need to stop because otherwise user won't event notice that there was an error and only later may notice not expected cpus count in OS. BIOSes usually stop in case of error and you have to override this behavior in settings (something like this: boot if keyboard is not present).
Perhaps we could safely clamp CountCPUs to 1 + alter smp_ap_boot_code() to stop counting in case of error and continue to boot till we will be able to print error message and stop after that. Yep, error message is nice to have but I'm not sure if this one case is worth of inventing deferred print infrastructure. Support personnel might just enable logging from debug port if BIOS is suspected (it is easy to spot, we hang right on boot). That's what they have to do now in case of any BIOS related problems.
booting with cmos_smp_count = CountCPUs?
I guess is was this way some time ago 31bfad632b0ca and with large cpu counts it wasn't reliable to wait XX ms while all cpus initialize, that why cmos_smp_count was introduced and bios waits on exact count.
The halt condition is meant for buggy emulators so if we get more cpus then expected it is safer to just stop then experience inexplicable bugs later (I can imagine having mp, smbios and acpi tables built with different CountCPUs, and maybe there is other dependencies).
-- Gleb.
On Mon, Mar 12, 2012 at 05:04:48PM +0100, Igor Mammedov wrote:
On 03/12/2012 02:27 PM, Gleb Natapov wrote:
On Mon, Mar 12, 2012 at 02:09:02PM +0100, Igor Mammedov wrote:
On 03/12/2012 11:14 AM, Gleb Natapov wrote:
On Mon, Mar 12, 2012 at 10:52:18AM +0100, Igor Mammedov wrote:
On 03/11/2012 11:36 AM, Gleb Natapov wrote:
On Sat, Mar 10, 2012 at 12:47:26PM +0100, Igor Mammedov wrote: >Reduce amount of consumed cpu time (i.e. don't spin forever) if >number of started cpus are more then expected. And print a bug >message into debug port. > >Signed-off-by: Igor Mammedovimammedo@redhat.com >--- > src/smp.c | 8 +++++++- > 1 files changed, 7 insertions(+), 1 deletions(-) > >diff --git a/src/smp.c b/src/smp.c >index 8c077a1..9933ac6 100644 >--- a/src/smp.c >+++ b/src/smp.c >@@ -115,8 +115,14 @@ smp_probe(void) > msleep(10); > } else { > u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT); >- while (cmos_smp_count + 1 != readl(&CountCPUs)) >+ while (cmos_smp_count + 1 != readl(&CountCPUs)) { >+ if (cmos_smp_count + 1< readl(&CountCPUs)) { >+ dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n", >+ cmos_smp_count + 1, readl(&CountCPUs)); Shouldn't we print it to the console too?
I'm not sure if it's possible to print at this stage, but I'll try and re-post if it works.
If it is not possible currently may be we should move smp detection to later stage. Or do not hlt here and report error later. The easier for a user to see this error the easier it is for us to debug user's problems.
It looks safe to move vga_setup() right after pci_setup() in post(). However I can't tell what structures various option roms might access so I'm not completely sure if it right thing to do.
Well you move it above "Setup interfaces that option roms may need" comment, so we can be sure it is not safe :)
Yep, sorry my bad.
I think moving smp detection to later stage should be safer. But now I am
we can't move it past init_bios_tables() and this function is is under comment "// Setup interfaces that option roms may need" so I'm not sure we can move vga_setup before it.
I think init_bios_tables() is not required for option rom to work. Not sure though. Kevin?
not even sure we should halt at all. May be warn the user and continue
We need to stop because otherwise user won't event notice that there was an error and only later may notice not expected cpus count in OS. BIOSes usually stop in case of error and you have to override this behavior in settings (something like this: boot if keyboard is not present).
Perhaps we could safely clamp CountCPUs to 1 + alter smp_ap_boot_code() to stop counting in case of error and continue to boot till we will be able to print error message and stop after that. Yep, error message is nice to have but I'm not sure if this one case is worth of inventing deferred print infrastructure. Support personnel might just enable logging from debug port if BIOS is suspected (it is easy to spot, we hang right on boot). That's what they have to do now in case of any BIOS related problems.
Since this error should not really happen IRL I agree we should not spend to much time on new infrastructure.
booting with cmos_smp_count = CountCPUs?
I guess is was this way some time ago 31bfad632b0ca and with large cpu counts it wasn't reliable to wait XX ms while all cpus initialize, that why cmos_smp_count was introduced and bios waits on exact count.
The halt condition is meant for buggy emulators so if we get more cpus then expected it is safer to just stop then experience inexplicable bugs later (I can imagine having mp, smbios and acpi tables built with different CountCPUs, and maybe there is other dependencies).
-- Gleb.
--
Igor
-- Gleb.
On 03/11/2012 11:58 AM, Gleb Natapov wrote:
On Sat, Mar 10, 2012 at 12:47:28PM +0100, Igor Mammedov wrote:
Initial count of active cpus is communicated to bios from qemu via CMOS_BIOS_SMP_COUNT io port. However if cpus are hotplugged after boot and then guest is rebooted without taking down qemu then bios might be stuck at smp_probe while (cmos_smp_count + 1 != readl(&CountCPUs)) yield(); where cmos_smp_count + 1 will be less that CountCPUs due to additional hotplugged cpus.
One way to fix bug is to take in account hotplugged cpus and count online cpus in cpu status bitmap that qemu provides at 0xaf00 and bios uses for ACPI cpu hotplug in acpi-dsdt.dsl.
Alternative ways to fix issue was disscussed on following thread: http://www.seabios.org/pipermail/seabios/2011-August/002147.html without any conclusion
Rationale why counting cpus in cpu_sts bitmap at 0xaf00 might be better that updating CMOS_BIOS_SMP_COUNT in qemu. feeble one: we are already relying on cpu_sts at 0xaf00 for ACPI cpu hotplug machinery and it seems that there is no standard way to pass this info (so we are now using "board extension"). 2nd: another possible use for cpu_sts bitmap is when cpus are able to be (hot)plugged in nonconsecutive order, MADT + CPON package should be build taking in account info about which cpus are (un)plugged.
v2 changes:
- access cpu_sts only if qemu advertise its support
- unconditionally use acpi_cpu_online_count if available to cover unplug case as well
Why not update cmos in qemu during cpu hot plug instead?
In mail thread mentioned above it was proposed to do so but it didn't get any response: "
On 2011-08-03 12:07, Vasilis Liaskovitis wrote: An alternative to this patch would be to update the smp_cpus variable in qemu-kvm and do a "cmos update" to 0x5f from the cpu-hotplug code. Access to the rtc_state (cmos device) would be required in hw/acpi_piix4.c. This way no change to Seabios would be required.
"
Sure it' possible to do this in qemu but essentially it will duplicate count of cpus in cpu_sts. And if we are going to have some day ability to plug/unplug individual cpus then we need to have access to cpu_sts map to build correct mp/smbios/acpi tables. So why not to use cpu_sts for counting cpus if available and leave CMOS_BIOS_SMP_COUNT for compatibility reason. And later when (un)plugging specific cpus will be ready we could use the same map for correct initialization of various BIOS tables.
I've wrote this patch because of in perspective cpu_sts map can be used not only for counting cpus, otherwise I'd just try the way of '"cmos update" to 0x5f from the cpu-hotplug code' as the simplest way.
Signed-off-by: Igor Mammedovimammedo@redhat.com
src/smp.c | 23 +++++++++++++++++++++++ 1 files changed, 23 insertions(+), 0 deletions(-)
diff --git a/src/smp.c b/src/smp.c index 9933ac6..835b766 100644 --- a/src/smp.c +++ b/src/smp.c @@ -17,6 +17,9 @@
#define APIC_ENABLED 0x0100
+#define ACPI_CPU_STATUS_MAP 0xaf00 +#define ACPI_CPU_STATUS_MAP_SZ 32
- struct { u32 ecx, eax, edx; } smp_mtrr[32] VAR16VISIBLE; u32 smp_mtrr_count VAR16VISIBLE;
@@ -115,6 +118,26 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
dprintf(1, "Powered-on with %d cpu(s)\n", cmos_smp_count + 1);
if (qemu_cfg_have_acpi_cpus_map()) {
It's possible to encode version of map format, in fw_cfg value returned from qemu. So that in future if map format is changed then we'll be able to fallback to cmos_smp_count and use newer version when supported by BIOS.
u8 i = 0, acpi_cpu_online_count = 0;
/* count plugged in cpus in acpi PRST bitmap */
while (i< ACPI_CPU_STATUS_MAP_SZ) {
u8 j = 0, status = inb(ACPI_CPU_STATUS_MAP + (i++));
while (j< 8)
if ((status>> j++)& 1)
++acpi_cpu_online_count;
}
dprintf(1, "Counted %d present cpu(s) in ACPI cpus status map\n",
acpi_cpu_online_count);
/* if cpu(s) were hot(un)plugged then on reboot
* we should wait for an actual cpus number, so use
* acpi_cpu_online_count instead of cmos_smp_count */
cmos_smp_count = acpi_cpu_online_count - 1;
}
while (cmos_smp_count + 1 != readl(&CountCPUs)) { if (cmos_smp_count + 1< readl(&CountCPUs)) { dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n",
-- 1.7.7.6
-- Gleb.
On Sat, Mar 10, 2012 at 12:47:26PM +0100, Igor Mammedov wrote:
Reduce amount of consumed cpu time (i.e. don't spin forever) if number of started cpus are more then expected. And print a bug message into debug port.
Signed-off-by: Igor Mammedov imammedo@redhat.com
src/smp.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/src/smp.c b/src/smp.c index 8c077a1..9933ac6 100644 --- a/src/smp.c +++ b/src/smp.c @@ -115,8 +115,14 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
while (cmos_smp_count + 1 != readl(&CountCPUs))
while (cmos_smp_count + 1 != readl(&CountCPUs)) {
if (cmos_smp_count + 1 < readl(&CountCPUs)) {
dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n",
cmos_smp_count + 1, readl(&CountCPUs));
hlt();
} yield();
}
I suggest just changing the "while !=" to a "while <". The chance of detecting the error is pretty slim anyway. (Under normal circumstances, the processor is spinning on the count - the chance of two other processors getting in to increment before the first processor sees a change is pretty small.)
-Kevin
On Sat, Mar 10, 2012 at 12:47:27PM +0100, Igor Mammedov wrote:
Signed-off-by: Igor Mammedov imammedo@redhat.com
src/paravirt.c | 12 ++++++++++++ src/paravirt.h | 2 ++ 2 files changed, 14 insertions(+), 0 deletions(-)
diff --git a/src/paravirt.c b/src/paravirt.c index 9cf77de..c2bd0a8 100644 --- a/src/paravirt.c +++ b/src/paravirt.c @@ -305,6 +305,18 @@ u16 qemu_cfg_get_max_cpus(void) return cnt; }
+u16 qemu_cfg_have_acpi_cpus_map(void) +{
- u16 cnt;
- if (!qemu_cfg_present)
return 0;
- qemu_cfg_read_entry(&cnt, QEMU_CFG_HAVE_ACPI_CPUS_MAP, sizeof(cnt));
- return cnt;
+}
Please use the "file" interface for passing new variables into seabios. This enables the seabios code to look something like:
romfile_loadint("etc/have_acpi_map", 0);
with no need to write per-file handlers.
-Kevin
On Sat, Mar 10, 2012 at 12:47:28PM +0100, Igor Mammedov wrote:
Initial count of active cpus is communicated to bios from qemu via CMOS_BIOS_SMP_COUNT io port. However if cpus are hotplugged after boot and then guest is rebooted without taking down qemu then bios might be stuck at smp_probe
[...]
--- a/src/smp.c +++ b/src/smp.c @@ -17,6 +17,9 @@
#define APIC_ENABLED 0x0100
+#define ACPI_CPU_STATUS_MAP 0xaf00 +#define ACPI_CPU_STATUS_MAP_SZ 32
struct { u32 ecx, eax, edx; } smp_mtrr[32] VAR16VISIBLE; u32 smp_mtrr_count VAR16VISIBLE;
@@ -115,6 +118,26 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
dprintf(1, "Powered-on with %d cpu(s)\n", cmos_smp_count + 1);
if (qemu_cfg_have_acpi_cpus_map()) {
[...]
Please don't clutter up smp.c with this. The code should look something like:
int total_cpus = getCPUcount(); while (readl(&CountCPUs) < total_cpus) yield();
The getCPUcount() can be buried in paravirt.c, acpi.c, or where ever it makes sense.
BTW, why do we have to call qemu_cfg_have_acpi_cpus_map? Can't this just be inferred if the data at inb(0xaf00) is all zeros?
-Kevin
On Sat, Mar 10, 2012 at 12:47:25PM +0100, Igor Mammedov wrote:
- Halt if number of started cpus are more then expected
- Allow bios to detect if qemu has cpus status map for acpi hotplug
- Take in account hot(un)plugged cpus on reboot
Currently, the seabios code in acpi.c and mptable.c assume the first CountCPUs of MaxCountCPUs are on. Your patches update the reboot cpu detection, but they don't update the reporting of online cpus in the bios tables that are subsequently built.
-Kevin
On 03/13/2012 12:51 AM, Kevin O'Connor wrote:
On Sat, Mar 10, 2012 at 12:47:26PM +0100, Igor Mammedov wrote:
Reduce amount of consumed cpu time (i.e. don't spin forever) if number of started cpus are more then expected. And print a bug message into debug port.
Signed-off-by: Igor Mammedovimammedo@redhat.com
src/smp.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/src/smp.c b/src/smp.c index 8c077a1..9933ac6 100644 --- a/src/smp.c +++ b/src/smp.c @@ -115,8 +115,14 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
while (cmos_smp_count + 1 != readl(&CountCPUs))
while (cmos_smp_count + 1 != readl(&CountCPUs)) {
if (cmos_smp_count + 1< readl(&CountCPUs)) {
dprintf(1, "BUG: Expected %d cpu(s) but %d cpus started\n",
cmos_smp_count + 1, readl(&CountCPUs));
hlt();
} yield();
}
I suggest just changing the "while !=" to a "while<". The chance of detecting the error is pretty slim anyway. (Under normal circumstances, the processor is spinning on the count - the chance of two other processors getting in to increment before the first processor sees a change is pretty small.)
Then with buggy emulators (i.e. actual started number of cpus > cmos_smp_count + 1), it is possible that we could continue to boot while CountCPUs is still increasing. And as result we could build mp/acpi/smbios tables with different count. It's safer to stop then deal with consequences later.
This patch is only for purpose of increasing stability of host, i.e. do not consume needlessly cpu and affect other VMs if bug creeps in used emulator.
-Kevin
On 03/13/2012 01:08 AM, Peter Stuge wrote:
Igor Mammedov wrote:
not even sure we should halt at all. May be warn the user and continue
BIOSes usually stop in case of error
That doesn't mean that it's a good idea.
You might be right, however I'd call it expected behavior.
Imagine VM that need processing power of several cpus to provide adequate service. During its runtime several cpus were hot-plugged and it worked like a charm. Then during scheduled maintenance VM was soft-rebooted and it seems that VM is fully up (nobody noticed that cpus # are less then expected) and promoted to servicing usual load. If I were admin of this vm, I'd rather fix easy to spot problem during maintenance time than find out later that VM is under-powered and deal with angry customers.
//Peter
On 03/13/2012 01:09 AM, Kevin O'Connor wrote:
On Sat, Mar 10, 2012 at 12:47:28PM +0100, Igor Mammedov wrote:
Initial count of active cpus is communicated to bios from qemu via CMOS_BIOS_SMP_COUNT io port. However if cpus are hotplugged after boot and then guest is rebooted without taking down qemu then bios might be stuck at smp_probe
[...]
--- a/src/smp.c +++ b/src/smp.c @@ -17,6 +17,9 @@
#define APIC_ENABLED 0x0100
+#define ACPI_CPU_STATUS_MAP 0xaf00 +#define ACPI_CPU_STATUS_MAP_SZ 32
- struct { u32 ecx, eax, edx; } smp_mtrr[32] VAR16VISIBLE; u32 smp_mtrr_count VAR16VISIBLE;
@@ -115,6 +118,26 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
dprintf(1, "Powered-on with %d cpu(s)\n", cmos_smp_count + 1);
if (qemu_cfg_have_acpi_cpus_map()) {
[...]
Please don't clutter up smp.c with this. The code should look something like:
int total_cpus = getCPUcount(); while (readl(&CountCPUs)< total_cpus) yield();
The getCPUcount() can be buried in paravirt.c, acpi.c, or where ever it makes sense.
Thanks Kevin, I'll amend and repost patch-set taking in account your suggestions.
BTW, why do we have to call qemu_cfg_have_acpi_cpus_map? Can't this just be inferred if the data at inb(0xaf00) is all zeros?
What if qemu doesn't have 0xaf00 (like current upstream)? cfg_have_acpi_cpus_map is just a flag that allows to detect if qemu provides 0xaf00. This way it would be possible for seabios to work with old and new versions of qemu without breaking anything.
-Kevin
On 03/13/2012 01:18 AM, Kevin O'Connor wrote:
On Sat, Mar 10, 2012 at 12:47:25PM +0100, Igor Mammedov wrote:
- Halt if number of started cpus are more then expected
- Allow bios to detect if qemu has cpus status map for acpi hotplug
- Take in account hot(un)plugged cpus on reboot
Currently, the seabios code in acpi.c and mptable.c assume the first CountCPUs of MaxCountCPUs are on. Your patches update the reboot cpu detection, but they don't update the reporting of online cpus in the bios tables that are subsequently built.
I'm planing on fixing tables creation later when hot-plugging consecutive cpus is in upstream qemu and I'll be working on nonconsecutive cpu (un)plug.
Goal of this patches is to fix hang after reboot if cpu(s) were hot-pluged.
-Kevin
On Tue, Mar 13, 2012 at 10:37:54AM +0100, Igor Mammedov wrote:
On 03/13/2012 01:09 AM, Kevin O'Connor wrote:
On Sat, Mar 10, 2012 at 12:47:28PM +0100, Igor Mammedov wrote:
Initial count of active cpus is communicated to bios from qemu via CMOS_BIOS_SMP_COUNT io port. However if cpus are hotplugged after boot and then guest is rebooted without taking down qemu then bios might be stuck at smp_probe
[...]
--- a/src/smp.c +++ b/src/smp.c @@ -17,6 +17,9 @@
#define APIC_ENABLED 0x0100
+#define ACPI_CPU_STATUS_MAP 0xaf00 +#define ACPI_CPU_STATUS_MAP_SZ 32
struct { u32 ecx, eax, edx; } smp_mtrr[32] VAR16VISIBLE; u32 smp_mtrr_count VAR16VISIBLE;
@@ -115,6 +118,26 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
dprintf(1, "Powered-on with %d cpu(s)\n", cmos_smp_count + 1);
if (qemu_cfg_have_acpi_cpus_map()) {
[...]
Please don't clutter up smp.c with this. The code should look something like:
int total_cpus = getCPUcount(); while (readl(&CountCPUs)< total_cpus) yield();
The getCPUcount() can be buried in paravirt.c, acpi.c, or where ever it makes sense.
Thanks Kevin, I'll amend and repost patch-set taking in account your suggestions.
BTW, why do we have to call qemu_cfg_have_acpi_cpus_map? Can't this just be inferred if the data at inb(0xaf00) is all zeros?
We can't rely on the fact that unused port returns all zeros. Actually on real HW I think it return all ones.
What if qemu doesn't have 0xaf00 (like current upstream)? cfg_have_acpi_cpus_map is just a flag that allows to detect if qemu provides 0xaf00. This way it would be possible for seabios to work with old and new versions of qemu without breaking anything.
If QEMU will update cmos cpu count on cpu (un)plug we will not have to introduce new PV interface to telll SeaBIOS about 0xaf00.
-- Gleb.
On 03/13/2012 10:51 AM, Gleb Natapov wrote:
On Tue, Mar 13, 2012 at 10:37:54AM +0100, Igor Mammedov wrote:
On 03/13/2012 01:09 AM, Kevin O'Connor wrote:
On Sat, Mar 10, 2012 at 12:47:28PM +0100, Igor Mammedov wrote:
Initial count of active cpus is communicated to bios from qemu via CMOS_BIOS_SMP_COUNT io port. However if cpus are hotplugged after boot and then guest is rebooted without taking down qemu then bios might be stuck at smp_probe
[...]
--- a/src/smp.c +++ b/src/smp.c @@ -17,6 +17,9 @@
#define APIC_ENABLED 0x0100
+#define ACPI_CPU_STATUS_MAP 0xaf00 +#define ACPI_CPU_STATUS_MAP_SZ 32
- struct { u32 ecx, eax, edx; } smp_mtrr[32] VAR16VISIBLE; u32 smp_mtrr_count VAR16VISIBLE;
@@ -115,6 +118,26 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
dprintf(1, "Powered-on with %d cpu(s)\n", cmos_smp_count + 1);
if (qemu_cfg_have_acpi_cpus_map()) {
[...]
Please don't clutter up smp.c with this. The code should look something like:
int total_cpus = getCPUcount(); while (readl(&CountCPUs)< total_cpus) yield();
The getCPUcount() can be buried in paravirt.c, acpi.c, or where ever it makes sense.
Thanks Kevin, I'll amend and repost patch-set taking in account your suggestions.
BTW, why do we have to call qemu_cfg_have_acpi_cpus_map? Can't this just be inferred if the data at inb(0xaf00) is all zeros?
We can't rely on the fact that unused port returns all zeros. Actually on real HW I think it return all ones.
What if qemu doesn't have 0xaf00 (like current upstream)? cfg_have_acpi_cpus_map is just a flag that allows to detect if qemu provides 0xaf00. This way it would be possible for seabios to work with old and new versions of qemu without breaking anything.
If QEMU will update cmos cpu count on cpu (un)plug we will not have to introduce new PV interface to telll SeaBIOS about 0xaf00.
But we will have to introduce it anyway in future for correct building of mp/acpi/smbios tables. So why not do it now and be ready later for patches to mp/acpi/smbios tables.
-- Gleb.
On Tue, Mar 13, 2012 at 11:16:51AM +0100, Igor Mammedov wrote:
On 03/13/2012 10:51 AM, Gleb Natapov wrote:
On Tue, Mar 13, 2012 at 10:37:54AM +0100, Igor Mammedov wrote:
On 03/13/2012 01:09 AM, Kevin O'Connor wrote:
On Sat, Mar 10, 2012 at 12:47:28PM +0100, Igor Mammedov wrote:
Initial count of active cpus is communicated to bios from qemu via CMOS_BIOS_SMP_COUNT io port. However if cpus are hotplugged after boot and then guest is rebooted without taking down qemu then bios might be stuck at smp_probe
[...]
--- a/src/smp.c +++ b/src/smp.c @@ -17,6 +17,9 @@
#define APIC_ENABLED 0x0100
+#define ACPI_CPU_STATUS_MAP 0xaf00 +#define ACPI_CPU_STATUS_MAP_SZ 32
struct { u32 ecx, eax, edx; } smp_mtrr[32] VAR16VISIBLE; u32 smp_mtrr_count VAR16VISIBLE;
@@ -115,6 +118,26 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
dprintf(1, "Powered-on with %d cpu(s)\n", cmos_smp_count + 1);
if (qemu_cfg_have_acpi_cpus_map()) {
[...]
Please don't clutter up smp.c with this. The code should look something like:
int total_cpus = getCPUcount(); while (readl(&CountCPUs)< total_cpus) yield();
The getCPUcount() can be buried in paravirt.c, acpi.c, or where ever it makes sense.
Thanks Kevin, I'll amend and repost patch-set taking in account your suggestions.
BTW, why do we have to call qemu_cfg_have_acpi_cpus_map? Can't this just be inferred if the data at inb(0xaf00) is all zeros?
We can't rely on the fact that unused port returns all zeros. Actually on real HW I think it return all ones.
What if qemu doesn't have 0xaf00 (like current upstream)? cfg_have_acpi_cpus_map is just a flag that allows to detect if qemu provides 0xaf00. This way it would be possible for seabios to work with old and new versions of qemu without breaking anything.
If QEMU will update cmos cpu count on cpu (un)plug we will not have to introduce new PV interface to telll SeaBIOS about 0xaf00.
But we will have to introduce it anyway in future for correct building of mp/acpi/smbios tables. So why not do it now and be ready later for patches to mp/acpi/smbios tables.
Good point. You mean we need a list of apic ids of available cpus right? Can SeaBIOS build this list by reading apic id in smp_ap_boot_code?
-- Gleb.
Hi,
On Tue, Mar 13, 2012 at 11:16:51AM +0100, Igor Mammedov wrote:
What if qemu doesn't have 0xaf00 (like current upstream)? cfg_have_acpi_cpus_map is just a flag that allows to detect if qemu provides 0xaf00. This way it would be possible for seabios to work with old and new versions of qemu without breaking anything.
If QEMU will update cmos cpu count on cpu (un)plug we will not have to introduce new PV interface to telll SeaBIOS about 0xaf00.
But we will have to introduce it anyway in future for correct building of mp/acpi/smbios tables. So why not do it now and be ready later for patches to mp/acpi/smbios tables.
This sounds reasonable in principle. I am not sure about mp/smbios tables, but note that ACPI SRAT tables already use PV Interface qemu_cfg_get_numa_data.
thanks,
- Vasilis
----- Original Message -----
From: "Gleb Natapov" gleb@redhat.com To: "Igor Mammedov" imammedo@redhat.com Cc: "Kevin O'Connor" kevin@koconnor.net, "jan kiszka" jan.kiszka@siemens.com, seabios@seabios.org Sent: Tuesday, March 13, 2012 11:21:28 AM Subject: Re: [SeaBIOS] [PATCH 3/3] Take in account hot(un)plugged cpus on reboot
On Tue, Mar 13, 2012 at 11:16:51AM +0100, Igor Mammedov wrote:
On 03/13/2012 10:51 AM, Gleb Natapov wrote:
On Tue, Mar 13, 2012 at 10:37:54AM +0100, Igor Mammedov wrote:
On 03/13/2012 01:09 AM, Kevin O'Connor wrote:
On Sat, Mar 10, 2012 at 12:47:28PM +0100, Igor Mammedov wrote:
Initial count of active cpus is communicated to bios from qemu via CMOS_BIOS_SMP_COUNT io port. However if cpus are hotplugged after boot and then guest is rebooted without taking down qemu then bios might be stuck at smp_probe
[...]
--- a/src/smp.c +++ b/src/smp.c @@ -17,6 +17,9 @@
#define APIC_ENABLED 0x0100
+#define ACPI_CPU_STATUS_MAP 0xaf00 +#define ACPI_CPU_STATUS_MAP_SZ 32
struct { u32 ecx, eax, edx; } smp_mtrr[32] VAR16VISIBLE; u32 smp_mtrr_count VAR16VISIBLE;
@@ -115,6 +118,26 @@ smp_probe(void) msleep(10); } else { u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
dprintf(1, "Powered-on with %d cpu(s)\n",
cmos_smp_count + 1);
if (qemu_cfg_have_acpi_cpus_map()) {
[...]
Please don't clutter up smp.c with this. The code should look something like:
int total_cpus = getCPUcount(); while (readl(&CountCPUs)< total_cpus) yield();
The getCPUcount() can be buried in paravirt.c, acpi.c, or where ever it makes sense.
Thanks Kevin, I'll amend and repost patch-set taking in account your suggestions.
BTW, why do we have to call qemu_cfg_have_acpi_cpus_map? Can't this just be inferred if the data at inb(0xaf00) is all zeros?
We can't rely on the fact that unused port returns all zeros. Actually on real HW I think it return all ones.
What if qemu doesn't have 0xaf00 (like current upstream)? cfg_have_acpi_cpus_map is just a flag that allows to detect if qemu provides 0xaf00. This way it would be possible for seabios to work with old and new versions of qemu without breaking anything.
If QEMU will update cmos cpu count on cpu (un)plug we will not have to introduce new PV interface to telll SeaBIOS about 0xaf00.
But we will have to introduce it anyway in future for correct building of mp/acpi/smbios tables. So why not do it now and be ready later for patches to mp/acpi/smbios tables.
Good point. You mean we need a list of apic ids of available cpus right? Can SeaBIOS build this list by reading apic id in smp_ap_boot_code?
I've looked at how tables are build. It seems that it's not hard and sufficient to build an equivalent temporary byte array of (un)available apic ids that corresponds to 0xaf00 bitmap at boot time, under following restriction: - apic id shall be in range [0..max_cpus) * this will allow to allocate array where apic id will serve as index, thus making it equvivalent to 0xaf00 bitmap. * and easily fill this array in smp_ap_boot_code without inventing and using mutexes/spinlocks in smp_ap_boot_code.
This will require cooperation from qemu side. apic id could be checked somewhere in device_add command handler call chain or at cpu object creation time.
Jan, What do you think about setting such restriction on apic id of cpus in qemu?
-- Gleb.