On Thu, Nov 19, 2015 at 12:42:50PM +0000, Xulei (Stone) wrote:
> Kevin,
>
> After deeply analyzing, i think there may be 3 possible reasons:
> 1)wrong CountCPUs value. It seems CountCPUs++ in handle_smp() has no
> lock to protect. So, sometimes, 2 or more vcpu may get the same
> current value of CountCPUs. Then we'll get a single incrementation
> instead of 2 or more and "while (cmos_smp_count != CountCPUs)" will
> loop forever;
The handle_smp() code is called from romlayout.S:entry_smp(), which
does take a lock. So all of handle_smp() should run synchronously
(one CPU at a time).
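To illustrate the point, here is a minimal, deterministic sketch of the lost-update scenario described in (1) and of how a lock rules it out. It is a hypothetical model, not SeaBIOS code: count_cpus stands in for CountCPUs, the load()/store() helpers split the increment into its two steps, and the starting value of 1 (boot CPU already counted) is an assumption of the model.

```c
#include <assert.h>

/* Hypothetical model, not SeaBIOS code: count_cpus stands in for
 * CountCPUs, and each "CPU" performs the increment as a separate
 * load step and store step. */
struct cpu { int loaded; };

int count_cpus;

void load(struct cpu *c)  { c->loaded = count_cpus; }
void store(struct cpu *c) { count_cpus = c->loaded + 1; }

/* No lock: both CPUs load before either stores, so both write the
 * same value and one increment is lost. */
int unlocked_interleaving(void)
{
    struct cpu a, b;
    count_cpus = 1;   /* assumption: the boot CPU is already counted */
    load(&a);
    load(&b);
    store(&a);
    store(&b);
    return count_cpus;
}

/* With a lock held around the whole increment, each load/store pair
 * finishes before the next one starts. */
int locked_interleaving(void)
{
    struct cpu a, b;
    count_cpus = 1;
    load(&a);
    store(&a);
    load(&b);
    store(&b);
    return count_cpus;
}

/* Sanity check of the model: 2 with the race, 3 when serialized. */
void check_model(void)
{
    assert(unlocked_interleaving() == 2);
    assert(locked_interleaving() == 3);
}
```

Because entry_smp() holds the lock for the duration of handle_smp(), only the second ordering should be possible there.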
> 2)wrong cmos_smp_count value. SeaBIOS rtc reads an incorrect number?
Not sure - the last time there were problems in this area of the code,
others used kvmtrace to try to track them down. Since you are
getting dprintf output, you could also try printing
cmos_smp_count prior to the loop (see patch below).
> 3)yield() stuck. Is it possible that SeaBIOS is stuck during yield?
> I've tested, when yield() is running, SeaBIOS seems has not created
> some other threads except the main thread. So I don't know what's
> the function of yield() here.?
The yield() allows hardware interrupts to occur. But note that
yield() isn't called in the loop - it is only called after the loop
completes.
If you are only getting this on massive repetitive reboot requests,
there are some other possible explanations:
- perhaps the SIPI is getting lost because one of the CPUs is still
resetting or still processing a SIPI from the last reboot?
- the SeaBIOS code itself may have been corrupted if the memcpy() in
qemu_prep_reset() got far enough along to clear HaveRunPost, but did
not get far enough along to fully complete the memcpy().
If the failure is reproducible, the patch below could help narrow the
possibilities.
-Kevin
--- a/src/fw/smp.c
+++ b/src/fw/smp.c
@@ -125,6 +125,7 @@ smp_setup(void)
// Wait for other CPUs to process the SIPI.
u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
+ dprintf(1, "cmos_smp_count=%d\n", cmos_smp_count);
while (cmos_smp_count != CountCPUs)
asm volatile(
// Release lock and allow other processors to use the stack.
@@ -136,6 +137,7 @@ smp_setup(void)
" jc 1b\n"
: "+m" (SMPLock), "+m" (SMPStack)
: : "cc", "memory");
+ dprintf(1, "finish smp\n");
yield();
// Restore memory.