From: Kevin O'Connor [mailto:email@example.com]
On Fri, Dec 18, 2015 at 03:04:58AM +0000, Gonglei (Arei) wrote:
Hi Kevin & Paolo,
Luckily, I reproduced this problem last night, and I got the log below while
SeaBIOS was stuck.
[2015-12-18 10:38:10] gonglei: finish while
<...>-31509  154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509  154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509  154753.180077: kvm_inj_exception: #UD (0x0)
<...>-31509  154753.180077: kvm_entry: vcpu 0
This is an odd finding. It seems to indicate that the code is caught
in an infinite irq loop once irqs are enabled. What doesn't make
sense is that an NMI shouldn't depend on the cpu irq enable flag.
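For anyone decoding the kvm_exit line by hand, here is a minimal sketch. It
assumes the second "info" value (80000306) is the VMX VM-exit
interruption-information field and that it follows the Intel SDM layout
(vector in bits 7:0, type in bits 10:8, error-code flag in bit 11, valid in
bit 31). The struct and function names are made up for illustration and are
not taken from KVM or SeaBIOS.

```c
#include <stdint.h>

/* Hypothetical helper type: fields of a VM-exit interruption-information
 * word, assuming the Intel SDM bit layout. */
struct intr_info {
    unsigned vector;         /* bits 7:0, exception or interrupt vector */
    unsigned type;           /* bits 10:8, event type */
    unsigned has_error_code; /* bit 11 */
    unsigned valid;          /* bit 31 */
};

/* Split the raw 32-bit value into its fields. */
static struct intr_info decode_intr_info(uint32_t raw)
{
    struct intr_info d;
    d.vector = raw & 0xff;
    d.type = (raw >> 8) & 0x7;
    d.has_error_code = (raw >> 11) & 0x1;
    d.valid = (raw >> 31) & 0x1;
    return d;
}

/* Name the event types that matter here; everything else is "other". */
static const char *intr_type_name(unsigned type)
{
    switch (type) {
    case 0: return "external interrupt";
    case 2: return "NMI";
    case 3: return "hardware exception";
    case 6: return "software exception";
    default: return "other";
    }
}
```

Under those assumptions, decode_intr_info(0x80000306) gives valid = 1,
type = 3 (hardware exception) and vector = 6 (#UD). If that reading is right,
the exit was a #UD exception rather than an NMI, which would be consistent
with the kvm_inj_exception line in the trace.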
Maybe the root cause is not an NMI but INTR: yield() enables hardware interrupts,
an interrupt handler then runs and corrupts the SeaBIOS stack, and the BSP can no
longer execute its instructions and raises an exception. That causes a VM exit to
the KVM kernel module, and the cycle repeats forever. But I have no proof beyond
the observed symptoms.
Kevin, can we drop yield() in smp_setup() ?
diff --git a/src/fw/smp.c b/src/fw/smp.c
index 579acdb..dd23eda 100644
@@ -136,7 +136,6 @@ smp_setup(void)
" jc 1b\n"
: "+m" (SMPLock), "+m" (SMPStack)
: : "cc", "memory");
// Restore memory.
*(u64*)BUILD_AP_BOOT_ADDR = old;
Is it really useful and allowable for SeaBIOS? Maybe for other components?
I'm not sure. We found that if we inject an NMI through QMP while SeaBIOS is
booting, the guest gets *stuck*, and the kvm trace log is the same as in
the current problem.
Also, I can't explain why rip would be 0x03, nor why a #UD in an
exception handler wouldn't result in a triple fault. Maybe someone
with more kvm knowledge could help here.
I did notice that you appear to be running with SeaBIOS v1.8.1 - I
recommend you upgrade to the latest. There were two important fixes
in this area (8b9942fa and 3156b71a). I don't think either of these
fixes would explain the log above, but it would be best to eliminate