[SeaBIOS] [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
arei.gonglei at huawei.com
Mon Dec 21 10:41:32 CET 2015
> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin at koconnor.net]
> Sent: Sunday, December 20, 2015 10:33 PM
> To: Gonglei (Arei)
> Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios at seabios.org;
> Huangweidong (C); kvm at vger.kernel.org; Radim Krcmar
> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> problem on qemu-kvm platform
> On Sun, Dec 20, 2015 at 09:49:54AM +0000, Gonglei (Arei) wrote:
> > > From: Kevin O'Connor [mailto:kevin at koconnor.net]
> > > Sent: Saturday, December 19, 2015 11:12 PM
> > > On Sat, Dec 19, 2015 at 12:03:15PM +0000, Gonglei (Arei) wrote:
> > > > Maybe the root cause is not NMI but INTR, so yield() can open hardware
> > > interrupt,
> > > > And then execute interrupt handler, but the interrupt handler make the
> > > SeaBIOS
> > > > stack broken, so that the BSP can't execute the instruction and occur
> > > exception,
> > > > VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs
> > > > the surface phenomenon.
> > >
> > > I can't see any reason why allowing interrupts at this location would
> > > be a problem.
> > >
> > Does it have any relationship with *extra stack* of SeaBIOS?
> None that I can see. Also, the kvm trace seems to show the code
> trying to execute at rip=0x03 - that will crash long before the extra
> stack is used.
When the gurb of OS is booting, then the softirq and C function send_disk_op()
may use extra stack of SeaBIOS. If we inject a NMI, romlayout.S: irqentry_extrastack
is invoked, and the extra stack will be used again. And the stack of first calling
will be broken, so that the SeaBIOS stuck.
You can easily reproduce the problem.
1. start on guest
2. reset the guest
3. inject a NMI when the guest show the grub surface
4. then the guest stuck
If we disabled extra stack by setting
Then the problem is gone.
Besides, I have another thought:
Is it possible when one cpu is using the extra stack, but other cpus (APs)
still be waked up by hardware interrupt after yield() or br->flags = F_IF
and used the extra stack again?
> > > > Kevin, can we drop yield() in smp_setup() ?
> > >
> > > It's possible to eliminate this instance of yield, but I think it
> > > would just push the crash to the next time interrupts are enabled.
> > >
> > Perhaps. I'm not sure.
> > > > Is it really useful and allowable for SeaBIOS? Maybe for other
> > > > I'm not sure. Because we found that when SeaBIOS is booting, if we inject
> > > > NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same
> > > > the current problem.
> > >
> > > If you apply the patches you had to prevent that NMI crash problem,
> > > does it also prevent the above crash?
> > >
> > Yes, but we cannot prevent the NMI injection (though I'll submit some
> patches to
> > forbid users' NMI injection after NMI_EN disabled by RTC bit7 of port 0x70).
More information about the SeaBIOS