Re: [SeaBIOS] [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform

21 Dec 2015


Dear Kevin,
...
-----Original Message-----
From: Kevin O'Connor [mailto:kevin@koconnor.net]
Sent: Sunday, December 20, 2015 10:33 PM
To: Gonglei (Arei)
Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
Huangweidong (C); kvm@vger.kernel.org; Radim Krcmar
Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
problem on qemu-kvm platform
On Sun, Dec 20, 2015 at 09:49:54AM +0000, Gonglei (Arei) wrote:
From: Kevin O'Connor [mailto:kevin@koconnor.net]
Sent: Saturday, December 19, 2015 11:12 PM
On Sat, Dec 19, 2015 at 12:03:15PM +0000, Gonglei (Arei) wrote:
Maybe the root cause is not NMI but INTR: yield() can enable hardware
interrupts, and the interrupt handler then runs and corrupts the SeaBIOS
stack, so that the BSP can't execute the next instruction, takes an
exception, and VM_EXITs to Kmod in an infinite loop. But I don't have any
proof beyond the observed symptoms.
I can't see any reason why allowing interrupts at this location would
be a problem.
Does it have any relationship with *extra stack* of SeaBIOS?
None that I can see.  Also, the kvm trace seems to show the code
trying to execute at rip=0x03 - that will crash long before the extra
stack is used.
While the OS's GRUB is booting, the softirq and the C function send_disk_op()
may use the extra stack of SeaBIOS. If we inject an NMI, romlayout.S: irqentry_extrastack
is invoked, and the extra stack is used again. The stack of the first call
is then corrupted, so SeaBIOS gets stuck.
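The reentrancy described above can be illustrated with a toy model. This is only a sketch, not SeaBIOS code: it assumes an entry stub that always starts from the top of one shared stack region without checking whether that region is already in use. The names enter_on_extra_stack(), frame_intact(), and simulate_nested_entry() are hypothetical, as are the tag values.

```c
#include <stdint.h>

#define EXTRA_STACK_WORDS 8

/* Toy model (assumption): a single shared "extra stack" region. */
static uint32_t extra_stack[EXTRA_STACK_WORDS];

/* Hypothetical entry stub: every entry writes its frame starting at the
 * same top of the shared stack, with no "in use" check. */
void enter_on_extra_stack(uint32_t frame_tag)
{
    for (int i = 0; i < EXTRA_STACK_WORDS; i++)
        extra_stack[i] = frame_tag;
}

/* Returns 1 if the frame written with frame_tag is still untouched. */
int frame_intact(uint32_t frame_tag)
{
    for (int i = 0; i < EXTRA_STACK_WORDS; i++)
        if (extra_stack[i] != frame_tag)
            return 0;
    return 1;
}

/* Returns 1 if the first caller's frame was corrupted by a nested entry. */
int simulate_nested_entry(int inject_nmi)
{
    const uint32_t DISK_OP_TAG = 0xD15C0001u; /* stands in for send_disk_op() */
    const uint32_t NMI_TAG     = 0x00000002u; /* stands in for the NMI entry  */

    enter_on_extra_stack(DISK_OP_TAG);        /* first user of the stack      */
    if (inject_nmi)
        enter_on_extra_stack(NMI_TAG);        /* nested entry, same region    */
    return !frame_intact(DISK_OP_TAG);
}
```

In this model, simulate_nested_entry(1) reports corruption and simulate_nested_entry(0) does not, which mirrors the observation that the hang only appears when an NMI arrives while the extra stack is already in use.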
You can easily reproduce the problem:
1. start a guest
2. reset the guest
3. inject an NMI when the guest shows the GRUB screen
4. the guest then gets stuck
If we disable the extra stack by setting
CONFIG_ENTRY_EXTRASTACK=n
then the problem is gone.
Besides, I have another thought:
Is it possible that while one CPU is using the extra stack, other CPUs (APs)
are still woken up by a hardware interrupt after yield() or br->flags = F_IF
and use the extra stack again?
Regards,
-Gonglei
Kevin, can we drop yield() in smp_setup() ?
It's possible to eliminate this instance of yield, but I think it
would just push the crash to the next time interrupts are enabled.
Perhaps. I'm not sure.
Is it really useful and allowable for SeaBIOS? Maybe for other
components?
I'm not sure, because we found that when SeaBIOS is booting, if we inject
an NMI via QMP, the guest gets *stuck*, and the kvm tracing log is the same
as with the current problem.
If you apply the patches you had to prevent that NMI crash problem,
does it also prevent the above crash?
Yes, but we cannot prevent the NMI injection (though I'll submit some
patches to forbid user NMI injection after NMI_EN is disabled via RTC
bit 7 of port 0x70).
-Kevin
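
For reference, the bit 7 convention mentioned above can be sketched as follows. This is a minimal illustration, not the proposed patch: it assumes the usual PC convention that bit 7 of the byte written to index port 0x70 masks NMI when set, and that the low 7 bits select the register. The macro and function names here are chosen for illustration, and nmi_injection_allowed() is a hypothetical policy check.

```c
#include <stdint.h>

#define PORT_CMOS_INDEX 0x70  /* index port named in the thread          */
#define NMI_DISABLE_BIT 0x80  /* bit 7: NMI masked when set (assumption) */

/* Compose the byte written to the index port: the low 7 bits select the
 * register, bit 7 carries the NMI mask. */
uint8_t cmos_index_byte(uint8_t reg, int nmi_disabled)
{
    return (uint8_t)((reg & 0x7f) | (nmi_disabled ? NMI_DISABLE_BIT : 0));
}

/* Hypothetical policy check: allow a user-requested NMI only if the last
 * byte written to the index port left bit 7 clear. */
int nmi_injection_allowed(uint8_t last_index_byte)
{
    return (last_index_byte & NMI_DISABLE_BIT) == 0;
}
```

A device model that tracks the last index byte could consult a check like nmi_injection_allowed() before delivering an injected NMI; whether the actual patches take this shape is not stated in the thread.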
