[SeaBIOS] [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
stone.xulei at huawei.com
Fri Nov 6 10:12:34 CET 2015
>On Wed, Nov 04, 2015 at 08:48:20AM +0800, Gonglei wrote:
>> On 2015/11/3 14:58, Xulei (Stone, Euler) wrote:
>> > On the qemu-kvm platform, when I reset a VM through "virsh reset" while the VM
>> > happens to be in the middle of an internal reboot, the VM can no longer be
>> > reset successfully because of the reset reentrancy. I found:
>> > (1) SeaBIOS tries to shut down the VM with apm_shutdown() after the reset fails.
>> > However, apm_shutdown() does not work on the qemu-kvm platform;
>> > (2) If I add a 1s sleep in qemu_prep_reset() and then reset the VM twice in a
>> > row, the aforementioned case always happens.
>So, the problem occurs when issuing a second reset before the first
Yes. In detail, the second reset is issued after "HaveAttemptedReboot = 1"
and before the memcpy in qemu_prep_reset() completes.
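To make that window concrete, here is a minimal C model of it. It is only a
sketch, not the SeaBIOS source: reset_entry(), prep_reset_complete() and
second_reset_action() are hypothetical names, and the model assumes that the
completed memcpy restores an image in which HaveAttemptedReboot is clear again.

```c
#include <assert.h>
#include <stddef.h>

/* What the modeled reset path decides to do on entry. */
enum action { ACTION_HARD_REBOOT, ACTION_APM_SHUTDOWN };

/* Hypothetical stand-in for the BIOS state that matters here. */
struct bios_state {
    int have_attempted_reboot; /* models HaveAttemptedReboot */
};

/* Models entering the reset path: if the flag is already set, the
 * previous attempt is treated as failed and shutdown is chosen. */
enum action reset_entry(struct bios_state *s)
{
    assert(s != NULL);
    if (s->have_attempted_reboot)
        return ACTION_APM_SHUTDOWN;
    s->have_attempted_reboot = 1;
    return ACTION_HARD_REBOOT;
}

/* Models the memcpy in qemu_prep_reset() finishing. ASSUMPTION: the
 * restored image carries the flag in its initial (clear) state. */
void prep_reset_complete(struct bios_state *s)
{
    assert(s != NULL);
    s->have_attempted_reboot = 0;
}

/* Issues two resets. If second_interrupts_copy is nonzero, the second
 * reset arrives after the flag is set but before the copy completes. */
enum action second_reset_action(int second_interrupts_copy)
{
    struct bios_state s = { 0 };

    reset_entry(&s);                /* first reset sets the flag */
    if (!second_interrupts_copy)
        prep_reset_complete(&s);    /* copy finished, flag cleared */
    return reset_entry(&s);         /* second reset */
}
```

In this model second_reset_action(0) picks the hard reboot again, while
second_reset_action(1) lands in the shutdown path, which is the case I am
hitting.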
>> > This patch fixes this issue by letting the VM always execute the reboot
>> > routine when a reentrancy happens, instead of attempting apm_shutdown on
>> > the qemu-kvm platform.
>The reason for the HaveAttemptedReboot check is to work around old
>versions of KVM that unexpectedly map the same memory to both 0xf0000
>and 0xffff0000. So, it does not make sense to wrap the check in a
>!runningOnKVM() block as that disables the only reason for the check.
>I'm surprised you would see the above on a recent qemu/kvm though - as
>on a newer KVM I think the second reset would have to happen after
>HaveAttemptedReboot is set and prior to the memcpy in
>qemu_prep_reset() completing. Can you verify your KVM version?
I've tested on KVM-3.6 and KVM-4.1.3. I can see this problem on both of
these versions.
I reproduce it like this: put an HA mechanism and a watchdog mechanism in a VM.
Deliberately let this VM lose its heartbeat and do not feed the watchdog. Then,
after 2 minutes (a self-defined timeout), the HA mechanism issues an internal
reboot command to the VM and the watchdog mechanism issues a "virsh reset" from
the host. The aforementioned problem then occurs with high probability.