[SeaBIOS] varlow/extrastack vs code

Kevin O'Connor kevin at koconnor.net
Tue Feb 14 17:22:10 CET 2017


On Tue, Feb 14, 2017 at 02:50:23PM +0000, Dr. David Alan Gilbert wrote:
> I've been sporadically carrying on debugging this and I
> think I have a bit more understanding, but not quite all the way.
> 
> I'm pretty sure we are trying to run code that's been over written
> by variable data - which I believe to be TimerLast, and I'm also
> fairly sure the problem is not a delayed reset from multiple reboot signals.
> 
> Some detail:
> a) I added some debug to qemu to print what triggered the reboot by
>   always passing qemu_system_reset_request a string; the kernel
>   is always rebooting via KBD_CCMD_RESET, I see that enter the BIOS
>   and then see SeaBIOS then does KBD_CCMD_RESET again - which I think
>   is the one coming from handle_resume32.
>   But then I noticed that every so often we got a reboot from a KVM_SHUTDOWN
>   which is apparently a triple-fault or the like.
> 
> b) With a KVM trace I got the addresses the fault was happening at,
>   and it was always an address in the copy fo the bios around bffbcf..
>   although the actual failure jumped about a bit, taking the:
>     Relocating init from 0x000e4810 to 0xbffb2050 (size 57120)
>   I worked the fault address back to 'ef799' and from the objdump
>   found that was an address in maininit but also corresponded to
>   TimerLast:

I can confirm there is a defect with TimerLast and reboot handling.
I'm not sure what the best fix is - suggestions welcome.

The root of the problem is that QEMU does not reset the f-segment on a
reboot signal and so SeaBIOS must manually reset that memory.  It does
this in the call to qemu_prep_reset() in tryReboot().  That call
copies the pristine copy of the bios at 0xffff0000 to 0xf0000.  The
code then goes on to signal a hard reboot so that all the hardware is
reset and so that the code starts just like it would on first boot.

Unfortunately, the hard reboot logic makes calls to udelay() -
specifically in i8042_reboot() and pci_reboot().  These calls to
udelay() are not valid here as the timer isn't even necessarily
configured at this point.  Alas - it also has the nasty possibility of
corrupting the pristine copy of the bios that was just made in
qemu_prep_reset().  Random crashes can then occur on the next boot.

It's easy to catch this with the following code change:

--- a/src/hw/timer.c
+++ b/src/hw/timer.c
@@ -146,6 +146,8 @@ timer_adjust_bits(u32 value, u32 validbits)
 static u32
 timer_read(void)
 {
+    if (!GET_GLOBAL(HaveRunPost))
+        panic("timer read too early\n");
     u16 port = GET_GLOBAL(TimerPort);
     if (CONFIG_TSC_TIMER && !port)
         // Read from CPU TSC

With that, I get a panic on every reboot.  Commenting out the udelay()
calls then gets reboots working again.

The simplest fix is to remove the udelay() calls from the reboot
logic.  However, it's not clear to me if it's valid to omit the delays
when running on real hardware.  Another possibility would be to try
and harden timer_read() so that it can be run very early (prior to
code relocation), but that's tricky to implement correctly..

Good find Dave - I know that wasn't easy to track down.
-Kevin



More information about the SeaBIOS mailing list