[SeaBIOS] varlow/extrastack vs code

Dr. David Alan Gilbert dgilbert at redhat.com
Tue Feb 14 18:14:03 CET 2017


* Kevin O'Connor (kevin at koconnor.net) wrote:
> On Tue, Feb 14, 2017 at 02:50:23PM +0000, Dr. David Alan Gilbert wrote:
> > I've been sporadically carrying on debugging this and I
> > think I have a bit more understanding, but not quite all the way.
> > 
> > I'm pretty sure we are trying to run code that's been over written
> > by variable data - which I believe to be TimerLast, and I'm also
> > fairly sure the problem is not a delayed reset from multiple reboot signals.
> > 
> > Some detail:
> > a) I added some debug to qemu to print what triggered the reboot by
> >   always passing qemu_system_reset_request a string; the kernel
> >   is always rebooting via KBD_CCMD_RESET, I see that enter the BIOS
> >   and then see SeaBIOS then does KBD_CCMD_RESET again - which I think
> >   is the one coming from handle_resume32.
> >   But then I noticed that every so often we got a reboot from a KVM_SHUTDOWN
> >   which is apparently a triple-fault or the like.
> > 
> > b) With a KVM trace I got the addresses the fault was happening at,
> >   and it was always an address in the copy fo the bios around bffbcf..
> >   although the actual failure jumped about a bit, taking the:
> >     Relocating init from 0x000e4810 to 0xbffb2050 (size 57120)
> >   I worked the fault address back to 'ef799' and from the objdump
> >   found that was an address in maininit but also corresponded to
> >   TimerLast:
> 

Hi Kevin,

> I can confirm there is a defect with TimerLast and reboot handling.
> I'm not sure what the best fix is - suggestions welcome.

  Thanks for confirming my diagnosis so far.

> The root of the problem is that QEMU does not reset the f-segment on a
> reboot signal and so SeaBIOS must manually reset that memory.

If you need a qemu fix then lets just figure out the right fix.

>  It does
> this in the call to qemu_prep_reset() in tryReboot().  That call
> copies the pristine copy of the bios at 0xffff0000 to 0xf0000.  The
> code then goes on to signal a hard reboot so that all the hardware is
> reset and so that the code starts just like it would on first boot.
> 
> Unfortunately, the hard reboot logic makes calls to udelay() -
> specifically in i8042_reboot() and pci_reboot().  These calls to
> udelay() are not valid here as the timer isn't even necessarily
> configured at this point.  Alas - it also has the nasty possibility of
> corrupting the pristine copy of the bios that was just made in
> qemu_prep_reset().  Random crashes can then occur on the next boot.
> 
> It's easy to catch this with the following code change:
> 
> --- a/src/hw/timer.c
> +++ b/src/hw/timer.c
> @@ -146,6 +146,8 @@ timer_adjust_bits(u32 value, u32 validbits)
>  static u32
>  timer_read(void)
>  {
> +    if (!GET_GLOBAL(HaveRunPost))
> +        panic("timer read too early\n");
>      u16 port = GET_GLOBAL(TimerPort);
>      if (CONFIG_TSC_TIMER && !port)
>          // Read from CPU TSC
> 
> With that, I get a panic on every reboot.  Commenting out the udelay()
> calls then gets reboots working again.
> 
> The simplest fix is to remove the udelay() calls from the reboot
> logic.  However, it's not clear to me if it's valid to omit the delays
> when running on real hardware.

Yes, it would be scary to modify that because then you'd be stuck
with not quite knowing how many hardware implementations were fussy
about it - although as you say it's dependent on that read before
the timer is configured, so it's not safe anyway.

Wasn't one of the old tricks for a delay to repeatedly read an ISA location
on the assumption if the hardware was old then it would be slow and if
the hardware wasn't as old it probably wouldn't be as fussy about
the delay?

> Another possibility would be to try
> and harden timer_read() so that it can be run very early (prior to
> code relocation), but that's tricky to implement correctly..

I guess I could hack around it for my use by just removing the delay;
Would preserving TimerLast in i8042_reboot/pci_reboot around the udelays
work?

> Good find Dave - I know that wasn't easy to track down.

Thanks; as heisenbugs go it's a pretty nasty one; still learnt a bit
about the innards of seabios :-)

Dave

> -Kevin
--
Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK



More information about the SeaBIOS mailing list