[SeaBIOS] varlow/extrastack vs code

Kevin O'Connor kevin at koconnor.net
Mon Jan 23 20:30:27 CET 2017


On Mon, Jan 23, 2017 at 06:49:07PM +0000, Dr. David Alan Gilbert wrote:
> * Laszlo Ersek (lersek at redhat.com) wrote:
> > On 01/23/17 16:49, Kevin O'Connor wrote:
> > > On Mon, Jan 23, 2017 at 11:11:02AM +0100, Laszlo Ersek wrote:
> > >> On 01/20/17 20:39, Dr. David Alan Gilbert wrote:
> > >>> * Kevin O'Connor (kevin at koconnor.net) wrote:
> > >>>> On Fri, Jan 20, 2017 at 06:40:44PM +0000, Dr. David Alan Gilbert wrote:
> > >>>>> Hi,
> > >>>>>   I turned the debug level up to 4 on our smaller (128k) ROM downstream
> > >>>>> build and seem to have hit a case where it's been laid out so that the
> > >>>>> 'ExtraStack' is at the same location as some code (display_uuid) which
> > >>>>> was causing some very random behaviour;
> > > [...]
> > >> Would this be consistent with a stack overflow?
> > >>
> > >> See commit 46b82624c95b951e8825fab117d9352faeae0ec8. Perhaps
> > >> BUILD_EXTRA_STACK_SIZE (2KB) is too small now?
> > > 
> > > The ExtraStack isn't used at the point Dave reports the problem -
> > > display_uuid() is part of the init phase and that happens on the main
> > > "post" stack.
> > > 
> > > [...]
> > >> (This is based off 1.9.1)
> > > 
> > > I missed that earlier - there were some important fixes post 1.9.1 wrt
> > > reboots.  Commits b837e68d / a48f602c2 could explain the issue.  I'd
> > > make sure the issue is still present on the latest version.
> > 
> > That's a very promising hunch -- b837e68d explicitly mentions "reboot
> > loop" in the subject. It seems that Dave didn't mention any RHBZ numbers
> > in his email, but we have two somewhat similar bug reports (which I hope
> > share a root cause) and the second report triggers the issue with a
> > reboot loop specifically.
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=1411275
> > https://bugzilla.redhat.com/show_bug.cgi?id=1382906
> > 
> > (Apologies that the 2nd RHBZ is not public; it's currently filed for the
> > RH kernel, and those BZs default to private. :/)

That first report mentions migration to a different QEMU version.
When migrating, is the BIOS image migrated as well (the copy at
0xffff0000), or does the destination get its own, potentially
different, copy of the BIOS?

> > CC'ing DavidH too, for RHBZ#1382906.
> 
> Yeh, it's looking promising; I've done a build with low debug that
> survived for 50+ reboots and turned my debug on and it's going for 20 so far,
> so that's pretty good.
> 
> However, reading the commits I'm a little confused.
> 
> I don't seem to have hit any cases where it's taken the shutdown case after
> failing to reboot; so it's not that path.
> 
> My reboots in this case are always guest triggered, so they're not very
> early reboots.

Both of those SeaBIOS fixes are for reboots that occur while a
previous reboot is still being processed.  Any chance the guest tries
multiple reboot signals and one of them gets delayed?

> 
> One comment in there is:
>    +        // Some old versions of KVM don't store a pristine copy of the
>    +        // BIOS in high memory.  Try to shutdown the machine instead.
> 
> do you have a definition of 'old';  in this case it's a new-ish qemu
> on our downstream (older) kernel but it's got fairly new kvm bits in,
> but the qemu is configured in our rhel6 compatibility mode - so hmm.

I don't have the KVM version handy, but it's really old.  You're
definitely not on that version, or every reboot would result in a
shutdown instead.

-Kevin


