[SeaBIOS] varlow/extrastack vs code

Dr. David Alan Gilbert dgilbert at redhat.com
Tue Feb 14 15:50:23 CET 2017


* Dr. David Alan Gilbert (dgilbert at redhat.com) wrote:
> * Kevin O'Connor (kevin at koconnor.net) wrote:
> > On Mon, Jan 23, 2017 at 06:49:07PM +0000, Dr. David Alan Gilbert wrote:
> > > * Laszlo Ersek (lersek at redhat.com) wrote:
> > > > On 01/23/17 16:49, Kevin O'Connor wrote:
> > > > > On Mon, Jan 23, 2017 at 11:11:02AM +0100, Laszlo Ersek wrote:
> > > > >> On 01/20/17 20:39, Dr. David Alan Gilbert wrote:
> > > > >>> * Kevin O'Connor (kevin at koconnor.net) wrote:
> > > > >>>> On Fri, Jan 20, 2017 at 06:40:44PM +0000, Dr. David Alan Gilbert wrote:
> > > > >>>>> Hi,
> > > > >>>>>   I turned the debug level up to 4 on our smaller (128k) ROM downstream
> > > > >>>>> build and seem to have hit a case where it's been laid out so that the
> > > > >>>>> 'ExtraStack' is at the same location as some code (display_uuid) which
> > > > >>>>> was causing some very random behaviour;
> > > > > [...]
> > > > >> Would this be consistent with a stack overflow?
> > > > >>
> > > > >> See commit 46b82624c95b951e8825fab117d9352faeae0ec8. Perhaps
> > > > >> BUILD_EXTRA_STACK_SIZE (2KB) is too small now?
> > > > > 
> > > > > The ExtraStack isn't used at the point Dave reports the problem -
> > > > > display_uuid() is part of the init phase and that happens on the main
> > > > > "post" stack.
> > > > > 
> > > > > [...]
> > > > >> (This is based off 1.9.1)
> > > > > 
> > > > > I missed that earlier - there were some important fixes post 1.9.1 wrt
> > > > > reboots.  Commits b837e68d / a48f602c2 could explain the issue.  I'd
> > > > > make sure the issue is still present on the latest version.
> > > > 
> > > > That's a very promising hunch -- b837e68d explicitly mentions "reboot
> > > > loop" in the subject. It seems that Dave didn't mention any RHBZ numbers
> > > > in his email, but we have two somewhat similar bug reports (which I hope
> > > > share a root cause) and the second report triggers the issue with a
> > > > reboot loop specifically.
> > > > 
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=1411275
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=1382906
> > > > 
> > > > (Apologies that the 2nd RHBZ is not public; it's currently filed for the
> > > > RH kernel, and those BZs default to private. :/)
> > 
> > That first report mentions migration to a different QEMU version.
> > When migrating, is the BIOS software migrated as well (the copy at
> > 0xffff0000), or does the new instance get a potentially different
> > instance of the BIOS?
> 
> This case isn't with migration; as soon as I started debugging that
> I found I could hit it with just a simple reboot loop in the guest.
> (and migration should give you exactly the same ROM image as the
> source, because the read-only blocks get migrated and used rather than
> the local copy on the destination - until you power the VM down and
> restart QEMU, at which point it rereads the local file).
> 
> > > > CC'ing DavidH too, for RHBZ#1382906.
> > > 
> > > Yeh, it's looking promising; I've done a build with low debug that
> > > survived 50+ reboots, then turned my debug on and it's gone 20 so far,
> > > so that's pretty good.
> > > 
> > > However, reading the commits I'm a little confused.
> > > 
> > > I don't seem to have hit any cases where it's taken the shutdown case after
> > > failing to reboot; so it's not that path.
> > > 
> > > My reboots in this case are always guest triggered, so they're not very
> > > early reboots.
> > 
> > Both of those seabios fixes are for reboots that occur while
> > processing a reboot.  Any chance the guest tries multiple reboot
> > signals and one of them gets delayed?
> 
> It's possible; I'll have to add some qemu debug to watch for that;
> the guest I'm using is a rhel 6.9-ish image - so old kernel
> with lots of patches; but I'll try and see what it's doing.
> I've just got a loop that waits for the VM to boot, sees the login/password
> prompt on serial console, and as soon as it gets a shell issues 'reboot'.
> 
> > > One comment in there is:
> > >    +        // Some old versions of KVM don't store a pristine copy of the
> > >    +        // BIOS in high memory.  Try to shutdown the machine instead.
> > > 
> > > do you have a definition of 'old'?  In this case it's a new-ish qemu
> > > on our downstream (older) kernel but it's got fairly new kvm bits in,
> > > but the qemu is configured in our rhel6 compatibility mode - so hmm.
> > 
> > I don't have the kvm version handy, but it's really old.  You're
> > definitely not on that version, or every reboot would result in a
> > shutdown instead.
> 
> OK, thanks.


I've been sporadically carrying on debugging this, and I
think I have a bit more understanding, but not quite all the way there.

I'm pretty sure we are trying to run code that's been overwritten
by variable data - which I believe to be TimerLast - and I'm also
fairly sure the problem is not a delayed reset from multiple reboot signals.

Some detail:
a) I added some debug to qemu to print what triggered the reboot by
  always passing qemu_system_reset_request a string; the kernel
  is always rebooting via KBD_CCMD_RESET; I see that enter the BIOS,
  and then SeaBIOS issues KBD_CCMD_RESET again - which I think
  is the one coming from handle_resume32.
  But then I noticed that every so often we got a reboot from a KVM_SHUTDOWN,
  which is apparently a triple fault or the like.

b) With a KVM trace I got the addresses the fault was happening at,
  and it was always an address in the copy of the BIOS around bffbcf..,
  although the actual failure jumped about a bit. Taking the:
    Relocating init from 0x000e4810 to 0xbffb2050 (size 57120)
  I worked the fault address back to 'ef799' and from the objdump
  found that was an address in maininit but also corresponded to
  TimerLast:
   000eebdf l     F .text  00001954 maininit
   000ef780 g       *ABS*  00000000 final_varlow_start
   000ef780 g     O *ABS*  00000004 BootSequence
   000ef784 g     O *ABS*  00000001 FloppyDOR
   000ef788 g     O *ABS*  00000008 LastUSBkey
   000ef790 g     O *ABS*  00000001 Ps2ctr
   000ef794 g     O *ABS*  00000004 RTCusers
  >000ef798 g     O *ABS*  00000004 TimerLast
   000ef79c g     O *ABS*  00000001 Century

                             |>    ef78d:       75 07                   jne    ef796 <RTCusers+0x2>
                             |>    ef78f:       e8 fc 3c 00 00          call   f3490 <yield>
                             |>    ef794:       eb e4                   jmp    ef77a <maininit+0xb9b>
               bffbcfd6      |>    ef796:       b0 80                   mov    $0x80,%al
                         *   |>    ef798:       e6 70                   out    %al,$0x70
                         *   |>    ef79a:       e4 71                   in     $0x71,%al
                             |>    ef79c:       88 c2                   mov    %al,%dl
                             |>    ef79e:       b0 82                   mov    $0x82,%al
                             |>    ef7a0:       e6 70                   out    %al,$0x70
                             |>    ef7a2:       e4 71                   in     $0x71,%al

  And so I started dumping out blocks of RAM around there; the only
  bytes I've seen change are the TimerLast bytes (and only 3 of those);
  the contents in the ROM area at fffef7... always stay the same:

  from fffef790:
  + : 0090:  fc 3c 00 00  eb e4 b0 80  e6 70 e4 71  88 c2 b0 82 .<.......p.q....

Here are dumps from the high area it was running from when it crashed:
bffbcf..
                                       vvvvvvvv
  : : 00d0:  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00 ................
  : : 00d0:  bc 64 13 40  eb e4 b0 80  e6 70 e4 71  88 c2 b0 82 .d.@.....p.q....
  : : 00d0:  bc 64 13 40  eb e4 b0 80  e6 70 e4 71  88 c2 b0 82 .d.@.....p.q....
  : : 00d0:  bc 64 13 40  eb e4 b0 80  04 00 e6 71  88 c2 b0 82 .d.@.......q....
  : : 00d0:  bc 64 13 40  eb e4 b0 80  04 00 e6 71  88 c2 b0 82 .d.@.......q....
  : : 00d0:  bc 64 13 40  eb e4 b0 80  00 fa e5 71  88 c2 b0 82 .d.@.......q....
  : : 00d0:  bc 64 13 40  eb e4 b0 80  00 fa e5 71  88 c2 b0 82 .d.@.......q....
  : : 00d0:  bc 64 13 40  eb e4 b0 80  da d4 e5 71  88 c2 b0 82 .d.@.......q....
  : : 00d0:  bc 64 13 40  eb e4 b0 80  da d4 e5 71  88 c2 b0 82 .d.@.......q....
  : : 00d0:  bc 64 13 40  eb e4 b0 80  54 50 e6 71  88 c2 b0 82 .d.@....TP.q....
  : : 00d0:  bc 64 13 40  eb e4 b0 80  54 50 e6 71  88 c2 b0 82 .d.@....TP.q....

  and those 3 bytes correspond to TimerLast, and they're changing all over the place.

I also dumped memory around ef790, and it's also seeing the change:
e00000..
  > : f790:  fc 3c 00 00  eb e4 b0 80  e6 70 e4 71  88 c2 b0 82 .<.......p.q....
  > : f790:  61 66 90 90  00 00 00 00  3f 3d f9 00  20 66 90 90 af......?=.. f..
  > : f790:  fc 3c 00 00  eb e4 b0 80  04 00 e6 71  88 c2 b0 82 .<.........q....
  > : f790:  61 66 90 90  00 00 00 00  54 ee f0 00  20 66 90 90 af......T... f..
  > : f790:  fc 3c 00 00  eb e4 b0 80  00 fa e5 71  88 c2 b0 82 .<.........q....
  > : f790:  61 66 90 90  00 00 00 00  3f ca af 01  20 66 90 90 af......?... f..
  > : f790:  fc 3c 00 00  eb e4 b0 80  da d4 e5 71  88 c2 b0 82 .<.........q....
  > : f790:  61 66 90 90  00 00 00 00  c4 d8 a8 01  20 66 90 90 af.......... f..
  > : f790:  fc 3c 00 00  eb e4 b0 80  54 50 e6 71  88 c2 b0 82 .<......TP.q....
  > : f790:  61 66 90 90  00 00 00 00  d5 3f 67 01  20 66 90 90 af.......?g. f..
  > : f790:  fc 3c 00 00  eb e4 b0 80  b8 b2 e5 71  88 c2 b0 82 .<.........q....
  > : f790:  30 66 90 90  00 00 00 00  00 00 00 00  00 66 90 90 0f...........f..
  > : f790:  fc 3c 00 00  eb e4 b0 80  be b8 e5 71  88 c2 b0 82 .<.........q....

  but I'm assuming that's the real variable data some of the time and the
ROM copy at other times (this was dumped with cpu_memory_rw_debug in qemu,
which says it uses physical addresses).

I've not quite got my head around the ROM/writeable switch for the copy
around e00000.
So is this one of the timer routines being called before the code is copied
to bffb.... ?

Still, this does explain why it's such a heisenbug; from a build point of
view it requires TimerLast to overlay some critical bit of code that runs
on every boot (which probably only happens on 128k builds), and then
it's pot luck what happens based on the timer value: a lot of the time it's
nothing bad and you don't notice; sometimes you triple fault and
the reset just takes a bit longer since it does an extra reset or two;
and if you get really unlucky you hang or get a KVM error.

Dave


> Dave
> 
> > -Kevin
> --
> Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK
> 
--
Dr. David Alan Gilbert / dgilbert at redhat.com / Manchester, UK


