On Tue, Jan 18, 2022 at 9:05 PM Gabe Black <gabe.black@gmail.com> wrote:

Yep, that seems to fix it for me! Thanks!

Gabe

On Tue, Jan 18, 2022 at 1:09 PM Volker Rümelin <vr_qemu@t-online.de> wrote:
> Hi! I'm a major contributor to the gem5 open source computer
> architecture simulator, and I'm trying to get SeaBIOS and FreeDOS to
> run on it. We've had at least some level of x86 support on our
> simulator for a number of years now, but we've primarily focused on 64
> bit mode. I've found a lot of bugs in our simulator as I've been going
> along, but despite my best efforts I haven't been able to find a way
> to blame my code for the bug I'm currently stuck on.
>
> I'm using a very stripped down configuration of SeaBIOS, and am using
> the serial console to interact with it since I haven't written a
> simulated VGA interface yet. It reads FreeDOS from their published
> QEMU disk image, and it starts up and prints a menu where it wants me
> to select from 4 different boot modes. Roughly when I send a character
> over the serial connection, the simulator crashes because the software
> running on it tried to access an address that has nothing behind it.
>
>
> I've been tracing this problem down, and I see an int 0x16 with code
> 0x1 happening, which is trying to check the keyboard status, I
> believe. That goes along, and eventually calls check_irqs, which calls
> need_hop_back which returns 1, and then calls "stack_hop_back" to call
> back into itself but on the "original callers stack" I 75% know what
> that's talking about, but I'm not 100%.
>
> Anyway, once we're on the other stack, we call into the
> clock_poll_irq, that calls clock_update, that calls the (inlined I
> think) sercon_check_event, and when that tries to
> SET_LOW(rx_buf[rx_bytes], byte) the bad access happens. At least I'm
> pretty confident that's where it happens, it could also be in one of
> the other lines right around there.

This is probably a bug in src/sercon.c. rx_bytes is marked VARLOW. The
following code is not tested but I think

SET_LOW(rx_buf[GET_LOW(rx_bytes)], byte);

is correct.

I used objdump -drS -m i8086 rom16.o for the disassembly listing.
     f3d8:       8e c3                   mov    %bx,%es
         if (GET_LOW(rx_bytes) < sizeof(rx_buf)) {
     f3da:       26 8a 16 79 f1          mov    %es:-0xe87,%dl
     f3df:       80 fa 0f                cmp    $0xf,%dl
     f3e2:       77 e6                   ja     f3ca <ExtraStack+0x1f2>
             SET_LOW(rx_buf[GET_LOW(rx_bytes)], byte);
     f3e4:       66 0f b6 ca             movzbl %dl,%ecx
     f3e8:       26 67 88 81 7c f1 00    mov %al,%es:0xf17c(%ecx)
     f3ef:       00
             SET_LOW(rx_bytes, GET_LOW(rx_bytes) + 1);
     f3f0:       66 42                   inc    %edx
     f3f2:       26 88 16 79 f1          mov    %dl,%es:-0xe87
             count++;
     f3f7:       66 46                   inc    %esi
     f3f9:       eb cf                   jmp    f3ca <ExtraStack+0x1f2>

With best regards,
Volker

>
>
> The problem seems to be that the variable it's trying to access is
> supposed to be in the "e" segment, ie with selector 0xe000 and base
> address 0xe0000. The code that does this is here:
>
> fbd43: 8e c3 mov %bx,%es
> fbd45: 26 8a 16 9d f7 mov %es:-0x863,%dl
> fbd4a: 80 fa 0f cmp $0xf,%dl
> fbd4d: 77 e6 ja fbd35 <clock_update+0x6b>
> fbd4f: 66 0f b6 0e 9d f7 movzbl -0x863,%ecx <==== where it
> asplodes
> fbd55: 26 67 88 81 a0 f7 00 mov %al,%es:0xf7a0(%ecx)
>
> Note the comparison against 0xf, which I think is where it checks
> against the size of rx_buf.
>
> You can see here that this access is (I think) using the %ds register
> by default. It has an operand size prefix, and a 2 byte displacement
> of 0xf79d. Adding this to 0xe0000 gives 0xef79d, which from what I've
> seen is a pretty valid looking address, not far below where I have the
> BIOS ROM mapped in.
>
> Unfortunately when this has problems, %ds is actually 0x9d80, which
> gives a base of 0x9d800, which gives a linear address of 0xacf9d. This
> is in the middle of the (not yet implemented) VGA framebuffer which is
> why it dies.
>
>
> I then traced down why %ds has this value, and it's from the "hop
> back" step, specifically here:
>
> asm volatile(
> // Backup stack_pos and current %ss/%esp
> "movl %6, %4\n"
> "movw %%ss, %w3\n"
> "movl %%esp, %6\n"
> // Restore original callers' %ss/%esp
> "movl -4(%4), %5\n"
> "movl %5, %%ss\n" <======== Where %ss is set
> "movw %%ds:-8(%4), %%sp\n"
> "movl %5, %%ds\n" <======== Where %ds is set
> // Call func
>
> Note that in this code, *both* %ss and %ds are being set, and being
> set to the same thing. This value *was* successfully pulled off the
> saved data from when the int was originally called as far as I can
> tell, but this value of %ds does *not* seem to be correct, since the
> first time it's used it causes the bad access.
>
>
> Could you please help me figure out what's going wrong here? Is this
> supposed to work out somehow, and my simulator is just wrong (my bet,
> but what's it doing wrong?), or is this a bug in SeaBIOS? Am I using
> SeaBIOS in some way it's known not to work?
>
> Please let me know if you need any other info, I'll be more than happy
> to get this sorted out!
>
> Gabe
>
> _______________________________________________
> SeaBIOS mailing list -- seabios@seabios.org
> To unsubscribe send an email to seabios-leave@seabios.org