Hello -
First time poster, so take it easy on me :)
This is a great project -- I was able to get a kvm+coreboot+SeaBIOS environment going pretty easily. I started with the master branch of coreboot and went from there.
I am having a problem trying to load a Linux kernel+initramfs payload from SeaBIOS.
I can successfully boot the same kernel+initramfs straight from coreboot (without SeaBIOS) as a payload. Also I can boot the same kernel+initramfs from the hard disk using SeaBIOS when GRUB is installed on the hard disk. So I'm pretty sure my kernel+initramfs is OK.
My rom looks like this:
coreboot.rom: 16384 kB, bootblocksize 1416, romsize 16777216, offset 0x0 alignment: 64 bytes
Name                           Offset     Type         Size
cmos_layout.bin                0x0        cmos_layout  1160
fallback/romstage              0x4c0      stage        19569
fallback/ramstage              0x5180     stage        55849
config                         0x12c00    raw          3132
fallback/payload               0x13880    payload      57928
vgaroms/sgabios.bin            0x21b00    raw          4096
etc/boot-menu-wait             0x22b40    raw          8
etc/boot-menu-key              0x22b80    raw          8
etc/boot-menu-message          0x22bc0    raw          34
etc/screen-and-debug           0x22c40    raw          8
img/LINUX                      0x22c80    payload      8474826
(empty)                        0x837d80   null         8158360
When I break into SeaBIOS during the boot I get this menu:
Select boot device:
1. DVD/CD [ata1-0: QEMU DVD-ROM ATAPI-4 DVD/CD]
2. Virtio disk PCI:0:4
3. Legacy option rom
4. iPXE (PCI 00:03.0)
5. Payload [LINUX]
When I select "5" I get this:
Searching bootorder for: HALT
drive 0x000f4d50: PCHS=16383/16/63 translation=lba LCHS=1024/255/63 s=33554432
Running option rom at cc80:0003
Booting from CBFS...
Run img/LINUX
Calling addr 0x00040000
...
[System Reboots]
I turned on some debug in coreboot and SeaBIOS and compared the successful boot (coreboot+Linux) to the unsuccessful boot (coreboot+SeaBIOS+Linux). I compared the segment offsets and sizes between the two cases and they were the same. Also the 0x00040000 entry point was the same.
I am mostly stuck now. Is this even a supported boot option? I'm new to SeaBIOS, so any help is much appreciated.
Thanks again for a great project -- I had pretty smooth sailing up to this point.
Cheers, Curt
On Fri, Aug 15, 2014 at 03:38:30PM -0700, Curt Brune wrote:
[...]
Nothing from your description sounds like it should cause a problem. I recommend increasing the SeaBIOS debug level to 8 and posting the full log.
-Kevin
Hello Kevin -
On Sat Aug 16 09:51, Kevin O'Connor wrote:
> Nothing from your description sounds like it should cause a problem. I recommend increasing the SeaBIOS debug level to 8 and posting the full log.
Thanks for your help and consideration.
The full log is attached.
Interactively I pressed the <ESC> key to enter the SeaBIOS menu and then I selected boot device 5 "Payload [LINUX]". After that the system reboots.
Cheers, Curt
On Mon, Aug 18, 2014 at 08:51:50AM -0700, Curt Brune wrote:
[...]
Hi Curt,
Unfortunately, I don't see anything in the log that looks suspicious. Since you are using qemu/kvm, you could try attaching a debugger (if you are familiar with gdb).
There's documentation online and in the SeaBIOS' README file. In a nutshell, run qemu with "-s -S" and then run "gdb" in another terminal and send it "target remote localhost:1234" along with something like "br *0x40000" to set a break point at the start of the payload.
Another option would be to try with a known working payload and verify if you can get SeaBIOS to launch it.
-Kevin
Hello Kevin -
On Mon Aug 18 14:56, Kevin O'Connor wrote:
[...]
Thanks for your time and suggestions.
As a control I tried using the libpayload example (http://www.coreboot.org/Libpayload) as a SeaBIOS payload and that worked fine. So SeaBIOS seems fine.
I went down the gdb route and found some interesting results, but still not sure what is wrong. It has been educational :)
Using gdb I worked my way through coreboot and SeaBIOS loading and into cbfs_run_payload(). cbfs_run_payload() loads the payload OK and jumps to the trampoline @ 0x40000 fine. The trampoline does various things and eventually jumps to the kernel code @ 0x1000000. In gdb this looks like:
=> 0x1000000: cld
   0x1000001: test BYTE PTR [esi+0x211],0x40
   0x1000008: jne 0x1000016
   0x100000a: cli
   0x100000b: mov eax,0x18
   0x1000010: mov ds,eax
   0x1000012: mov es,eax
   0x1000014: mov ss,eax <--- CPU resets executing this
   0x1000016: lea esp,[esi+0x1e8]
   0x100001c: call 0x1000021
The source code for this location comes from the kernel file linux/arch/x86/boot/compressed/head_64.S
I found executing the "mov ss,eax" instruction @ 0x1000014 would cause the virtual machine to reboot. I presume some kind of stack/memory exception happened and the exception handler resets the CPU.
Just before executing that instruction the CPU registers look like this (from the qemu monitor):
(qemu) info registers
EAX=00000018 EBX=1ffd6610 ECX=00000000 EDX=00000000
ESI=00090000 EDI=00090334 EBP=00040070 ESP=00006fa4
EIP=01000014 EFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0018 000f0000 0000ffff 00009b00 DPL=0 CS16 [-RA]
CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0018 000f0000 0000ffff 00009b00 DPL=0 CS16 [-RA]
FS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
GS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT= 000f4b78 00000037
IDT= 000f4bb6 00000000
CR0=00000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
So setting SS from 0x10 to 0x18 caused the exception. Of note the DS and ES registers were set from 0x10 to 0x18 without any problem.
Now armed with qemu/gdb knowledge I went through my working case, where my kernel is loaded directly by coreboot. SeaBIOS is not used in this case. My kernel boots fine in this case.
Just before executing the troublesome instruction @ 0x1000014 the CPU registers look like this (from the qemu monitor) in the working case:
(qemu) info registers
EAX=00000018 EBX=1ffd6610 ECX=00000000 EDX=00000000
ESI=00090000 EDI=00090334 EBP=00127ff0 ESP=1ffd0f88
EIP=01000014 EFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0018 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
CS =0010 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0018 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
DS =0018 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
FS =0018 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
GS =0018 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT= 1ffec000 00000047
IDT= 001218c4 0000009f
CR0=60000011 CR2=00000000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
Here the interesting thing is that DS, ES and SS *already* contained 0x18. In the SeaBIOS case those registers contained 0x10 before the Linux kernel set them up.
That's all I got. Any clues?
Cheers, Curt
On 08/27/14 23:17, Curt Brune wrote:
[...]
I'll probably make a clown out of myself, but I'll just blurt out that the direct value of SS (0x18) is quite meaningless, it's just a segment *selector* value. What is interesting is the contents of the GDT entry, at byte offset 0x18 (which corresponds to the fourth GDT entry).
In your first register dump, you have
GDT= 000f4b78 00000037
in the 2nd register dump, you have
GDT= 1ffec000 00000047
The second value in each case means the maximum (inclusive) byte offset in the GDT, so 0x18 itself should be okay in both cases.
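The bounds check described above can be sketched in Python. This is only an illustration, not code from SeaBIOS or QEMU; the helper names `split_selector` and `selector_in_table` are made up for this example. It splits a selector into index, table indicator and RPL, then checks that the last byte of the 8-byte descriptor lies at or below the table limit.

```python
def split_selector(selector):
    """Split an x86 segment selector into (index, table_indicator, rpl)."""
    return selector >> 3, (selector >> 2) & 1, selector & 3


def selector_in_table(selector, limit):
    """True if the 8-byte descriptor for `selector` fits within `limit`.

    `limit` is the inclusive maximum byte offset, i.e. the second
    field of the GDT= line in the register dump.
    """
    index, _ti, _rpl = split_selector(selector)
    return index * 8 + 7 <= limit


if __name__ == "__main__":
    print(split_selector(0x18))            # index 3 is the fourth entry
    print(selector_in_table(0x18, 0x37))   # resetting case
    print(selector_in_table(0x18, 0x47))   # working case
```

Both calls should report that selector 0x18 is inside the table, so the limit check alone does not explain the reset.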
Can you hex-dump the guest memory too, in both cases, starting from 0xf4b78+0x18, and from 0x1ffec000+0x18, respectively, for 0x8 bytes? This would provide the segment descriptor in each case that the selector value 0x18 references.
In the resetting case, the segment descriptor referenced by selector value 0x18 is probably suitable for data segments, but inappropriate for the stack segment.
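Once the 8 bytes are dumped, they could be decoded along these lines. This is a minimal sketch of the standard x86 descriptor layout; the function name `decode_descriptor` and the dictionary keys are assumptions for illustration. The two sample byte strings are hypothetical: they are rebuilt from the base, limit and attribute fields shown in the register dumps, not read from guest memory.

```python
import struct


def decode_descriptor(raw):
    """Decode an 8-byte x86 segment descriptor given in memory order."""
    if len(raw) != 8:
        raise ValueError("a segment descriptor is exactly 8 bytes")
    low, high = struct.unpack("<II", raw)
    access = (high >> 8) & 0xFF
    is_code = bool(access & 0x08)
    return {
        "base": (low >> 16) | ((high & 0xFF) << 16) | (high & 0xFF000000),
        "limit": (low & 0xFFFF) | (high & 0x000F0000),  # pages if granularity set
        "granularity_4k": bool(high & (1 << 23)),
        "default_32bit": bool(high & (1 << 22)),
        "present": bool(access & 0x80),
        "dpl": (access >> 5) & 3,
        "is_code": is_code,
        # Bit 1 means "readable" for code and "writable" for data segments.
        "writable_data": bool(access & 0x10) and not is_code and bool(access & 0x02),
    }


if __name__ == "__main__":
    # Hypothetical bytes reconstructed from the register dump fields.
    resetting = bytes.fromhex("ffff00000f9b0000")  # base 0xf0000, 16-bit code
    working = bytes.fromhex("ffff00000093cf00")    # flat 4G read/write data
    for name, raw in (("resetting", resetting), ("working", working)):
        print(name, decode_descriptor(raw))
```

The field that matters for an SS load is `writable_data`, which is false for any code segment regardless of its base and limit.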
Thanks Laszlo /ducks
On Thu, Aug 28, 2014 at 01:12:48AM +0200, Laszlo Ersek wrote:
> On 08/27/14 23:17, Curt Brune wrote:
>> GDT= 000f4b78 00000037
> [...]
> I'll probably make a clown out of myself, but I'll just blurt out that the direct value of SS (0x18) is quite meaningless, it's just a segment *selector* value. What is interesting is the contents of the GDT entry, at byte offset 0x18 (which corresponds to the fourth GDT entry).
> In your first register dump, you have
> GDT= 000f4b78 00000037
Oh, I missed that the GDT address was actually in the register dump. It's definitely wrong for that code to be using the SeaBIOS' gdt. So, something neglected to call "lgdt". Only question is whether it's the bootstrap or Linux itself that was supposed to do it.
-Kevin
Laszlo Ersek [mailto:lersek@redhat.com] wrote:
[snip]
]Can you hex-dump the guest memory too, in both cases, starting from
]0xf4b78+0x18, and from 0x1ffec000+0x18, respectively, for 0x8 bytes?
]This would provide the segment descriptor in each case that the selector
]value 0x18 references.
I thought of the same, but then saw that gdb is actually dumping those:
0018 000f0000 0000ffff 00009b00 DPL=0 CS16 (causes reset)
0018 00000000 ffffffff 00c09300 DPL=0 DS (works)
]In the resetting case, the segment descriptor referenced by selector
]value 0x18 is probably suitable for data segments, but inappropriate for
]the stack segment.
Exactly. In the passing case the selector refers to a read/write data segment. In the failing case it refers to an execute/read code segment. A readable code segment is acceptable in some segment registers, which is why the DS and ES loads went through, but loading the stack segment register has some special checks:
IF SS is loaded
  THEN
    IF segment selector is NULL
      THEN #GP(0); FI;
    IF segment selector index is outside descriptor table limits
       or segment selector's RPL ≠ CPL
       or segment is not a writable data segment <=========
       or DPL ≠ CPL
      THEN #GP(selector); FI;
...
So a GP fault. Weird that the load of the new GDT is getting skipped, but stranger things have happened.
Thanks,
Scott
]Thanks
]Laszlo
]/ducks
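The SS checks quoted above can be replayed against the attribute words from the two register dumps. This is a rough sketch, not an emulator: the function name `ss_load_fault` is invented for this example, it only covers the conditions quoted above, and it assumes the selector refers to the GDT. With the IDT limit of 0 shown in the resetting dump, the fault presumably cannot be delivered to a handler, which would fit the reset that was observed.

```python
def ss_load_fault(selector, gdt_limit, flags, cpl=0):
    """Return the fault an SS load would raise, or None if it passes.

    `flags` is the attribute word as printed in the register dump
    (for example 0x00c09300); only bits 8-15 are examined.
    """
    if (selector & ~3) == 0:
        return "#GP(0)"                      # NULL selector
    access = (flags >> 8) & 0xFF
    in_limits = (selector >> 3) * 8 + 7 <= gdt_limit
    rpl = selector & 3
    dpl = (access >> 5) & 3
    # S bit set and executable bit clear means a data segment;
    # bit 1 of the access byte makes it writable.
    writable_data = bool(access & 0x10) and not (access & 0x08) and bool(access & 0x02)
    if not in_limits or rpl != cpl or not writable_data or dpl != cpl:
        return "#GP(0x%x)" % selector
    return None


if __name__ == "__main__":
    print(ss_load_fault(0x18, 0x37, 0x00009B00))  # resetting case: CS16
    print(ss_load_fault(0x18, 0x47, 0x00C09300))  # working case: DS
```

The first call reports a fault only because of the "writable data segment" condition; the limit, RPL and DPL checks all pass for both descriptors.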
Kevin, Laszlo, Scott -
On Wed Aug 27 21:41, Scott Duplichan wrote:
> [snip]
> I thought of the same, but then saw that gdb is actually dumping those:
> 0018 000f0000 0000ffff 00009b00 DPL=0 CS16 (causes reset)
> 0018 00000000 ffffffff 00c09300 DPL=0 DS (works)
Thanks all for your help, insight and education. It definitely is the segment descriptor referenced by selector 0x18 that is the problem. I will raise the issue on the coreboot list, but it looks to me that the cbfstool and the Linux trampoline it provides is not quite right.
The Linux kernel is pretty clear about its expectations about the segments from selector 0x18 -- from linux/Documentation/x86/boot.txt:
At entry, [...] a GDT must be loaded with the descriptors for selectors __BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat segment; __BOOT_CS must have execute/read permission, and __BOOT_DS must have read/write permission; CS must be __BOOT_CS and DS, ES, SS must be __BOOT_DS;
The ramstage of coreboot just happens to set up the descriptor at selector 0x18 just so. The trampoline provided by the cbfstool for Linux payloads does not adjust the descriptor at selector 0x18.
From gdb I was able to poke in a valid descriptor for selector 0x18 and off it went without a hitch.
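For reference, descriptors that satisfy the boot.txt requirements quoted above can be built like this. This is a sketch under assumptions: `encode_descriptor`, `BOOT_CS` and `BOOT_DS` are names chosen for this example, and the access bytes 0x9b and 0x93 (accessed bit already set) are picked to match the attribute words in the working dump. It is not necessarily the exact value that was poked in from gdb.

```python
import struct


def encode_descriptor(base, limit, access, flags):
    """Pack an 8-byte x86 segment descriptor in memory (little-endian) order.

    `limit` is the 20-bit limit field, `access` the access byte, and
    `flags` the 4-bit nibble holding G, D/B, L and AVL.
    """
    low = (limit & 0xFFFF) | ((base & 0xFFFF) << 16)
    high = (
        ((base >> 16) & 0xFF)
        | (access << 8)
        | (limit & 0xF0000)
        | (flags << 20)
        | (base & 0xFF000000)
    )
    return struct.pack("<II", low, high)


# 4G flat segments: base 0, limit 0xFFFFF pages, G=1 and D/B=1 (flags 0xC).
BOOT_CS = encode_descriptor(0, 0xFFFFF, 0x9B, 0xC)  # selector 0x10, execute/read
BOOT_DS = encode_descriptor(0, 0xFFFFF, 0x93, 0xC)  # selector 0x18, read/write

if __name__ == "__main__":
    print("0x10:", BOOT_CS.hex())
    print("0x18:", BOOT_DS.hex())
```

A loader that wants to meet the protocol would place these at byte offsets 0x10 and 0x18 of its own table and load that table with lgdt before jumping to the kernel, rather than relying on whatever table the previous stage left behind.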
Thanks again.
Cheers, Curt
On Wed, Aug 27, 2014 at 02:17:55PM -0700, Curt Brune wrote:
> Just before executing that instruction the CPU registers look like this (from the qemu monitor):
> (qemu) info registers
> EAX=00000018 EBX=1ffd6610 ECX=00000000 EDX=00000000
> ESI=00090000 EDI=00090334 EBP=00040070 ESP=00006fa4
> EIP=01000014 EFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0018 000f0000 0000ffff 00009b00 DPL=0 CS16 [-RA]
> CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
> SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> DS =0018 000f0000 0000ffff 00009b00 DPL=0 CS16 [-RA]
> FS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> GS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
Those segment descriptors (see CS16, CS32 above) look like they're using the SeaBIOS gdt. So, it looks like something is trying to change segments without having properly invoked the "lgdt" instruction.
[...]
> Just before executing the troublesome instruction @ 0x1000014 the CPU registers look like this (from the qemu monitor) in the working case:
> (qemu) info registers
> EAX=00000018 EBX=1ffd6610 ECX=00000000 EDX=00000000
> ESI=00090000 EDI=00090334 EBP=00127ff0 ESP=1ffd0f88
> EIP=01000014 EFL=00000046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0018 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> CS =0010 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
> SS =0018 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> DS =0018 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> FS =0018 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> GS =0018 00000000 ffffffff 00c09300 DPL=0 DS [-WA]
> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
In the above you see "DS" and "CS32" describing those descriptors. So, either a "lgdt" instruction was issued in this case, or a gdt was already present that just happened to be compatible with the above code.
> Here the interesting thing is that DS, ES and SS *already* contained 0x18. In the SeaBIOS case those registers contained 0x10 before the Linux kernel set them up.
> That's all I got. Any clues?
Sounds like either Linux, or whatever tool you used to generate the linux payload is at fault. Armed with the above, I'd ask on the coreboot mailing list.
-Kevin