[OpenBIOS] Sun OBP bugs in 1.0RC1
pjcreath+openbios at gmail.com
Thu Feb 15 23:34:00 CET 2007
I've been trying to get old versions of SunOS to load under qemu. In
doing so, I've encountered a number of bugs in OBP. I'm not always
certain of the best fix, but I can at least provide a quick hack that
will get people farther along.
1) Error message: "kmem_alloc failed, nbytes 680"
Bug: obp_dumb_memalloc is a bit too dumb. It needs to pick an address
itself if passed a null address. The comment in the allocator in
OpenSolaris prom_alloc.c says:
"If virthint is zero, a suitable virt is chosen."
Quick fix: If passed a null address, start doling out addresses at
10MB and increment by size.
Shortcomings: The quick fix ignores the issue of free() and doesn't
remove memory from the virtual-memory/available node.
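The quick fix for #1 could look roughly like the sketch below. This is
not the actual OpenBIOS code: the helper name pick_alloc_va, the
ALLOC_BASE constant and the static counter are hypothetical, and the
real obp_dumb_memalloc would also have to map the memory it hands out.

```c
#include <stdint.h>

/* Hypothetical base for firmware-chosen addresses: 10MB, as in the quick fix. */
#define ALLOC_BASE 0x00a00000u   /* 10 * 1024 * 1024 */

/* Next virtual address to hand out when the caller gives no hint. */
static uint32_t next_free_va = ALLOC_BASE;

/*
 * Choose the virtual address for an allocation request.
 * A non-zero hint is honoured as-is; a zero hint means "pick one for me",
 * so we dole out addresses from ALLOC_BASE upward, incrementing by size.
 * Like the quick fix, this never frees anything and does not update the
 * virtual-memory/available property.
 */
static uint32_t pick_alloc_va(uint32_t virthint, uint32_t size)
{
    uint32_t va;

    if (virthint != 0)
        return virthint;

    va = next_free_va;
    next_free_va += size;
    return va;
}
```

With this in place, the 680-byte request from the error message above
would get 0x00a00000 and the next null-hint request would get
0x00a00000 + 680.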
After the quick fix, the boot gets farther, leading us to:
2) Error message: "Unhandled Exception 0x00000080"
Bug: Trap 0 (entry 0x80 in the table, i.e. syscall_trap_4x) is
undefined. This is because the SunOS bootloader installs the trap by
writing code in the trap table, but the trap table is in the .text
section of OpenBIOS. Thus the trap 0 handler simply jumps to "bug".
Quick fix: Move the trap table to the .data section. Insert a "b
entry; nop; nop; nop;" before "bug:".
Shortcomings: Requires the extra "b entry" code. Allows the only VM
copy of the trap table to be permanently changed. OpenBIOS should
copy the read-only trap table to read-write memory (and update %tbr).
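A rough sketch of the "copy the table and update %tbr" idea, in C. It
assumes the standard SPARC V8 layout (256 entries of four instructions
each, with the table base in the upper 20 bits of %tbr). The function
names are hypothetical, and actually writing %tbr needs a privileged
"wr" instruction that is not shown here.

```c
#include <stdint.h>
#include <string.h>

#define TRAP_ENTRIES     256u  /* SPARC V8 trap types 0x00..0xff */
#define TRAP_ENTRY_BYTES 16u   /* four 32-bit instructions per entry */
#define TRAP_TABLE_BYTES (TRAP_ENTRIES * TRAP_ENTRY_BYTES)  /* 4KB */

/* Byte offset of the handler slot for trap type tt within the table. */
static uint32_t trap_entry_offset(uint32_t tt)
{
    return (tt & 0xffu) * TRAP_ENTRY_BYTES;
}

/*
 * Copy the read-only trap table into writable memory.
 * ram_copy is where the firmware can write the copy; ram_va is the
 * (4KB-aligned) virtual address at which the CPU will see it.
 * Returns the value whose upper 20 bits should be loaded into %tbr.
 */
static uint32_t relocate_trap_table(void *ram_copy, const void *rom_table,
                                    uint32_t ram_va)
{
    memcpy(ram_copy, rom_table, TRAP_TABLE_BYTES);
    return ram_va & 0xfffff000u;
}
```

Under these assumptions, trap 0 (entry 0x80) sits at byte offset 0x800
of the table, which is the slot the SunOS bootloader tries to overwrite.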
3) #2 above actually exposes another bug. The write to the read-only
trap table does not cause an access violation -- instead, it silently
fails. The "std" instruction at 0x403e6c in the bootloader has no effect.
Bug: Uncertain. It could be a systemic bug in qemu, but it appears
that the VM's MMU believes that the page is writable. That means that
the VM's MMU is not having the access protection flags set for pages
mapped to ROM. It thinks everything is rwx.
Fix?: The VM's MMU should have the access protection flags properly
set for each ROM section. This should probably be done within
OpenBIOS. E.g., .text should be r-x, .data should probably be rwx,
This is the one fix I'm really not sure how to implement. Any
suggestions? This may be a problem that only affects this bootloader,
so fixing #2 above may be all that's strictly necessary. But I'm not
positive that this bug doesn't have other ill effects I haven't found.
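To make the question in #3 more concrete, here is a sketch of what
per-section protection could look like. It assumes the emulated machine
uses the SPARC V8 reference MMU (as on sun4m), where each page table
entry carries a 3-bit ACC field. The helper names are hypothetical and
this is not how OpenBIOS currently builds its mappings.

```c
#include <stdint.h>

/* SPARC V8 reference MMU page table entry fields. */
#define SRMMU_ET_PTE     0x2u   /* entry type: page table entry */
#define SRMMU_ACC_SHIFT  2
#define SRMMU_ACC_MASK   0x7u

/* ACC encodings that allow supervisor access only. */
#define SRMMU_ACC_S_RX   6u     /* supervisor read/execute, e.g. .text */
#define SRMMU_ACC_S_RWX  7u     /* supervisor read/write/execute, e.g. .data */

/* Build a PTE for the 4KB page at physical address pa with the given ACC. */
static uint32_t srmmu_pte(uint64_t pa, uint32_t acc)
{
    uint32_t ppn = (uint32_t)((pa >> 4) & 0xffffff00u); /* PA[35:12] -> PTE[31:8] */

    return ppn | ((acc & SRMMU_ACC_MASK) << SRMMU_ACC_SHIFT) | SRMMU_ET_PTE;
}

/* Non-zero if the PTE allows supervisor-mode stores (ACC 1, 3, 5 or 7). */
static int pte_supervisor_writable(uint32_t pte)
{
    uint32_t acc = (pte >> SRMMU_ACC_SHIFT) & SRMMU_ACC_MASK;

    return acc == 1u || acc == 3u || acc == 5u || acc == 7u;
}
```

If .text pages were mapped with an ACC like 6, a store to the trap
table should raise a fault instead of being silently dropped, assuming
the emulated MMU enforces the field.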
At any rate, fixing #2 gets us still further, to:
4) Error messages:
"obp_devopen(sd(0,0,0):d) = 0xffd8e270
obp_inst2pkg(fd 0xffd8e270) = 0xffd57f44
obp_getprop(0xffd57f44, device_type) (not found)"
Bug: The OpenBIOS "interpose" implementation is not transparent to
non-interposition-aware code (in violation of the interposition spec).
The inst2pkg call in this sequence returns the phandle for
/packages/misc-files, instead of the proper phandle.
Quick fix: Comment out the "interpose disk-label" lines in ob_sd_open.
Shortcomings: It disables disk-label. The correct fix is to fix the
underlying problem with interposition, but I'm not sure exactly what
it is. Could someone help?
Fixing #4 gets us quite a bit further, until:
5) Error message:
"Unhandled Exception 0x00000009
PC = 0xf0138b20 NPC = 0xf0138b24"
Bug: The instruction is trying to read from 0xfd020000+4, which is an
invalid address. This address isn't mapped by OBP by default on Sun
hardware, so the bootloader must be either (a) trying to map this address
and failing silently or (b) skipping the mapping for some reason. The
instruction is hard-coded to look at this absolute address.
Fix: Unknown. This may be another instance of writes silently
failing, hence my interest in #3 above. It could also be a
side-effect of the quick fix for #4.
I'm happy to work further on these fixes and put them into patch form.
Could someone point me to how I'd do that?