[OpenBIOS] Analysis of current Solaris 8 boot failure on SPARC32
mark.cave-ayland at siriusit.co.uk
Mon Jan 3 14:51:27 CET 2011
On 02/01/11 11:14, Mark Cave-Ayland wrote:
>> The kernel stack is overflown. Perhaps some recursive loop (iterating
>> device tree, since this doesn't happen on real hardware?) never exits,
>> or maybe OpenBIOS consumes kernel stack much more than OBP.
> Yeah, that's the conclusion I came to although I'm not really familiar
> with the overall Solaris boot process to figure out what should happen
> as opposed to what does happen.
After some fiddling with the Solaris ISO, I extracted the SS-5 kernel,
loaded its symbols into gdb, and tried stepping through various parts
of the boot to see what is happening.
Stepping through the code manually, it looks like we're going through
the following function names:
Moving further, it's a little difficult to tell, but it looks as if we're
dying in some kind of MMU setup here:
#0 0xf007057c in page_numtopp_nolock ()
#1 0xf005517c in load_l3 ()
#2 0xf0054e7c in load_l2 ()
#3 0xf024429c in rootfs ()
#4 0xf024429c in rootfs ()
I added a breakpoint at 0xf02442a0 and that wasn't reached before the
fatal trap fired. Taking a look at these routines in the OpenSolaris
source, it looks like fix_prom_pages() does some interesting things with
memory lists to work out which parts of memory are already mapped, and
so my current suspicion is that the memory lists are somehow wrong.
Does anyone know whether Solaris 8 uses the romvec v0 memlist arrays or
whether it uses the relevant properties read directly from the
/virtual-memory and /memory nodes?
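As a rough illustration of the kind of sanity check that could be run on
the lists, here is a minimal sketch in C. It is not taken from the Solaris
or OpenBIOS sources: struct memlist_v0 is a simplified, hypothetical
stand-in for a v0-style memory list node (a linked list of start/size
pairs, with addresses held as plain 32-bit integers so it runs on any
host), and all function names are made up for this example. It checks
that no two regions in a list overlap and that every "available" region
is backed by a "total physical" region.

```c
#include <assert.h>   /* for assert() in caller-side checks */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified v0-style memory list node. */
struct memlist_v0 {
    struct memlist_v0 *next;   /* next region, or NULL at end of list */
    uint32_t start;            /* first byte of the region */
    uint32_t size;             /* length of the region in bytes */
};

/* Sum of all region sizes in a list. */
uint64_t memlist_total(const struct memlist_v0 *head)
{
    uint64_t total = 0;
    for (; head != NULL; head = head->next)
        total += head->size;
    return total;
}

/* Return 1 if any two regions in the list overlap, else 0. */
int memlist_has_overlap(const struct memlist_v0 *head)
{
    const struct memlist_v0 *a, *b;
    for (a = head; a != NULL; a = a->next) {
        uint64_t a_end = (uint64_t)a->start + a->size;
        for (b = a->next; b != NULL; b = b->next) {
            uint64_t b_end = (uint64_t)b->start + b->size;
            if (a->start < b_end && b->start < a_end)
                return 1;
        }
    }
    return 0;
}

/* Return 1 if [start, start + size) fits inside a single region of list. */
int memlist_contains(const struct memlist_v0 *list,
                     uint32_t start, uint32_t size)
{
    uint64_t end = (uint64_t)start + size;
    for (; list != NULL; list = list->next) {
        uint64_t lo = list->start;
        uint64_t hi = lo + list->size;
        if (start >= lo && end <= hi)
            return 1;
    }
    return 0;
}

/* Return 1 if every region in avail lies inside some region of totphys. */
int memlist_subset(const struct memlist_v0 *avail,
                   const struct memlist_v0 *totphys)
{
    for (; avail != NULL; avail = avail->next)
        if (!memlist_contains(totphys, avail->start, avail->size))
            return 0;
    return 1;
}
```

If the lists handed to the kernel failed a check like this (overlapping
entries, or "available" memory that is not part of physical memory), that
would be consistent with the suspicion above, although it would not prove
it.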
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
t: +44 870 608 0063
Sirius Labs: http://www.siriusit.co.uk/labs