Re: [OpenBIOS] More work on Solaris 8 SPARC32 crash

14 Feb 2011

On 13/02/11 22:17, Tarl Neustaedter wrote:
Hi Tarl,
...
...
Incidentally if I also enable romvec debugging in OpenBIOS this is
what I get on the console just before the crash:
vac: enabled in write through mode
mem = 131072K (0x8000000)
avail mem = 110419968
obp_nextnode(0x0) = 0xffd4527c
obp_proplen(0xffd4527c, reg) (not found)
obp_proplen(0xffd4527c, ranges) (not found)
obp_proplen(0xffd4527c, intr) (not found)
obp_proplen(0xffd4527c, interrupts) (not found)
That's not good. obp_nextnode() should be giving you a pointer to a
valid node (I believe root), where it looks at properties.
Yes, that is actually what is happening in the trace above - 
obp_nextnode(0x0) means 0 is being passed in, and then 0xffd4527c is 
being returned as the handle.
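
To make the trace above easier to follow, here is a minimal C sketch of the same walk. It is a toy model, not OpenBIOS code: toy_node, toy_tree, toy_nextnode, toy_child and toy_has_prop are names invented for illustration, and the handles are small integers rather than real handles such as 0xffd4527c. The only behaviour taken from the trace is that passing 0 to nextnode yields the first (root) node, and that looking up reg, ranges, intr or interrupts on that node reports "not found".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy node; real firmware handles are opaque values. */
struct toy_node {
    int handle;            /* non-zero handle of this node */
    int peer;              /* handle of the next sibling, 0 if none */
    int child;             /* handle of the first child, 0 if none */
    const char *props[4];  /* NULL-terminated list of property names */
};

/* Invented tree: a root with two children. The root has no "reg". */
static const struct toy_node toy_tree[] = {
    { 1, 0, 2, { "name", "compatible", NULL } },
    { 2, 3, 0, { "name", "reg", NULL } },
    { 3, 0, 0, { "name", "reg", "interrupts", NULL } },
};

static const struct toy_node *toy_find(int handle)
{
    size_t i;

    for (i = 0; i < sizeof(toy_tree) / sizeof(toy_tree[0]); i++) {
        if (toy_tree[i].handle == handle)
            return &toy_tree[i];
    }
    return NULL;
}

/* nextnode(0) returns the root; otherwise the next sibling, or 0. */
int toy_nextnode(int handle)
{
    const struct toy_node *n;

    if (handle == 0)
        return toy_tree[0].handle;
    n = toy_find(handle);
    return n ? n->peer : 0;
}

/* Returns the first child of the node, or 0 if there is none. */
int toy_child(int handle)
{
    const struct toy_node *n = toy_find(handle);

    return n ? n->child : 0;
}

/* Returns 1 if the node has the named property, 0 for "not found". */
int toy_has_prop(int handle, const char *name)
{
    const struct toy_node *n = toy_find(handle);
    size_t i;

    if (n == NULL)
        return 0;
    for (i = 0; i < 4 && n->props[i] != NULL; i++) {
        if (strcmp(n->props[i], name) == 0)
            return 1;
    }
    return 0;
}
```

In this model toy_nextnode(0) hands back the root, and toy_has_prop() on the root fails for "reg", which mirrors the shape of the trace: a valid handle comes back, and the properties simply are not present on that node.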
...
The divide by zero is probably Solaris signalling an error; if things
are bad enough that it can't talk with the PROM (or doesn't trust it),
it does a divide by zero to blow up. In
usr/src/psm/promif/ieee1275/sun4/prom_init.c :
/*
 * Fatal promif internal error, not an external interface
 */
/*ARGSUSED*/
void
prom_fatal_error(const char *errormsg)
{
	volatile int zero = 0;
	volatile int i = 1;

	/*
	 * No prom interface, try to cause a trap by
	 * dividing by zero, leaving the message in %i0.
	 */
	i = i / zero;
	/*NOTREACHED*/
}
I don't think this has anything to do with the PIL 14 or 10 issues you
discuss later on.
Oh that's interesting. However I don't think that this is the case here 
for 2 reasons:
1) The backtraces definitely point to an issue with clock initialisation 
based upon the symbol names, and enabling the L14 timer does allow the 
previously trapping division to succeed, with a value between 0 and the 
counter limit.
2) The address where the trap is invoked is definitely outside the main 
kernel space by some margin, which makes me think that this is because 
it is coming from an external kernel module which is being dynamically 
loaded - otherwise if it were being caused by the above, I would expect 
the trap address to be within the main kernel image.
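
The check behind point 2 is just a range test on the trap address. A minimal sketch in C, assuming the kernel image occupies a single half-open range [start, end); the function name addr_in_image and any bounds passed to it are hypothetical and are not taken from the Solaris kernel:

```c
#include <stdint.h>

/*
 * Returns 1 if addr lies inside the half-open range [start, end),
 * 0 otherwise. The bounds are whatever the caller believes the main
 * kernel image occupies; nothing here is read from a real kernel.
 */
int addr_in_image(uint32_t start, uint32_t end, uint32_t addr)
{
    return addr >= start && addr < end;
}
```

A trap address for which this returns 0 against the main kernel image bounds is what suggests a dynamically loaded module rather than prom_fatal_error() in the kernel proper.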
ATB,
Mark.
-- 
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs