On 02/01/11 09:53, Blue Swirl wrote:
>> Based upon this, it would seem that the Solaris kernel allocates a stack with a base of 0xf0240000 for saving state when a trap is taken, but for some reason we keep stacking until we go beyond the memory region allocated for it. I suspect that this is a side effect of a property/device not being set up correctly, but I'm not yet sure which one. Anyhow, I thought I'd post the results of my investigations so far in case anyone else has any ideas as to what could cause this.
> The kernel stack has overflowed. Perhaps some recursive loop (iterating over the device tree, since this doesn't happen on real hardware?) never exits, or maybe OpenBIOS consumes much more kernel stack than OBP.
Yeah, that's the conclusion I came to, although I'm not familiar enough with the overall Solaris boot process to figure out what should happen as opposed to what does happen.
Idly adding a few breakpoints in obp_fortheval_v2(), a couple of interesting things stand out:
Breakpoint 1, obp_fortheval_v2 (str=0x11a268 "h# f800.0000 rmap@ swap ! ", arg0=1247220, arg1=0, arg2=0, arg3=0, arg4=0) at ../arch/sparc32/romvec.c:428
Firstly, we don't do full region/segment mapping in OpenBIOS (we only have a single level of PTEs), but I'm not sure this is relevant here.
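For reference, here is a rough sketch of how a virtual address splits into region/segment/page indices, assuming the three-level table layout of the SPARC V8 reference MMU. The helper name srmmu_split and its struct are made up for illustration and are not OpenBIOS code. I'm also assuming that rmap@ reads the region (level-1) entry for the address it is given.

```c
#include <stdint.h>

/* Index fields of a 32-bit virtual address, assuming the SPARC V8
 * reference MMU split (8 + 6 + 6 index bits, 12 offset bits). */
struct srmmu_index {
    unsigned region;   /* level-1 index, VA bits 31..24 (16 MB each)  */
    unsigned segment;  /* level-2 index, VA bits 23..18 (256 KB each) */
    unsigned page;     /* level-3 index, VA bits 17..12 (4 KB each)   */
    unsigned offset;   /* byte offset within the page, VA bits 11..0  */
};

/* Hypothetical helper for illustration only; not part of OpenBIOS. */
static struct srmmu_index srmmu_split(uint32_t va)
{
    struct srmmu_index ix;

    ix.region  = (va >> 24) & 0xff;
    ix.segment = (va >> 18) & 0x3f;
    ix.page    = (va >> 12) & 0x3f;
    ix.offset  = va & 0xfff;
    return ix;
}
```

With this split, srmmu_split(0xf8000000) gives region index 0xf8 with segment and page indices 0, and the trap stack base 0xf0240000 gives region 0xf0, segment 9, page 0.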
Secondly, the majority of the calls to obp_devopen() pass an argument string like this: "/iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0:d". However, towards the end it changes over to this: "/iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0:b". Maybe the kernel can't find something in the d slice and so switches to the b slice as a backup, at which point it runs out of stack space because this fallback path isn't supposed to be taken?
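To make the d-to-b switch easier to spot when logging these calls, something like the following could pull the slice letter out of each path. This is only a sketch: devpath_slice is a hypothetical helper, not an existing OpenBIOS function, and it assumes the first character after the ':' in the final path component is the slice letter.

```c
#include <string.h>

/* Hypothetical helper for illustration only; not part of OpenBIOS.
 * Returns the first character after the ':' in the last path
 * component, or '\0' if that component has no ':' argument. */
static char devpath_slice(const char *path)
{
    const char *node = strrchr(path, '/');  /* start of last component */
    const char *colon;

    if (node == NULL)
        node = path;                        /* no '/' at all: use whole string */

    colon = strchr(node, ':');
    return (colon != NULL) ? colon[1] : '\0';
}
```

Called on the two strings above, this returns 'd' for the first and 'b' for the second, so a logging breakpoint or a temporary printk in obp_devopen() could flag the point where the letter changes.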
ATB,
Mark.