Hi all,
With the previous commit in place, we now get here trying to boot my Solaris 8 ISO image under OpenBIOS:
Configuration device id QEMU version 1 machine id 32
CPUs: 1 x FMI,MB86904
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.0 built on Apr 5 2011 08:40
Type 'help' for detailed information

0 > boot cdrom:d -v
Not a bootable ELF image
Loading a.out image...
Loaded 7680 bytes
entry point is 0x4000
bootpath: /iommu/sbus/espdma/esp/sd@2,0:d

Jumping to entry point 00004000 for type 00000005...
switching to new context:
Size: 259040+54154+47486 Bytes
SunOS Release 5.8 Version Generic_108528-09 32-bit
Copyright 1983-2001 Sun Microsystems, Inc. All rights reserved.
Ethernet address = 52:54:0:12:34:56
Using default device instance data
vac: enabled in write through mode
mem = 131072K (0x8000000)
avail mem = 110419968
root nexus = SUNW,SPARCstation-5
iommu0 at root: obio 0x10000000
sbus0 at iommu0: obio 0x10001000
dma0 at sbus0: SBus slot 5 0x8400000
dma0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000
/iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000 (esp0): esp-options=0x46
esp0 at dma0: SBus slot 5 0x8800000 sparc ipl 4
esp0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000
sd2 at esp0: target 2 lun 0
sd2 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0
root on /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0:b fstype ufs
obio0 at root
obio0 at obio0: obio 0x100000, sparc ipl 12
zs0 is /obio/zs@0,100000
obio1 at obio0: obio 0x0, sparc ipl 12
zs1 is /obio/zs@0,0

(here we freeze)
According to Artyom's OBP output, the next line displayed should identify the CPU, but we never seem to get there. Looking at various symbols within the kernel, it appears that we spend a short time in some kind of CPU speed detection routine (a loop of several seconds with lots of get_hrtime*-type calls) before bailing out. Alas, kadb doesn't work at the moment, and since these calls are coming from a module, I don't have much in the way of symbol names to work with.
Talking of kadb, if I try and invoke it at the moment, I get this:
Configuration device id QEMU version 1 machine id 32
CPUs: 1 x FMI,MB86904
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.0 built on Apr 5 2011 08:40
Type 'help' for detailed information

0 > boot cdrom:d kadb -kdv
Not a bootable ELF image
Loading a.out image...
Loaded 7680 bytes
entry point is 0x4000
bootpath: /iommu/sbus/espdma/esp/sd@2,0:d

Jumping to entry point 00004000 for type 00000005...
switching to new context:
Size: 119204+222573+28987 Bytes
kadb: kadb: kernel/unix
Size: 259040+54154+47486 Bytes
/platform/SUNW,SPARCstation-5/kernel/unix loaded - 0x95000 bytes used
kadb[0]: :c
Begin traceback... sp = 12f9e0
Called from fbd01700, fp=12fa50, args=9 f0086444 f0086438 0 f0258f2c 0
Called from f00859e0, fp=12fad8, args=f0258fc0 44 12fbf8 0 f0258f2c f0258f70
Called from f0082518, fp=12fb38, args=f0258fc0 0 44 0 0 fbd039e4
Called from f0081c48, fp=12fb98, args=12fbf8 f0258de8 12fbf8 fbd58800 f0040000 f0240000
Called from f00819c0, fp=12fff8, args=130168 f0258f70 f0258ed8 f0258ed0 2 130168
Called from f007ff0c, fp=130058, args=f0258c00 f0258c00 fbd54860 130168 f0258d08 f
End traceback...
fault and calling cmd: trap 9 sp 12f9e0 pc f0086444 npc f0086438
stopped at:
kadb[0]:
I think trap #9 is "data access exception" (SPARC V8 trap type 0x09), but it only occurs when booting the kernel with kadb. Looks like this will need a bit more investigation.
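For reference when reading kadb fault lines like the one above, here is a small lookup of the low-numbered SPARC V8 trap types as named in the SPARC V8 architecture manual. In that table, trap type 0x09 is data_access_exception (data_access_error is a separate trap, 0x29). The array and function names are hypothetical helpers for illustration, not anything from kadb or OpenBIOS.

```c
#include <stddef.h>

/* Names of SPARC V8 trap types 0x00..0x0a, per the V8 architecture manual.
 * The identifiers below are hypothetical, chosen for this sketch only. */
static const char *const sparc_v8_trap_names[] = {
    [0x00] = "reset",
    [0x01] = "instruction_access_exception",
    [0x02] = "illegal_instruction",
    [0x03] = "privileged_instruction",
    [0x04] = "fp_disabled",
    [0x05] = "window_overflow",
    [0x06] = "window_underflow",
    [0x07] = "mem_address_not_aligned",
    [0x08] = "fp_exception",
    [0x09] = "data_access_exception",
    [0x0a] = "tag_overflow",
};

/* Return the name for a trap type, or "unknown" if it is outside the table. */
const char *sparc_v8_trap_name(unsigned int tt)
{
    size_t n = sizeof(sparc_v8_trap_names) / sizeof(sparc_v8_trap_names[0]);
    return tt < n ? sparc_v8_trap_names[tt] : "unknown";
}
```

Calling sparc_v8_trap_name(9) on the "trap 9" reported by kadb gives the architectural name of the fault.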
ATB,
Mark.
On 08/04/11 11:08, Mark Cave-Ayland wrote:
> Talking of kadb, if I try and invoke it at the moment, I get this:
> [...]
> Looks like this will need a bit more investigation.
As an aside, I managed to spend a few hours over the weekend trying to work out why kadb doesn't want to start the kernel. Fortunately enough, kadb is non-stripped which always makes debugging these kinds of things easier :)
It was fairly easy to isolate the call to obp_dumb_memalloc() for the region of memory in question, so I started tracing through the parent function, readfile, in kadb. Stepping through the code here showed a very interesting section whereby additional memory is added to the amount to be allocated for the kernel (and possibly modules) depending upon whether the variable use_align is set.
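To make the use_align behaviour a little more concrete, here is a minimal sketch of how a loader might enlarge an allocation request so that an aligned block can be carved out of it. This is only a guess at the general technique: the function names, the "size + align - 1" padding rule and the power-of-two restriction are assumptions, not what kadb's readfile actually does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: number of bytes to request so that a block of `size` bytes
 * starting on an `align` boundary is guaranteed to fit inside the result.
 * With use_align clear, the request is passed through unchanged. */
size_t padded_request(size_t size, size_t align, int use_align)
{
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two only */
    return use_align ? size + align - 1 : size;
}

/* Round an address up to the next multiple of `align` (a power of two). */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

For example, padded_request(0x95000, 0x1000, 1) asks for 0x95fff bytes, and align_up() then picks the aligned start address inside whatever the firmware returns.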
A quick watchpoint later showed that this is in fact controlled by the presence of a zero-length attribute called "aligned-allocator" under /openprom. Since this property appears in the sample prtconf output, then it suggests that this code path is triggered by OBP. Rather bizarrely, if I add this property to OpenBIOS then kadb crashes further down the line - BUT if I simply use gdb to make kadb think the property is present by changing the relevant register within gdb then I am able to load kadb?!
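The distinction that matters here is between a property that is absent and one that is present with zero length. A toy sketch of that check follows, assuming a "get property length" call that returns -1 when the property does not exist. The struct, the function names and the in-memory table are all hypothetical stand-ins for the real PROM interface.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the properties of one device node. */
struct toy_prop {
    const char *name;
    int len;            /* length of the value in bytes; 0 for an empty property */
};

/* Return the property's length, or -1 if the node has no such property
 * (assumed convention for this sketch). */
int toy_getproplen(const struct toy_prop *props, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(props[i].name, name) == 0)
            return props[i].len;
    }
    return -1;
}

/* A zero-length "aligned-allocator" property reports length 0, which is
 * still distinguishable from -1 (absent), so presence alone sets the flag. */
int toy_use_align(const struct toy_prop *props, size_t n)
{
    return toy_getproplen(props, n, "aligned-allocator") >= 0;
}
```

A node whose table contains {"aligned-allocator", 0} yields 1 from toy_use_align(), while a node without that entry yields 0.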
I think I'm probably about 90% there when it comes to getting kadb up and running under OpenBIOS, but I still need to figure out why just adding the missing property isn't enough here.
ATB,
Mark.
On 18/04/11 17:00, Mark Cave-Ayland wrote:
> A quick watchpoint later showed that this is in fact controlled by the presence of a zero-length attribute called "aligned-allocator" under /openprom. Since this property appears in the sample prtconf output, then it suggests that this code path is triggered by OBP. Rather bizarrely, if I add this property to OpenBIOS then kadb crashes further down the line - BUT if I simply use gdb to make kadb think the property is present by changing the relevant register within gdb then I am able to load kadb?!
>
> I think I'm probably about 90% there when it comes to getting kadb up and running under OpenBIOS, but I still need to figure out why just adding the missing property isn't enough here.
Okay - so it looks like I win the prize for discovering a new romvec entry point :) When the /openprom/aligned-allocator property exists with zero length, kadb somehow manages to get ufsboot to jump to a different offset within the romvec parameter block. Tracing the offset through gdb shows that it is trying to jump to the function pointer at this location in openprom.h:
int filler[15];
Hmmm. Since this was set to NULL, it was causing kadb to trap when trying to allocate memory from OpenBIOS. Anyhow the parameters look pretty much like obp_dumb_memalloc() except with an extra parameter probably giving the alignment. So with a quick patch I can now fire up kadb like this:
Configuration device id QEMU version 1 machine id 32
CPUs: 1 x FMI,MB86904
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.0 built on Apr 18 2011 21:08
Type 'help' for detailed information

0 > boot cdrom:d kadb -kvd
Not a bootable ELF image
Loading a.out image...
Loaded 7680 bytes
entry point is 0x4000
bootpath: /iommu/sbus/espdma/esp/sd@2,0:d

Jumping to entry point 00004000 for type 00000005...
switching to new context:
Size: 119204+222573+28987 Bytes
kadb: kadb: kernel/unix
Size: 259040+54154+47486 Bytes
/platform/SUNW,SPARCstation-5/kernel/unix loaded - 0x95000 bytes used
kadb[0]: :c
stopped at:
scb: sethi %hi(0xf0041000), %l3
kadb[0]: :c
SunOS Release 5.8 Version Generic_108528-09 32-bit
Copyright 1983-2001 Sun Microsystems, Inc. All rights reserved.
\
Alas unfortunately I have no idea what this new mystery function is called, so if anyone can come up with a more suitable name please let me know.
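For illustration, the entry point described above could be sketched roughly as follows: one slot of the former filler[15] becomes a function pointer that takes the same arguments as obp_dumb_memalloc() plus an alignment. Everything named here (toy_romvec_tail, aligned_memalloc, toy_aligned_memalloc, the static arena) is hypothetical, the position of the slot within the filler is not stated in this thread (it is placed first purely for the example), and the bump allocator is a stand-in rather than the actual OpenBIOS patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tail of a romvec-like block: one of the 15 filler words is
 * repurposed as a function pointer, leaving 14. On 32-bit SPARC an int and
 * a pointer are the same size, so the later offsets would not move. */
struct toy_romvec_tail {
    char *(*aligned_memalloc)(char *va, unsigned int size, unsigned int align);
    int filler[14];
};

/* Toy backing store; a real firmware would hand out physical/virtual memory. */
static unsigned char arena[4096];
static size_t arena_used;

/* Bump allocator that returns `size` bytes starting on an `align` boundary,
 * or NULL if the arena is exhausted. The `va` hint is ignored in this sketch. */
char *toy_aligned_memalloc(char *va, unsigned int size, unsigned int align)
{
    (void)va;
    if (align == 0)
        align = 1;
    assert((align & (align - 1)) == 0); /* power of two only */

    uintptr_t base = (uintptr_t)arena + arena_used;
    uintptr_t start = (base + align - 1) & ~((uintptr_t)align - 1);
    size_t end = (size_t)(start - (uintptr_t)arena) + size;

    if (end > sizeof(arena))
        return NULL;
    arena_used = end;
    return (char *)start;
}
```

With the pointer left as NULL, a caller jumping through this slot traps, which matches the failure seen before the patch; once it points at an allocator like the one above, the call returns an aligned block.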
ATB,
Mark.
On Tue, Apr 19, 2011 at 12:15 AM, Mark Cave-Ayland <mark.cave-ayland@siriusit.co.uk> wrote:
> Okay - so it looks like I win the prize for discovering a new romvec entry point :) When the /openprom/aligned-allocator property exists with zero length, kadb somehow manages to get ufsboot to jump to a different offset within the romvec parameter block. Tracing the offset through gdb shows that it is trying to jump to the function pointer at this location in openprom.h:
>
> int filler[15];
>
> Hmmm. Since this was set to NULL, it was causing kadb to trap when trying to allocate memory from OpenBIOS. Anyhow the parameters look pretty much like obp_dumb_memalloc() except with an extra parameter probably giving the alignment. So with a quick patch I can now fire up kadb like this:
Great Job!
> Alas unfortunately I have no idea what this new mystery function is called, so if anyone can come up with a more suitable name please let me know.
Have you looked how the calling function is named? Maybe this could give a hint.
On 19/04/11 07:59, Artyom Tarasenko wrote:
>> Hmmm. Since this was set to NULL, it was causing kadb to trap when trying to allocate memory from OpenBIOS. Anyhow the parameters look pretty much like obp_dumb_memalloc() except with an extra parameter probably giving the alignment. So with a quick patch I can now fire up kadb like this:
>
> Great Job!
Thanks :) I thought you may be interested as I suspect your kadb-fu is a lot stronger than mine. Having played around with kadb a short time, I've already decided that it makes gdb look like a friendly GUI application in comparison :O
>> Alas unfortunately I have no idea what this new mystery function is called, so if anyone can come up with a more suitable name please let me know.
>
> Have you looked how the calling function is named? Maybe this could give a hint.
Alas kadb seems to jump into ufsboot in order to do the allocation (there are some string references to just "alloc"), and ufsboot is a stripped binary - presumably for space reasons.
ATB,
Mark.
On Tue, Apr 19, 2011 at 11:25 AM, Mark Cave-Ayland <mark.cave-ayland@siriusit.co.uk> wrote:
> Thanks :) I thought you may be interested as I suspect your kadb-fu is a lot stronger than mine. Having played around with kadb a short time, I've already decided that it makes gdb look like a friendly GUI application in comparison :O
Indeed gdb is friendly (taking into account the program sizes it's no wonder), but is there a way to make it load the symbols for the kernel _and_ the modules? I would switch to gdb then.
On 21/04/11 13:46, Artyom Tarasenko wrote:
>> Thanks :) I thought you may be interested as I suspect your kadb-fu is a lot stronger than mine. Having played around with kadb a short time, I've already decided that it makes gdb look like a friendly GUI application in comparison :O
>
> Indeed gdb is friendly (taking into account the program sizes it's no wonder), but is there a way to make it load the symbols for the kernel _and_ the modules? I would switch to gdb then.
The part I'm playing with at the moment is setting deferred breakpoints, e.g.:
zs#zsa_init:b
That breaks, but I struggle with the basic commands to do anything. Perhaps I could really cheat by adding a native SPARC gdb executable into my Solaris 8 ISO image and then booting that instead of kadb? ;)
ATB,
Mark.
On Thu, Apr 21, 2011 at 6:46 PM, Mark Cave-Ayland <mark.cave-ayland@siriusit.co.uk> wrote:
> The part I'm playing with at the moment is setting deferred breakpoints, e.g.:
>
> zs#zsa_init:b
Thought you were past that point?
> That breaks, but I struggle with the basic commands to do anything. Perhaps I could really cheat by adding a native SPARC gdb executable into my Solaris 8 ISO image and then booting that instead of kadb? ;)
You'd have to develop a Solaris kgdb for that ;-).