On 20/02/11 12:08, Mark Cave-Ayland wrote:
/iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000 (esp0): esp-options=0x46
ESP: read reg[5]: 0x00
esp0 at dma0: SBus slot 5 0x8800000 sparc ipl 4
esp0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000
obp_proplen(0xffd4fb20, pm-hardware-state) (not found)
obp_proplen(0xffd4fb20, pm-want-child-notification?) (not found)
obp_fortheval_v2(0 0 f024360c f59a88c8 47 ['] find-device catch if 2drop true else current-device device-end then swap l!)
ESP: write reg[4]: 0x00 -> 0x00
ESP: write reg[6]: 0x00 -> 0x00
ESP: write reg[7]: 0x00 -> 0x00
ESP: write reg[12]: 0x01 -> 0x01
ESP: write reg[8]: 0x17 -> 0x07
ESP: write reg[0]: 0x00 -> 0x07
ESP: write reg[1]: 0x00 -> 0x00
ESP: Raise enable
ESP: write reg[3]: 0x80 -> 0xc2
ESP: Select with ATN (c2)
ESP: get_cmd: len 7 target 0
ESP: Raise IRQ
qemu: fatal: Trap 0x29 while interrupts disabled, Error state
pc: f004127c  npc: f0041280
General Registers:
%g0-7: 00000000 f02441a0 04400fc1 00007000 f5af4e40 f0243b88 00000000 f0244020

Current Register Window:
%o0-7: ffff8000 00008000 00000f00 044000c0 f5948688 ffed7000 fbe3a4b8 f0041be4
%l0-7: 04400fc0 f0041c78 f0041c7c 00000001 0000010f 00000001 0000002a fbe39f78
%i0-7: ffff8000 00008000 00000f00 044000c1 00000002 ffed7000 fbe3a020 f0041be4

Floating Point Registers:
%f00: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f04: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f08: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f12: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f16: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f20: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f24: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f28: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
psr: 04000fc0 (icc: ---- SPE: SP-) wim: 00000001
fsr: 00080000 y: 00000000
Aborted
Looking at this further, it seems likely that this is happening because it is the first ESP DMA transfer invoked directly by the Solaris kernel. With some qemu hacking, I've managed to add some breakpoints so I can step through this particular section of code.
When the ESP IRQ is raised within qemu, we jump into the Solaris interrupt handler, which in turn leads into the _interrupt function. The fatal error occurs at this point:
build@zeno:~/src/openbios/openbios-git/openbios-devel$ sparc64-linux-gdb
GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=x86_64-unknown-linux-gnu --target=sparc64-linux".
(gdb) target remote :1234
Remote debugging using :1234
[New Thread 1]
0x00000000 in ?? ()
(gdb) file ../../ss5
A program is being debugged already.
Are you sure you want to change the file? (y or n) y
Reading symbols from /home/build/src/openbios/ss5...(no debugging symbols found)...done.
(gdb) cont
Continuing.
^C
Program received signal SIGINT, Interrupt.
0xf00401f0 in scb ()
(gdb) break *0xf0041c74
Breakpoint 1 at 0xf0041c74
(gdb) cont
Continuing.

Breakpoint 1, 0xf0041c74 in _interrupt ()
(gdb) info regi
g0             0x0          0
g1             0xf02441a0   -266059360
g2             0x44000c0    71303360
g3             0x7000       28672
g4             0xf5af4e40   -173060544
g5             0x0          0
g6             0x0          0
g7             0xf0244020   -266059744
o0             0xffff8000   -32768
o1             0x8000       32768
o2             0x0          0
o3             0x44000c0    71303360
o4             0x1          1
o5             0xffed7000   -1216512
sp             0xfbe3bfa0   0xfbe3bfa0
o7             0xf0041be4   -268166172
l0             0x4400fc0    71307200
l1             0xf5af91bc   -173043268
l2             0xf5af91c0   -173043264
l3             0xf00431c4   -268160572
l4             0xf          15
l5             0x44000c0    71303360
l6             0x1          1
l7             0xf0243058   -266063784
i0             0x1          1
i1             0x100        256
i2             0xc2         194
i3             0xf5906000   -175087616
i4             0x0          0
i5             0xf5948cc8   -174814008
fp             0xf0243100   0xf0243100
i7             0xf0042460   -268164000
y              0x0          0
psr            0x4400fc0    [ PS S #8 #9 #10 #11 #22 #26 ]
wim            0x8          8
tbr            0xf00401f0   -268172816
pc             0xf0041c74   0xf0041c74 <_interrupt+244>
npc            0xf0041c78   0xf0041c78 <_interrupt+248>
fsr            0x80000      [ #19 ]
csr            0x0          0
The offending piece of code just before the crash:
0xf0041c60 <_interrupt+224>:    ld  [ %g1 ], %g0
0xf0041c64 <_interrupt+228>:    sethi  %hi(0xf0244000), %g1
0xf0041c68 <_interrupt+232>:    or  %g1, 0x1a0, %g1    ! 0xf02441a0 <_int_vector>
0xf0041c6c <_interrupt+236>:    ld  [ %g1 + %l3 ], %l3
0xf0041c70 <_interrupt+240>:    wr  %g0, %l0, %psr
0xf0041c74 <_interrupt+244>:    wr  %l0, 0x20, %psr
                                ^^^^^^^^^^^^^^^^^^^ fatal instruction
(gdb) stepi
Remote connection closed
So it seems to be this setting of the PSR register which is causing the fatal trap.
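For reference, here is a rough sketch of what those two wr instructions put into the PSR. It was not part of the debugging session. It assumes the SPARC V8 PSR field layout and the V8 rule that WRPSR writes rs1 XOR operand2 (not operand2 alone). The helper names decode_psr and wr_psr are made up for illustration, and the %l0 value is the one shown by "info regi" at the breakpoint.

```python
# Sketch only: decode SPARC V8 PSR values and model the WRPSR XOR semantics.
# Field positions assume the SPARC V8 PSR layout; helper names are hypothetical.

PSR_FIELDS = {
    # name: (lowest bit, width in bits)
    "impl": (28, 4),
    "ver": (24, 4),
    "icc": (20, 4),   # N, Z, V, C condition codes
    "EC": (13, 1),    # coprocessor enable
    "EF": (12, 1),    # FPU enable
    "PIL": (8, 4),    # processor interrupt level
    "S": (7, 1),      # supervisor mode
    "PS": (6, 1),     # previous supervisor mode
    "ET": (5, 1),     # enable traps
    "CWP": (0, 5),    # current window pointer
}


def decode_psr(value):
    """Split a 32-bit PSR value into its named fields."""
    return {
        name: (value >> shift) & ((1 << width) - 1)
        for name, (shift, width) in PSR_FIELDS.items()
    }


def wr_psr(rs1, operand2):
    """Model 'wr rs1, operand2, %psr': the value written is rs1 XOR operand2."""
    return (rs1 ^ operand2) & 0xFFFFFFFF


L0 = 0x04400FC0  # %l0 at the breakpoint, from the gdb register dump

first = wr_psr(0, L0)       # wr %g0, %l0, %psr
second = wr_psr(L0, 0x20)   # wr %l0, 0x20, %psr  (the instruction that faults)

for label, value in (("after +240", first), ("after +244", second)):
    fields = decode_psr(value)
    print(f"{label}: psr={value:#010x} ET={fields['ET']} "
          f"PIL={fields['PIL']} S={fields['S']} PS={fields['PS']}")
```

Under those assumptions, the first write leaves ET clear, and the second writes 0x04400fe0, which differs only in that ET (bit 5) is now set while PIL stays at 15. In other words, the instruction that dies is the one that re-enables traps.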
ATB,
Mark.