Hi all,
So with the previous patch to change the initial timer values applied to SVN trunk, we now get much further on a Solaris 8 boot:
Configuration device id QEMU version 1 machine id 32
CPUs: 1 x FMI,MB86904
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.0 built on Feb 17 2011 14:56
Type 'help' for detailed information

0 > boot cdrom:d -v
Not a bootable ELF image
Loading a.out image...
Loaded 7680 bytes
entry point is 0x4000
bootpath: /iommu/sbus/espdma/esp/sd@2,0:d

Jumping to entry point 00004000 for type 00000005...
switching to new context:
Size: 259040+54154+47486 Bytes
SunOS Release 5.8 Version Generic_108528-09 32-bit
Copyright 1983-2001 Sun Microsystems, Inc. All rights reserved.
Ethernet address = 52:54:0:12:34:56
Using default device instance data
vac: enabled in write through mode
mem = 131072K (0x8000000)
avail mem = 110419968
root nexus = SUNW,SPARCstation-5
iommu0 at root: obio 0x10000000
sbus0 at iommu0: obio 0x10001000
dma0 at sbus0: SBus slot 5 0x8400000
dma0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000
/iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000 (esp0): esp-options=0x46
esp0 at dma0: SBus slot 5 0x8800000 sparc ipl 4
esp0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000
qemu: fatal: Trap 0x29 while interrupts disabled, Error state
pc: f004127c npc: f0041280
General Registers:
%g0-7: 00000000 f02441a0 04400fc1 00007000 f5af4e40 f0243b88 00000000 f0244020

Current Register Window:
%o0-7: ffff8000 00008000 00000f00 044000c0 f5948688 ffebc000 fbe3a4b8 f0041be4
%l0-7: 04400fc0 f0041c78 f0041c7c 00000001 0000010f 00000001 0000002a fbe39f78
%i0-7: ffff8000 00008000 00000f00 044000c1 00000002 ffebc000 fbe3a020 f0041be4

Floating Point Registers:
%f00: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f04: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f08: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f12: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f16: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f20: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f24: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f28: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
psr: 04000fc0 (icc: ---- SPE: SP-) wim: 00000001
fsr: 00080000 y: 00000000
Aborted
From Artyom's OBP output, we can see that the next few lines that should appear on the console look like this:
sd2 at esp0: target 2 lun 0
sd2 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0
root on /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd@2,0:b
Therefore the crash is being caused by Solaris either trying to access the esp device or doing some kind of enumeration on the ESP bus.
ATB,
Mark.
On Thu, Feb 17, 2011 at 5:10 PM, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
Therefore the crash is being caused by Solaris either trying to access the esp device or doing some kind of enumeration on the ESP bus.
ESP 'reg' property is not correct: reg: 00000005.08800000.00000040
reg 00000005 08800000 00000010
Could that cause the fault?
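For anyone comparing the two values by hand, a 'reg' property on this bus is a run of big-endian 32-bit cells. The sketch below splits one into triples; the byte string is the one obp_getprop returns for the esp node in the client-interface log later in this thread, while the function name decode_reg and the labels slot/offset/size are illustrative assumptions rather than anything from OpenBIOS or Solaris.

```python
import struct


def decode_reg(raw: bytes):
    """Split a 'reg' property into (slot, offset, size) triples.

    Each cell is a big-endian 32-bit integer; three cells make one entry.
    """
    if len(raw) % 12 != 0:
        raise ValueError("expected a multiple of three 4-byte cells")
    return [struct.unpack(">III", raw[i:i + 12]) for i in range(0, len(raw), 12)]


# Bytes as printed by obp_getprop(..., reg) for the esp node.
esp_reg = bytes.fromhex("00 00 00 05 08 80 00 00 00 00 00 10")

for slot, offset, size in decode_reg(esp_reg):
    print(f"slot {slot:#x} offset {offset:#010x} size {size:#x}")
```

The only field that differs between the two 'reg' values quoted above is the last cell, i.e. the size.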
On 2011-2-17 3:55 PM, Blue Swirl wrote:
ESP 'reg' property is not correct: reg: 00000005.08800000.00000040
reg 00000005 08800000 00000010
Could that cause the fault?
Not easily. On a SPARC, registers are mapped on a page basis, and the PCI probe code is supposed to take the greater of the "reg" size and the actual BAR size for PCI address allocation. If OpenBIOS trusts the reg property to the exclusion of the actual BAR size (allowing BARs to overlap each other), that could certainly mess things up, but I wouldn't expect it to produce the error below:
qemu: fatal: Trap 0x29 while interrupts disabled, Error state
Trap 0x29 is "Internal processor error". It would be instructive to see what's at f004127c (the trap pc), since that looks like it should be in the range of the PROM code, not Solaris. Alternatively, it might be interesting to see what happens by printing out the client-interface calls, see which call we're tripping over.
On 17/02/11 21:03, Tarl Neustaedter wrote:
qemu: fatal: Trap 0x29 while interrupts disabled, Error state
Trap 0x29 is "Internal processor error". It would be instructive to see what's at f004127c (the trap pc), since that looks like it should be in the range of the PROM code, not Solaris. Alternatively, it might be interesting to see what happens by printing out the client-interface calls, see which call we're tripping over.
Well I turned on some logging in OpenBIOS but couldn't see any ESP accesses during the SCSI bus probe. However, I tried enabling ESP debugging in qemu and that does show ESP accesses, implying that Solaris is now accessing the SCSI bus itself rather than through OpenBIOS.
The relevant section of the log looks like this:
vac: enabled in write through mode mem = 131072K (0x8000000) avail mem = 110419968 obp_nextnode(0x0) = 0xffd4527c obp_proplen(0xffd4527c, reg) (not found) obp_proplen(0xffd4527c, ranges) (not found) obp_proplen(0xffd4527c, intr) (not found) obp_proplen(0xffd4527c, interrupts) (not found) obp_proplen(0xffd4527c, ttymodes) (not found) obp_proplen(0xffd4527c, device_type) (not found) root nexus = SUNW,SPARCstation-5 obp_proplen(0xffd4527c, pm-hardware-state) (not found) obp_proplen(0xffd4527c, pm-want-child-notification?) (not found) obp_proplen(0xffd4527c, pm-components) (not found) obp_proplen(0xffd455e4, reg) (not found) obp_proplen(0xffd455e4, ranges) (not found) obp_proplen(0xffd455e4, intr) (not found) obp_proplen(0xffd455e4, interrupts) (not found) obp_proplen(0xffd455e4, reg) (not found) obp_proplen(0xffd455e4, ranges) (not found) obp_proplen(0xffd455e4, intr) (not found) obp_proplen(0xffd455e4, interrupts) (not found) obp_proplen(0xffd455e4, device_type) (not found) obp_proplen(0xffd455e4, pm-hardware-state) (not found) obp_proplen(0xffd455e4, pm-want-child-notification?) (not found) obp_proplen(0xffd455e4, pm-components) (not found) obp_fortheval_v2(0 0 f024360c f59a88c8 11 ['] find-device catch if 2drop true else current-device device-end then swap l!) obp_proplen(0xffd4caa4, reg) = 12 obp_proplen(0xffd4caa4, reg) = 12 obp_getprop(0xffd4caa4, reg) = 00 00 00 00 10 00 00 00 00 00 03 00 obp_proplen(0xffd4caa4, ranges) (not found) obp_proplen(0xffd4caa4, intr) (not found) obp_proplen(0xffd4caa4, interrupts) (not found) obp_proplen(0xffd4caa4, device_type) (not found) iommu0 at root: obio 0x10000000 obp_proplen(0xffd4caa4, pm-want-child-notification?) (not found) obp_proplen(0xffd4caa4, pm-components) (not found) obp_fortheval_v2(0 0 f024360c f59a88c8 21 ['] find-device catch if 2drop true else current-device device-end then swap l!) 
obp_proplen(0xffd4cc14, reg) = 12 obp_proplen(0xffd4cc14, reg) = 12 obp_getprop(0xffd4cc14, reg) = 00 00 00 00 10 00 10 00 00 00 00 28 obp_proplen(0xffd4cc14, ranges) = 120 obp_proplen(0xffd4cc14, ranges) = 120 obp_getprop(0xffd4cc14, ranges) = 00 00 00 00 00 00 00 00 00 00 00 00 20 00 00 00 10 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 30 00 00 00 10 00 00 00 00 00 00 02 00 00 00 00 00 00 00 00 40 00 00 00 10 00 00 00 00 00 00 03 00 00 00 00 00 00 00 00 50 00 00 00 10 00 00 00 00 00 00 04 00 00 00 00 00 00 00 00 60 00 00 00 10 00 00 00 00 00 00 05 00 00 00 00 00 00 00 00 70 00 00 00 10 00 00 00 obp_proplen(0xffd4cc14, intr) (not found) obp_proplen(0xffd4cc14, interrupts) (not found) obp_proplen(0xffd4cc14, device_type) = 13 obp_proplen(0xffd4cc14, device_type) = 13 obp_getprop(0xffd4cc14, device_type) = hierarchical sbus0 at iommu0: obio 0x10001000 obp_proplen(0xffd4cc14, burst-sizes) = 4 obp_proplen(0xffd4cc14, burst-sizes) = 4 obp_getprop(0xffd4cc14, burst-sizes) = 0000003f obp_proplen(0xffd4cc14, pm-hardware-state) (not found) obp_proplen(0xffd4cc14, pm-want-child-notification?) (not found) obp_proplen(0xffd4cc14, pm-components) (not found) obp_fortheval_v2(0 0 f024360c f59a88c8 32 ['] find-device catch if 2drop true else current-device device-end then swap l!) obp_proplen(0xffd4cfec, reg) = 12 obp_proplen(0xffd4cfec, reg) = 12 obp_getprop(0xffd4cfec, reg) = 00 00 00 05 08 40 00 00 00 00 00 10 obp_proplen(0xffd4cfec, ranges) (not found) obp_proplen(0xffd4cfec, intr) (not found) obp_proplen(0xffd4cfec, interrupts) (not found) obp_proplen(0xffd4cfec, device_type) (not found) dma0 at sbus0: SBus slot 5 0x8400000 dma0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000 obp_proplen(0xffd4cfec, pm-hardware-state) (not found) obp_proplen(0xffd4cfec, pm-want-child-notification?) (not found) obp_proplen(0xffd4cfec, pm-components) (not found) obp_fortheval_v2(0 0 f024360c f59a88c8 40 ['] find-device catch if 2drop true else current-device device-end then swap l!) 
obp_proplen(0xffd4fb20, reg) = 12 obp_proplen(0xffd4fb20, reg) = 12 obp_getprop(0xffd4fb20, reg) = 00 00 00 05 08 80 00 00 00 00 00 10 obp_proplen(0xffd4fb20, ranges) (not found) obp_proplen(0xffd4fb20, intr) = 8 obp_proplen(0xffd4fb20, intr) = 8 obp_getprop(0xffd4fb20, intr) = 00 00 00 24 00 00 00 00 obp_proplen(0xffd4fb20, interrupts) (not found) obp_proplen(0xffd4fb20, device_type) = 5 obp_proplen(0xffd4fb20, device_type) = 5 obp_getprop(0xffd4fb20, device_type) = scsi obp_proplen(0xffd4cc14, slave-only) (not found) obp_proplen(0xffd4caa4, slave-only) (not found) obp_proplen(0xffd4527c, slave-only) (not found) obp_proplen(0xffd455e4, slave-only) (not found) obp_proplen(0xffd4fb20, initiator-id) (not found) obp_proplen(0xffd4cfec, initiator-id) (not found) obp_proplen(0xffd4cc14, initiator-id) (not found) obp_proplen(0xffd4caa4, initiator-id) (not found) obp_proplen(0xffd4527c, initiator-id) (not found) obp_proplen(0xffd455e4, initiator-id) (not found) obp_proplen(0xffd4fb20, scsi-initiator-id) (not found) obp_proplen(0xffd4cfec, scsi-initiator-id) (not found) obp_proplen(0xffd4cc14, scsi-initiator-id) (not found) obp_proplen(0xffd4caa4, scsi-initiator-id) (not found) obp_proplen(0xffd4527c, scsi-initiator-id) (not found) obp_proplen(0xffd455e4, scsi-initiator-id) (not found) obp_proplen(0xffd4fb20, scsi-reset-delay) (not found) obp_proplen(0xffd4cfec, scsi-reset-delay) (not found) obp_proplen(0xffd4cc14, scsi-reset-delay) (not found) obp_proplen(0xffd4caa4, scsi-reset-delay) (not found) obp_proplen(0xffd4527c, scsi-reset-delay) (not found) obp_proplen(0xffd455e4, scsi-reset-delay) (not found) obp_proplen(0xffd4fb20, scsi-tag-age-limit) (not found) obp_proplen(0xffd4cfec, scsi-tag-age-limit) (not found) obp_proplen(0xffd4cc14, scsi-tag-age-limit) (not found) obp_proplen(0xffd4caa4, scsi-tag-age-limit) (not found) obp_proplen(0xffd4527c, scsi-tag-age-limit) (not found) obp_proplen(0xffd455e4, scsi-tag-age-limit) (not found) obp_proplen(0xffd4fb20, 
scsi-watchdog-tick) (not found) obp_proplen(0xffd4cfec, scsi-watchdog-tick) (not found) obp_proplen(0xffd4cc14, scsi-watchdog-tick) (not found) obp_proplen(0xffd4caa4, scsi-watchdog-tick) (not found) obp_proplen(0xffd4527c, scsi-watchdog-tick) (not found) obp_proplen(0xffd455e4, scsi-watchdog-tick) (not found) obp_proplen(0xffd4fb20, scsi-options) (not found) obp_proplen(0xffd4cfec, scsi-options) (not found) obp_proplen(0xffd4cc14, scsi-options) (not found) obp_proplen(0xffd4caa4, scsi-options) (not found) obp_proplen(0xffd4527c, scsi-options) (not found) obp_proplen(0xffd455e4, scsi-options) (not found) obp_proplen(0xffd4fb20, scsi-selection-timeout) (not found) obp_proplen(0xffd4cfec, scsi-selection-timeout) (not found) obp_proplen(0xffd4cc14, scsi-selection-timeout) (not found) obp_proplen(0xffd4caa4, scsi-selection-timeout) (not found) obp_proplen(0xffd4527c, scsi-selection-timeout) (not found) obp_proplen(0xffd455e4, scsi-selection-timeout) (not found) obp_proplen(0xffd4fb20, target0-scsi-options) (not found) obp_proplen(0xffd4cfec, target0-scsi-options) (not found) obp_proplen(0xffd4cc14, target0-scsi-options) (not found) obp_proplen(0xffd4caa4, target0-scsi-options) (not found) obp_proplen(0xffd4527c, target0-scsi-options) (not found) obp_proplen(0xffd455e4, target0-scsi-options) (not found) obp_proplen(0xffd4fb20, target1-scsi-options) (not found) obp_proplen(0xffd4cfec, target1-scsi-options) (not found) obp_proplen(0xffd4cc14, target1-scsi-options) (not found) obp_proplen(0xffd4caa4, target1-scsi-options) (not found) obp_proplen(0xffd4527c, target1-scsi-options) (not found) obp_proplen(0xffd455e4, target1-scsi-options) (not found) obp_proplen(0xffd4fb20, target2-scsi-options) (not found) obp_proplen(0xffd4cfec, target2-scsi-options) (not found) obp_proplen(0xffd4cc14, target2-scsi-options) (not found) obp_proplen(0xffd4caa4, target2-scsi-options) (not found) obp_proplen(0xffd4527c, target2-scsi-options) (not found) obp_proplen(0xffd455e4, 
target2-scsi-options) (not found) obp_proplen(0xffd4fb20, target3-scsi-options) (not found) obp_proplen(0xffd4cfec, target3-scsi-options) (not found) obp_proplen(0xffd4cc14, target3-scsi-options) (not found) obp_proplen(0xffd4caa4, target3-scsi-options) (not found) obp_proplen(0xffd4527c, target3-scsi-options) (not found) obp_proplen(0xffd455e4, target3-scsi-options) (not found) obp_proplen(0xffd4fb20, target4-scsi-options) (not found) obp_proplen(0xffd4cfec, target4-scsi-options) (not found) obp_proplen(0xffd4cc14, target4-scsi-options) (not found) obp_proplen(0xffd4caa4, target4-scsi-options) (not found) obp_proplen(0xffd4527c, target4-scsi-options) (not found) obp_proplen(0xffd455e4, target4-scsi-options) (not found) obp_proplen(0xffd4fb20, target5-scsi-options) (not found) obp_proplen(0xffd4cfec, target5-scsi-options) (not found) obp_proplen(0xffd4cc14, target5-scsi-options) (not found) obp_proplen(0xffd4caa4, target5-scsi-options) (not found) obp_proplen(0xffd4527c, target5-scsi-options) (not found) obp_proplen(0xffd455e4, target5-scsi-options) (not found) obp_proplen(0xffd4fb20, target6-scsi-options) (not found) obp_proplen(0xffd4cfec, target6-scsi-options) (not found) obp_proplen(0xffd4cc14, target6-scsi-options) (not found) obp_proplen(0xffd4caa4, target6-scsi-options) (not found) obp_proplen(0xffd4527c, target6-scsi-options) (not found) obp_proplen(0xffd455e4, target6-scsi-options) (not found) obp_proplen(0xffd4fb20, target7-scsi-options) (not found) obp_proplen(0xffd4cfec, target7-scsi-options) (not found) obp_proplen(0xffd4cc14, target7-scsi-options) (not found) obp_proplen(0xffd4caa4, target7-scsi-options) (not found) obp_proplen(0xffd4527c, target7-scsi-options) (not found) obp_proplen(0xffd455e4, target7-scsi-options) (not found) obp_proplen(0xffd4fb20, clock-frequency) = 4 obp_proplen(0xffd4fb20, clock-frequency) = 4 obp_getprop(0xffd4fb20, clock-frequency) = 02625a00 ESP: write reg[11]: 0x00 -> 0x00 ESP: write reg[11]: 0x00 -> 0x0a ESP: read 
reg[11]: 0x0a ESP: write reg[12]: 0x00 -> 0x00 ESP: write reg[12]: 0x00 -> 0x05 ESP: read reg[12]: 0x05 ESP: write reg[11]: 0x0a -> 0x08 ESP: write reg[12]: 0x05 -> 0x00 ESP: write reg[3]: 0x90 -> 0x03 ESP: Bus reset (03) ESP: Raise IRQ ESP: Lower enable ESP: write reg[3]: 0x00 -> 0x02 ESP: Chip reset (02) ESP: write reg[3]: 0x02 -> 0x80 ESP: NOP (80) ESP: write reg[3]: 0x80 -> 0x80 ESP: NOP (80) ESP: write reg[9]: 0x00 -> 0x00 ESP: write reg[5]: 0x00 -> 0xa3 ESP: write reg[6]: 0x00 -> 0x00 ESP: write reg[7]: 0x00 -> 0x00 ESP: read reg[14]: 0x04 ESP: read reg[14]: 0x04 ESP: write reg[8]: 0x00 -> 0x17 ESP: write reg[12]: 0x00 -> 0x01 ESP: write reg[11]: 0x00 -> 0x08 obp_proplen(0xffd4fb20, esp-options) (not found) obp_proplen(0xffd4cfec, esp-options) (not found) obp_proplen(0xffd4cc14, esp-options) (not found) obp_proplen(0xffd4caa4, esp-options) (not found) obp_proplen(0xffd4527c, esp-options) (not found) obp_proplen(0xffd455e4, esp-options) (not found) /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000 (esp0): esp-options=0x46 ESP: read reg[5]: 0x00 esp0 at dma0: SBus slot 5 0x8800000 sparc ipl 4 esp0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000 obp_proplen(0xffd4fb20, pm-hardware-state) (not found) obp_proplen(0xffd4fb20, pm-want-child-notification?) (not found) obp_fortheval_v2(0 0 f024360c f59a88c8 47 ['] find-device catch if 2drop true else current-device device-end then swap l!) ESP: write reg[4]: 0x00 -> 0x00 ESP: write reg[6]: 0x00 -> 0x00 ESP: write reg[7]: 0x00 -> 0x00 ESP: write reg[12]: 0x01 -> 0x01 ESP: write reg[8]: 0x17 -> 0x07 ESP: write reg[0]: 0x00 -> 0x07 ESP: write reg[1]: 0x00 -> 0x00 ESP: Raise enable ESP: write reg[3]: 0x80 -> 0xc2 ESP: Select with ATN (c2) ESP: get_cmd: len 7 target 0 ESP: Raise IRQ qemu: fatal: Trap 0x29 while interrupts disabled, Error state pc: f004127c npc: f0041280 General Registers: %g0-7: 00000000 f02441a0 04400fc1 00007000 f5af4e40 f0243b88 00000000 f0244020
Current Register Window:
%o0-7: ffff8000 00008000 00000f00 044000c0 f5948688 ffed7000 fbe3a4b8 f0041be4
%l0-7: 04400fc0 f0041c78 f0041c7c 00000001 0000010f 00000001 0000002a fbe39f78
%i0-7: ffff8000 00008000 00000f00 044000c1 00000002 ffed7000 fbe3a020 f0041be4

Floating Point Registers:
%f00: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f04: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f08: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f12: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f16: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f20: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f24: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
%f28: 000000000.000000 000000000.000000 000000000.000000 000000000.000000
psr: 04000fc0 (icc: ---- SPE: SP-) wim: 00000001
fsr: 00080000 y: 00000000
Aborted
In terms of the Solaris code, the trap appears to be coming from an interrupt routine which makes me wonder if OpenBIOS perhaps isn't leaving the ESP in a tidy state, so that when Solaris finally takes over the bus it causes a crash.
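As a reading aid for the "write reg[3]" lines in the log above, here is a small sketch that decodes the command bytes. The command names are copied from the qemu debug output itself; treating bit 7 as the DMA flag is an assumption based on the usual NCR53C9x convention, and decode_esp_command is a made-up helper, not a qemu function.

```python
CMD_DMA = 0x80  # assumption: top bit of the command byte requests a DMA transfer

# Names as they appear in the qemu ESP debug output quoted in this thread.
COMMANDS = {
    0x00: "NOP",
    0x02: "Chip reset",
    0x03: "Bus reset",
    0x42: "Select with ATN",
}


def decode_esp_command(value: int):
    """Return (command name, dma flag) for a byte written to the command register."""
    name = COMMANDS.get(value & 0x7F, "unknown")
    return name, bool(value & CMD_DMA)


for value in (0x03, 0x02, 0x80, 0xC2):
    name, dma = decode_esp_command(value)
    print(f"{value:#04x}: {name}{' (DMA)' if dma else ''}")
```

Read this way, the final 0xc2 write is the first command in the log that both selects a target and asks for DMA, which fits the crash appearing right after it.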
ATB,
Mark.
On 20/02/11 12:08, Mark Cave-Ayland wrote:
ESP: write reg[3]: 0x80 -> 0xc2
ESP: Select with ATN (c2)
ESP: get_cmd: len 7 target 0
ESP: Raise IRQ
qemu: fatal: Trap 0x29 while interrupts disabled, Error state
pc: f004127c npc: f0041280
Looking at this further, it seems likely that this is happening because it is the first ESP DMA transfer invoked directly by the Solaris kernel. With some qemu hacking, I've managed to add some breakpoints so I can step through this particular section of code.
What happens is when the ESP IRQ is raised within qemu, we jump into the Solaris interrupt handler which then leads us into the _interrupt function. The fatal error occurs at this point:
build@zeno:~/src/openbios/openbios-git/openbios-devel$ sparc64-linux-gdb
GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=x86_64-unknown-linux-gnu --target=sparc64-linux".
(gdb) target remote :1234
Remote debugging using :1234
[New Thread 1]
0x00000000 in ?? ()
(gdb) file ../../ss5
A program is being debugged already.
Are you sure you want to change the file? (y or n) y
Reading symbols from /home/build/src/openbios/ss5...(no debugging symbols found)...done.
(gdb) cont
Continuing.
^C
Program received signal SIGINT, Interrupt.
0xf00401f0 in scb ()
(gdb) break *0xf0041c74
Breakpoint 1 at 0xf0041c74
(gdb) cont
Continuing.

Breakpoint 1, 0xf0041c74 in _interrupt ()
(gdb) info regi
g0   0x0         0
g1   0xf02441a0  -266059360
g2   0x44000c0   71303360
g3   0x7000      28672
g4   0xf5af4e40  -173060544
g5   0x0         0
g6   0x0         0
g7   0xf0244020  -266059744
o0   0xffff8000  -32768
o1   0x8000      32768
o2   0x0         0
o3   0x44000c0   71303360
o4   0x1         1
o5   0xffed7000  -1216512
sp   0xfbe3bfa0  0xfbe3bfa0
o7   0xf0041be4  -268166172
l0   0x4400fc0   71307200
l1   0xf5af91bc  -173043268
l2   0xf5af91c0  -173043264
l3   0xf00431c4  -268160572
l4   0xf         15
l5   0x44000c0   71303360
l6   0x1         1
l7   0xf0243058  -266063784
i0   0x1         1
i1   0x100       256
i2   0xc2        194
i3   0xf5906000  -175087616
i4   0x0         0
i5   0xf5948cc8  -174814008
fp   0xf0243100  0xf0243100
i7   0xf0042460  -268164000
y    0x0         0
psr  0x4400fc0   [ PS S #8 #9 #10 #11 #22 #26 ]
wim  0x8         8
tbr  0xf00401f0  -268172816
pc   0xf0041c74  0xf0041c74 <_interrupt+244>
npc  0xf0041c78  0xf0041c78 <_interrupt+248>
fsr  0x80000     [ #19 ]
csr  0x0         0
The offending piece of code just before the crash:
0xf0041c60 <_interrupt+224>: ld [ %g1 ], %g0
0xf0041c64 <_interrupt+228>: sethi %hi(0xf0244000), %g1
0xf0041c68 <_interrupt+232>: or %g1, 0x1a0, %g1 ! 0xf02441a0 <_int_vector>
0xf0041c6c <_interrupt+236>: ld [ %g1 + %l3 ], %l3
0xf0041c70 <_interrupt+240>: wr %g0, %l0, %psr
0xf0041c74 <_interrupt+244>: wr %l0, 0x20, %psr
                             ^^^^^^^^^^^^^^^^^^^ fatal instruction
(gdb) stepi
Remote connection closed
So it seems to be this setting of the PSR register which is causing the fatal trap.
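One detail that helps when reading those two wr instructions: on SPARC V8, "wr rs1, operand2, %psr" writes rs1 XOR operand2 rather than a plain copy. The sketch below applies that to the %l0 value from the gdb register dump and picks out the PIL, S, PS, ET and CWP fields using the V8 bit layout. The helper names wrpsr and psr_fields are made up for illustration.

```python
def wrpsr(rs1: int, operand2: int) -> int:
    """Value written by SPARC V8 'wr rs1, operand2, %psr' (rs1 XOR operand2)."""
    return (rs1 ^ operand2) & 0xFFFFFFFF


def psr_fields(psr: int) -> dict:
    """Extract the low PSR fields using the SPARC V8 bit positions."""
    return {
        "PIL": (psr >> 8) & 0xF,  # processor interrupt level
        "S": (psr >> 7) & 1,      # supervisor
        "PS": (psr >> 6) & 1,     # previous supervisor
        "ET": (psr >> 5) & 1,     # enable traps
        "CWP": psr & 0x1F,        # current window pointer
    }


l0 = 0x04400FC0              # %l0 from the gdb dump
first = wrpsr(0, l0)         # wr %g0, %l0, %psr
second = wrpsr(l0, 0x20)     # wr %l0, 0x20, %psr

print(f"{first:#010x}", psr_fields(first))
print(f"{second:#010x}", psr_fields(second))
```

So the second write flips ET from 0 to 1 while PIL stays at 15. This only shows which value reaches the PSR; it does not by itself explain why the following trap is fatal.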
ATB,
Mark.
On 20/02/11 15:27, Mark Cave-Ayland wrote:
So it seems to be this setting of the PSR register which is causing the fatal trap.
Some more hacking around with gdb on qemu shows that the spurious interrupts being generated are in fact not related to the DMA transfers, but have exception_index 0x1f, which translates to IRQ 15.
Searching on the internet seems to suggest that IRQ 15 is used for distributing interrupts on multi-processor systems - so perhaps it is related to OpenBIOS' SMP setup? With a conditional breakpoint, I can see that they only start appearing upon activation of the first DMA transfer initiated by the Solaris kernel.
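The 0x1f to IRQ 15 translation follows from the SPARC V8 trap table, where interrupt level n is delivered as trap type 0x10 + n for n from 1 to 15. A minimal sketch of that mapping (the function name is hypothetical):

```python
def interrupt_level(trap_type: int) -> int:
    """Map a SPARC V8 interrupt trap type (0x11..0x1f) to its interrupt level."""
    if not 0x11 <= trap_type <= 0x1F:
        raise ValueError(f"trap type {trap_type:#x} is not an interrupt_level_n trap")
    return trap_type - 0x10


print(interrupt_level(0x1F))
```

Level 15 is the highest level, so it is not held off by the PIL field in the PSR.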
ATB,
Mark.
On 20/02/11 20:28, Mark Cave-Ayland wrote:
Searching on the internet seems to suggest that IRQ 15 is used for distributing interrupts on multi-processor systems - so perhaps it is related to OpenBIOS' SMP setup? With a conditional breakpoint, I can see that they only start appearing upon activation of the first DMA transfer initiated by the Solaris kernel.
Ah so I think I have found the answer here: http://www.mail-archive.com/qemu-devel@nongnu.org/msg39793.html. So basically it seems this IRQ is being raised by the IOMMU because an attempt is made to access an unmapped I/O address.
Using a debug build of qemu, I placed a breakpoint on iommu_bad_addr in order to get a backtrace, and sure enough just after the Select with ATN I get a break here:
Breakpoint 1, iommu_bad_addr (s=0x1726020, addr=4227858432, is_write=0) at /home/build/src/qemu/git/qemu/hw/sun4m_iommu.c:279
279 {
(gdb) p/x 4227858432
$1 = 0xfc000000
(gdb) bt
#0 iommu_bad_addr (s=0x1726020, addr=4227858432, is_write=0) at /home/build/src/qemu/git/qemu/hw/sun4m_iommu.c:279
#1 0x0000000000549919 in sparc_iommu_memory_rw (opaque=0x1726020, addr=4227858464, buf=0x7fffac0c4560 "\374\377\377\377\377\377\377\377\032\324T", len=7, is_write=0) at /home/build/src/qemu/git/qemu/hw/sun4m_iommu.c:303
#2 0x000000000054ccb7 in sparc_iommu_memory_read (opaque=0x1726020, addr=4227858464, buf=0x7fffac0c4560 "\374\377\377\377\377\377\377\377\032\324T", len=7) at /home/build/src/qemu/git/qemu/hw/sun4m.h:15
#3 0x000000000054d2a1 in espdma_memory_read (opaque=0x182ce70, buf=0x7fffac0c4560 "\374\377\377\377\377\377\377\377\032\324T", len=7) at /home/build/src/qemu/git/qemu/hw/sparc32_dma.c:154
#4 0x000000000053f57b in get_cmd (s=0x1a66020, buf=0x7fffac0c4560 "\374\377\377\377\377\377\377\377\032\324T") at /home/build/src/qemu/git/qemu/hw/esp.c:199
#5 0x000000000053f92e in handle_satn (s=0x1a66020) at /home/build/src/qemu/git/qemu/hw/esp.c:276
#6 0x00000000005409ec in esp_mem_writeb (opaque=0x1a66020, addr=12, val=194) at /home/build/src/qemu/git/qemu/hw/esp.c:631
#7 0x00000000004f151a in subpage_writelen (mmio=0x1a6c190, addr=12, value=194, len=0) at /home/build/src/qemu/git/qemu/exec.c:3260
#8 0x00000000004f15c0 in subpage_writeb (opaque=0x1a6c190, addr=12, value=194) at /home/build/src/qemu/git/qemu/exec.c:3271
#9 0x0000000000522797 in io_writeb (physaddr=12, val=194 '\302', addr=4119879692, retaddr=0x4134d166) at ../softmmu_template.h:213
#10 0x000000000052287c in __stb_mmu (addr=4119879692, val=194 '\302', mmu_idx=1) at ../softmmu_template.h:245
#11 0x000000004134d167 in ?? ()
#12 0x0000000000000001 in ?? ()
#13 0x1e8dd2975dfbc400 in ?? ()
#14 0x0000000000000000 in ?? ()
So it looks like the IOMMU doesn't have an entry mapped at 0xfc000000 which is causing it to raise IRQ15. Now I see that OpenBIOS doesn't map that range in drivers/iommu.c - is that something we should be doing?
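The two decimal addresses in the backtrace are easier to compare in hex. The sketch below splits the address passed to sparc_iommu_memory_rw into a page base and an offset, assuming 4 KiB IOMMU pages (that page size and the helper name split_dvma are my assumptions, not taken from the backtrace).

```python
IOMMU_PAGE_SIZE = 4096  # assumption: 4 KiB IOMMU pages


def split_dvma(addr: int):
    """Return (page base, offset within page) for a DVMA address."""
    page = addr & ~(IOMMU_PAGE_SIZE - 1)
    return page, addr - page


rw_addr = 4227858464    # addr seen by sparc_iommu_memory_rw
bad_addr = 4227858432   # addr reported by iommu_bad_addr

page, offset = split_dvma(rw_addr)
print(f"access at {rw_addr:#x}: page {page:#x}, offset {offset:#x}")
print(f"page matches iommu_bad_addr argument: {page == bad_addr}")
```

Under that assumption the 7-byte read starts at 0xfc000020, and the address reported as bad is simply the base of the page containing it.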
The other interesting part is why address 0xfc000000, since OpenBIOS must surely have pre-mapped ESP somewhere above 0xff000000 to be able to bootstrap the kernel from disk using DMA?
I see a reference here: http://www.mail-archive.com/qemu-devel@nongnu.org/msg13315.html that explicitly mentions this:
"Ok, I'll try to make a mental exercise with this chain: SCSI->ESP->ESPDMA->IOMMU->memory write. Scenario: SCSI read issued, 8k size. I'll track the address+size vectors at each stage.
scsi-disk uses host memory addresses. ESP uses addresses ranging from 0 to end of request. ESPDMA forces the MS byte to be 0xfc."
At the moment I can't quite figure out how/why this is happening in QEMU - does anyone have any pointers?
ATB,
Mark.
On Wed, Feb 23, 2011 at 7:12 PM, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
So it looks like the IOMMU doesn't have an entry mapped at 0xfc000000 which is causing it to raise IRQ15. Now I see that OpenBIOS doesn't map that range in drivers/iommu.c - is that something we should be doing?
The IOMMU virtual address space is completely separate from the MMU virtual address space. 0xfc000000 does not need to be mapped by the MMU. This area is determined by the mapping range bits in the IOMMU control register.
The other interesting part is why address 0xfc000000, since OpenBIOS must surely have pre-mapped ESP somewhere above 0xff000000 to be able to bootstrap the kernel from disk using DMA?
The address 0xfc000000 comes from the ESP DMA controller's address register; the ESP can only supply the lowest 24 bits of the address.
A very similar problem prevented NetBSD 1.6.x from booting, because NetBSD programmed the DMA controller and ESP in a different order than QEMU expected. This is now fixed, so it shouldn't be a problem anymore, and since OBP works, the problem at hand is not on the QEMU side.
Maybe Solaris also programs DMA and ESP in some different way, which causes DMA to fire before the address is set up. To fix it, maybe OpenBIOS should reset the DMA controller after using it or at least disable DMA.
On 25/02/11 15:54, Blue Swirl wrote:
So it looks like the IOMMU doesn't have an entry mapped at 0xfc000000 which is causing it to raise IRQ15. Now I see that OpenBIOS doesn't map that range in drivers/iommu.c - is that something we should be doing?
The IOMMU virtual address space is completely separate from the MMU virtual address space. 0xfc000000 does not need to be mapped by the MMU. This area is determined by the mapping range bits in the IOMMU control register.
The other interesting part is why address 0xfc000000, since OpenBIOS must surely have pre-mapped ESP somewhere above 0xff000000 to be able to bootstrap the kernel from disk using DMA?
The address 0xfc000000 comes from the ESP DMA controller's address register; the ESP can only supply the lowest 24 bits of the address.
A very similar problem prevented NetBSD 1.6.x from booting, because NetBSD programmed the DMA controller and ESP in a different order than QEMU expected. This is now fixed, so it shouldn't be a problem anymore, and since OBP works, the problem at hand is not on the QEMU side.
Maybe Solaris also programs DMA and ESP in some different way, which causes DMA to fire before the address is set up. To fix it, maybe OpenBIOS should reset the DMA controller after using it or at least disable DMA.
Okay. Well I added some debugging to hw/sun4m_iommu.c to see what was happening and managed to get the following trace:
Welcome to OpenBIOS v1.0 built on Feb 19 2011 17:00
Type 'help' for detailed information
0 > boot cdrom:d -v
Not a bootable ELF image
Loading a.out image...
Loaded 7680 bytes
entry point is 0x4000
bootpath: /iommu/sbus/espdma/esp/sd@2,0:d
Jumping to entry point 00004000 for type 00000005...
switching to new context:
Size: 259040+54154+47486 Bytes
SunOS Release 5.8 Version Generic_108528-09 32-bit
Copyright 1983-2001 Sun Microsystems, Inc. All rights reserved.
Ethernet address = 52:54:0:12:34:56
Using default device instance data
vac: enabled in write through mode
mem = 131072K (0x8000000)
avail mem = 110419968
### Writing to iommu addr: 1
### Setting IOMMU base addr: 6bc000
### Writing to iommu addr: 0
### Writing to iommu addr: 5
### IOMMU TLB flush 0
root nexus = SUNW,SPARCstation-5
iommu0 at root: obio 0x10000000
sbus0 at iommu0: obio 0x10001000
dma0 at sbus0: SBus slot 5 0x8400000
dma0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000
### Writing to iommu addr: 6
### IOMMU page flush fc000000
/iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000 (esp0): esp-options=0x46
esp0 at dma0: SBus slot 5 0x8800000 sparc ipl 4
esp0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000
### Writing to iommu addr: 6
### IOMMU page flush fc001000
(crash cut)
So it looks like Solaris has already taken over the IOMMU page table by setting a new base address, and appears to be flushing 2 entries for 0xfc000000 and 0xfc001000 which looks like it should be doing the right thing. If Solaris has taken over the IOMMU page table, then how could OpenBIOS affect this?
Also, what do the IOMMU_RNGE_*MB constants do? Do they control the size of the IOMMU page table? Sorry to keep asking these questions but it seems that Oracle have removed all of the documents referenced in the various IOMMU source files :(
ATB,
Mark.
On Sat, Feb 26, 2011 at 3:56 PM, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
So it looks like Solaris has already taken over the IOMMU page table by setting a new base address, and appears to be flushing 2 entries for 0xfc000000 and 0xfc001000 which looks like it should be doing the right thing. If Solaris has taken over the IOMMU page table, then how could OpenBIOS affect this?
If the DMA is launched by accident before the page tables are set up, that could cause the crash.
Please enable DMA debugging as well; that would tell us what the programming sequence is.
Also, what do the IOMMU_RNGE_*MB constants do? Do they control the size of the IOMMU page table? Sorry to keep asking these questions but it seems that Oracle have removed all of the documents referenced in the various IOMMU source files :(
Yes. For example, with a 64MB range, virtual DMA addresses between 0xfc000000 and 0xffffffff are available.
One possible cause of the crash is that OpenBIOS uses a 64MB range, but Solaris "knows" that it is something else, as used by OBP.
On 26/02/11 14:53, Blue Swirl wrote:
So it looks like Solaris has already taken over the IOMMU page table by setting a new base address, and appears to be flushing 2 entries for 0xfc000000 and 0xfc001000 which looks like it should be doing the right thing. If Solaris has taken over the IOMMU page table, then how could OpenBIOS affect this?
If the DMA is launched by accident before the page tables are set up, that could cause the crash.
Please enable DMA debugging as well; that would tell us what the programming sequence is.
Okay - I didn't find any DMA debugging, but I've re-enabled the debugging in hw/esp.c to give a feel for the order of things:
vac: enabled in write through mode
mem = 131072K (0x8000000)
avail mem = 110419968
### Writing to iommu addr: 1
### Setting IOMMU base addr: 6bc000
### Writing to iommu addr: 0
### Writing to iommu addr: 5
### IOMMU TLB flush 0
root nexus = SUNW,SPARCstation-5
iommu0 at root: obio 0x10000000
sbus0 at iommu0: obio 0x10001000
dma0 at sbus0: SBus slot 5 0x8400000
dma0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000
### Writing to iommu addr: 6
### IOMMU page flush fc000000
ESP: write reg[11]: 0x00 -> 0x00
ESP: write reg[11]: 0x00 -> 0x0a
ESP: read reg[11]: 0x0a
ESP: write reg[12]: 0x00 -> 0x00
ESP: write reg[12]: 0x00 -> 0x05
ESP: read reg[12]: 0x05
ESP: write reg[11]: 0x0a -> 0x08
ESP: write reg[12]: 0x05 -> 0x00
ESP: write reg[3]: 0x10 -> 0x03
ESP: Bus reset (03)
ESP: Raise IRQ
ESP: Lower enable
ESP: write reg[3]: 0x00 -> 0x02
ESP: Chip reset (02)
ESP: write reg[3]: 0x02 -> 0x80
ESP: NOP (80)
ESP: write reg[3]: 0x80 -> 0x80
ESP: NOP (80)
ESP: write reg[9]: 0x00 -> 0x00
ESP: write reg[5]: 0x00 -> 0xa3
ESP: write reg[6]: 0x00 -> 0x00
ESP: write reg[7]: 0x00 -> 0x00
ESP: read reg[14]: 0x04
ESP: read reg[14]: 0x04
ESP: write reg[8]: 0x00 -> 0x17
ESP: write reg[12]: 0x00 -> 0x01
ESP: write reg[11]: 0x00 -> 0x08
/iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000 (esp0): esp-options=0x46
ESP: read reg[5]: 0x00
esp0 at dma0: SBus slot 5 0x8800000 sparc ipl 4
esp0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000
### Writing to iommu addr: 6
### IOMMU page flush fc001000
ESP: write reg[4]: 0x00 -> 0x00
ESP: write reg[6]: 0x00 -> 0x00
ESP: write reg[7]: 0x00 -> 0x00
ESP: write reg[12]: 0x01 -> 0x01
ESP: write reg[8]: 0x17 -> 0x07
ESP: write reg[0]: 0x00 -> 0x07
ESP: write reg[1]: 0x00 -> 0x00
ESP: Raise enable
ESP: write reg[3]: 0x80 -> 0xc2
ESP: Select with ATN (c2)
ESP: get_cmd: len 7 target 0 buf 0x7fff87f4ba90
ESP: ### No such drive!
ESP: ### No such drive pause over!
ESP: Raise IRQ
qemu: fatal: Trap 0x29 while interrupts disabled, Error state
A flush should be the removal of an entry, so I would guess that Solaris flushes the relevant entries and then adds them back into its IOMMU page table manually?
Also, what do the IOMMU_RNGE_*MB constants do? Do they control the size of the IOMMU page table? Sorry to keep asking these questions but it seems that Oracle have removed all of the documents referenced in the various IOMMU source files :(
Yes. For example, with a 64MB range, virtual DMA addresses between 0xfc000000 and 0xffffffff are available.
One possible cause of the crash is that OpenBIOS uses a 64MB range, but Solaris "knows" that it is something else, as used by OBP.
AFAICT OpenBIOS sets a 32MB range, but some debugging shows that Solaris sets it to 64MB just before the scan - so I think this range should be valid, there is just no mapping in the page table for it.
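To make the range arithmetic concrete, here is a minimal sketch of how the window could be computed. It assumes the DVMA window always ends at the top of the 32-bit address space, which generalises the single 64MB example given above; the function names are hypothetical and nothing here is taken from the QEMU or OpenBIOS source.

```python
ADDRESS_SPACE = 1 << 32  # size of the 32-bit DVMA address space


def dvma_window(range_mb):
    """Return (first, last) DVMA address for a given range in megabytes.

    Assumption: the window is anchored at the top of the address space.
    """
    size = range_mb * 1024 * 1024
    return ADDRESS_SPACE - size, ADDRESS_SPACE - 1


def in_window(addr, range_mb):
    """True if addr falls inside the DVMA window for this range."""
    first, last = dvma_window(range_mb)
    return first <= addr <= last


if __name__ == "__main__":
    for mb in (32, 64):
        first, last = dvma_window(mb)
        inside = in_window(0xFC000000, mb)
        print(f"{mb}MB: {first:#x}..{last:#x}, contains 0xfc000000: {inside}")
```

Under that assumption 0xfc000000 is the very first address of a 64MB window, but it lies below a 32MB window, which starts at 0xfe000000.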
ATB,
Mark.
On Sat, Feb 26, 2011 at 5:23 PM, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
A flush should be the removal of an entry, so I would guess that Solaris flushes the relevant entries and then adds them back into its IOMMU page table manually?
The flush would just affect an internal TLB, which is not implemented in QEMU, so the flush has no effect.
On 26/02/11 16:36, Blue Swirl wrote:
A flush should be the removal of an entry, so I would guess that Solaris flushes the relevant entries and then adds them back into its IOMMU page table manually?
The flush would just affect an internal TLB, which is not implemented in QEMU, so the flush has no effect.
Right. So I added some extra code to QEMU again to get it to print the current translation for 0xfc000000 every time the IOMMU is accessed, and that looks like this:
Jumping to entry point 00004000 for type 00000005...
switching to new context:
Size: 259040+54154+47486 Bytes
SunOS Release 5.8 Version Generic_108528-09 32-bit
Copyright 1983-2001 Sun Microsystems, Inc. All rights reserved.
Ethernet address = 52:54:0:12:34:56
Using default device instance data
vac: enabled in write through mode
mem = 131072K (0x8000000)
avail mem = 110419968
### Writing to iommu addr: 1, value 6bc000 0 --- Current translation for 0xfc000000 is 7fba06
### Setting IOMMU base addr: 6bc000
### Writing to iommu addr: 0, value 9 0 --- Current translation for 0xfc000000 is 0
### Writing to iommu addr: 5, value 0 0 --- Current translation for 0xfc000000 is 0
### IOMMU TLB flush 0
root nexus = SUNW,SPARCstation-5
iommu0 at root: obio 0x10000000
sbus0 at iommu0: obio 0x10001000
dma0 at sbus0: SBus slot 5 0x8400000
dma0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000
### Writing to iommu addr: 6, value fc000000 0 --- Current translation for 0xfc000000 is 0
### IOMMU page flush fc000000
/iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000 (esp0): esp-options=0x46
esp0 at dma0: SBus slot 5 0x8800000 sparc ipl 4
esp0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000
### Writing to iommu addr: 6, value fc001000 0 --- Current translation for 0xfc000000 is 0
### IOMMU page flush fc001000
qemu: fatal: Trap 0x29 while interrupts disabled, Error state
So the existing translation put in place by OpenBIOS is fine until the IOMMU page table is switched, at which point we die because it doesn't get replaced.
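The behaviour in the trace can be reproduced with a toy model. This is a deliberately simplified sketch, not QEMU's implementation: the class, the firmware base value and the dictionary-backed "memory" are all hypothetical, the 4 KiB page size is an assumption, and an entry of 0 is treated as "no mapping" simply because that is what the trace prints.

```python
PAGE_SHIFT = 12              # assumed 4 KiB IOMMU pages
FIRMWARE_BASE = 0x100000     # hypothetical page table base used by the firmware


class ToyIommu:
    """Toy IOMMU: page tables live in 'memory' and are selected by a base register."""

    def __init__(self, window_start):
        self.window_start = window_start
        self.tables = {}     # base -> {page index -> entry}
        self.base = None

    def set_base(self, base):
        # Switching the base selects a different (possibly empty) table.
        self.base = base
        self.tables.setdefault(base, {})

    def _index(self, dvma_addr):
        return (dvma_addr - self.window_start) >> PAGE_SHIFT

    def map_page(self, dvma_addr, entry):
        self.tables[self.base][self._index(dvma_addr)] = entry

    def flush_page(self, dvma_addr):
        # No TLB is modelled, so a flush changes nothing.
        pass

    def translate(self, dvma_addr):
        return self.tables[self.base].get(self._index(dvma_addr), 0)


def demo():
    iommu = ToyIommu(window_start=0xFC000000)
    iommu.set_base(FIRMWARE_BASE)
    iommu.map_page(0xFC000000, 0x7FBA06)   # entry value seen in the trace
    before = iommu.translate(0xFC000000)
    iommu.set_base(0x6BC000)               # new base written by the kernel
    iommu.flush_page(0xFC000000)
    after = iommu.translate(0xFC000000)
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print(f"before switch: {before:x}, after switch: {after:x}")
```

In this model the old entry is still in memory after the switch, but it is no longer reachable, so any DMA to 0xfc000000 faults until the new table gets an entry for that page.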
The fact that it is the 0xc2 (DMA enabled) command which is issued rather than the 0x42 non-DMA version is making this seem more of a QEMU bug as opposed to an OpenBIOS bug. Surely a kernel must set up its DMA buffers first before attempting a DMA command?
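The relationship between 0xc2 and 0x42 can be sketched as a simple bit decode. The mask names are made up for this example; the command names are the ones printed in the ESP debug output earlier in the thread, and the split into a DMA flag plus a 7-bit command code is inferred from the 0xc2/0x42 pair rather than quoted from a data sheet.

```python
CMD_DMA_FLAG = 0x80    # top bit: command is issued with DMA
CMD_CODE_MASK = 0x7F   # remaining bits: the command itself

# Names as they appear in the ESP debug output in this thread.
COMMAND_NAMES = {
    0x00: "NOP",
    0x02: "Chip reset",
    0x03: "Bus reset",
    0x42: "Select with ATN",
}


def decode_command(value):
    """Split a command register value into (uses_dma, command_name)."""
    uses_dma = bool(value & CMD_DMA_FLAG)
    name = COMMAND_NAMES.get(value & CMD_CODE_MASK, "unknown")
    return uses_dma, name


if __name__ == "__main__":
    for value in (0x03, 0x02, 0x80, 0x42, 0xC2):
        uses_dma, name = decode_command(value)
        print(f"{value:#04x}: {name}, DMA={'yes' if uses_dma else 'no'}")
```

Read this way, the "NOP (80)" lines in the log are a NOP with the DMA flag set, and the final 0xc2 is the first command in the trace that both sets the flag and actually needs to fetch data from memory.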
Artyom: can you definitely boot Solaris 8 under OBP? And in fact, can you boot any of your other test images using OpenBIOS SVN trunk?
Blue: what debugging do I need to add to prove that this is the DMA bug you mentioned before?
ATB,
Mark.
On Sat, Feb 26, 2011 at 6:52 PM, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
Blue: what debugging do I need to add to prove that this is the DMA bug you mentioned before?
Tracepoint logs with all points from both sparc32_dma.c and sun4m_iommu.c enabled?
On 26/02/11 17:28, Blue Swirl wrote:
Blue: what debugging do I need to add to prove that this is the DMA bug you mentioned before?
Tracepoint logs with all points from both sparc32_dma.c and sun4m_iommu.c enabled?
Here's the end of the log file with the relevant tracepoints enabled.
ATB,
Mark.
On 27/02/11 11:14, Mark Cave-Ayland wrote:
On 26/02/11 17:28, Blue Swirl wrote:
Blue: what debugging do I need to add to prove that this is the DMA bug you mentioned before?
Tracepoint logs with all points from both sparc32_dma.c and sun4m_iommu.c enabled?
Okay - as an alternative, here is the output of my hacked-up qemu with ESP debugging and OpenBIOS romvec debugging enabled:
vac: enabled in write through mode mem = 131072K (0x8000000) avail mem = 110419968 obp_nextnode(0x0) = 0xffd4527c obp_proplen(0xffd4527c, reg) (not found) obp_proplen(0xffd4527c, ranges) (not found) obp_proplen(0xffd4527c, intr) (not found) obp_proplen(0xffd4527c, interrupts) (not found) ### Writing to iommu addr: 1, value 6bc000 0 --- Current translation for 0xfc000000 is 7fba06 ### Setting IOMMU base addr: 6bc000 ### Writing to iommu addr: 0, value 9 0 --- Current translation for 0xfc000000 is 0 ### Writing to iommu addr: 5, value 0 0 --- Current translation for 0xfc000000 is 0 ### IOMMU TLB flush 0 obp_proplen(0xffd4527c, ttymodes) (not found) obp_proplen(0xffd4527c, device_type) (not found) root nexus = SUNW,SPARCstation-5 obp_proplen(0xffd4527c, pm-hardware-state) (not found) obp_proplen(0xffd4527c, pm-want-child-notification?) (not found) obp_proplen(0xffd4527c, pm-components) (not found) obp_proplen(0xffd455e4, reg) (not found) obp_proplen(0xffd455e4, ranges) (not found) obp_proplen(0xffd455e4, intr) (not found) obp_proplen(0xffd455e4, interrupts) (not found) obp_proplen(0xffd455e4, reg) (not found) obp_proplen(0xffd455e4, ranges) (not found) obp_proplen(0xffd455e4, intr) (not found) obp_proplen(0xffd455e4, interrupts) (not found) obp_proplen(0xffd455e4, device_type) (not found) obp_proplen(0xffd455e4, pm-hardware-state) (not found) obp_proplen(0xffd455e4, pm-want-child-notification?) (not found) obp_proplen(0xffd455e4, pm-components) (not found) obp_fortheval_v2(0 0 f024360c f59a88c8 11 ['] find-device catch if 2drop true else current-device device-end then swap l!) 
obp_proplen(0xffd4caa4, reg) = 12 obp_proplen(0xffd4caa4, reg) = 12 obp_getprop(0xffd4caa4, reg) = 00 00 00 00 10 00 00 00 00 00 03 00 obp_proplen(0xffd4caa4, ranges) (not found) obp_proplen(0xffd4caa4, intr) (not found) obp_proplen(0xffd4caa4, interrupts) (not found) obp_proplen(0xffd4caa4, device_type) (not found) iommu0 at root: obio 0x10000000 obp_proplen(0xffd4caa4, pm-want-child-notification?) (not found) obp_proplen(0xffd4caa4, pm-components) (not found) obp_fortheval_v2(0 0 f024360c f59a88c8 21 ['] find-device catch if 2drop true else current-device device-end then swap l!) obp_proplen(0xffd4cc14, reg) = 12 obp_proplen(0xffd4cc14, reg) = 12 obp_getprop(0xffd4cc14, reg) = 00 00 00 00 10 00 10 00 00 00 00 28 obp_proplen(0xffd4cc14, ranges) = 120 obp_proplen(0xffd4cc14, ranges) = 120 obp_getprop(0xffd4cc14, ranges) = 00 00 00 00 00 00 00 00 00 00 00 00 20 00 00 00 10 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 30 00 00 00 10 00 00 00 00 00 00 02 00 00 00 00 00 00 00 00 40 00 00 00 10 00 00 00 00 00 00 03 00 00 00 00 00 00 00 00 50 00 00 00 10 00 00 00 00 00 00 04 00 00 00 00 00 00 00 00 60 00 00 00 10 00 00 00 00 00 00 05 00 00 00 00 00 00 00 00 70 00 00 00 10 00 00 00 obp_proplen(0xffd4cc14, intr) (not found) obp_proplen(0xffd4cc14, interrupts) (not found) obp_proplen(0xffd4cc14, device_type) = 13 obp_proplen(0xffd4cc14, device_type) = 13 obp_getprop(0xffd4cc14, device_type) = hierarchical sbus0 at iommu0: obio 0x10001000 obp_proplen(0xffd4cc14, burst-sizes) = 4 obp_proplen(0xffd4cc14, burst-sizes) = 4 obp_getprop(0xffd4cc14, burst-sizes) = 0000003f obp_proplen(0xffd4cc14, pm-hardware-state) (not found) obp_proplen(0xffd4cc14, pm-want-child-notification?) (not found) obp_proplen(0xffd4cc14, pm-components) (not found) obp_fortheval_v2(0 0 f024360c f59a88c8 32 ['] find-device catch if 2drop true else current-device device-end then swap l!) 
obp_proplen(0xffd4cfec, reg) = 12 obp_proplen(0xffd4cfec, reg) = 12 obp_getprop(0xffd4cfec, reg) = 00 00 00 05 08 40 00 00 00 00 00 10 obp_proplen(0xffd4cfec, ranges) (not found) obp_proplen(0xffd4cfec, intr) (not found) obp_proplen(0xffd4cfec, interrupts) (not found) obp_proplen(0xffd4cfec, device_type) (not found) dma0 at sbus0: SBus slot 5 0x8400000 dma0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000 obp_proplen(0xffd4cfec, pm-hardware-state) (not found) obp_proplen(0xffd4cfec, pm-want-child-notification?) (not found) obp_proplen(0xffd4cfec, pm-components) (not found) obp_fortheval_v2(0 0 f024360c f59a88c8 40 ['] find-device catch if 2drop true else current-device device-end then swap l!) obp_proplen(0xffd4fac0, reg) = 12 obp_proplen(0xffd4fac0, reg) = 12 obp_getprop(0xffd4fac0, reg) = 00 00 00 05 08 80 00 00 00 00 00 10 obp_proplen(0xffd4fac0, ranges) (not found) obp_proplen(0xffd4fac0, intr) = 8 obp_proplen(0xffd4fac0, intr) = 8 obp_getprop(0xffd4fac0, intr) = 00 00 00 24 00 00 00 00 obp_proplen(0xffd4fac0, interrupts) (not found) obp_proplen(0xffd4fac0, device_type) = 5 obp_proplen(0xffd4fac0, device_type) = 5 obp_getprop(0xffd4fac0, device_type) = scsi obp_proplen(0xffd4cc14, slave-only) (not found) obp_proplen(0xffd4caa4, slave-only) (not found) obp_proplen(0xffd4527c, slave-only) (not found) obp_proplen(0xffd455e4, slave-only) (not found) ### Writing to iommu addr: 6, value fc000000 0 --- Current translation for 0xfc000000 is 0 ### IOMMU page flush fc000000 obp_proplen(0xffd4fac0, initiator-id) (not found) obp_proplen(0xffd4cfec, initiator-id) (not found) obp_proplen(0xffd4cc14, initiator-id) (not found) obp_proplen(0xffd4caa4, initiator-id) (not found) obp_proplen(0xffd4527c, initiator-id) (not found) obp_proplen(0xffd455e4, initiator-id) (not found) obp_proplen(0xffd4fac0, scsi-initiator-id) (not found) obp_proplen(0xffd4cfec, scsi-initiator-id) (not found) obp_proplen(0xffd4cc14, scsi-initiator-id) (not found) obp_proplen(0xffd4caa4, 
scsi-initiator-id) (not found) obp_proplen(0xffd4527c, scsi-initiator-id) (not found) obp_proplen(0xffd455e4, scsi-initiator-id) (not found) obp_proplen(0xffd4fac0, scsi-reset-delay) (not found) obp_proplen(0xffd4cfec, scsi-reset-delay) (not found) obp_proplen(0xffd4cc14, scsi-reset-delay) (not found) obp_proplen(0xffd4caa4, scsi-reset-delay) (not found) obp_proplen(0xffd4527c, scsi-reset-delay) (not found) obp_proplen(0xffd455e4, scsi-reset-delay) (not found) obp_proplen(0xffd4fac0, scsi-tag-age-limit) (not found) obp_proplen(0xffd4cfec, scsi-tag-age-limit) (not found) obp_proplen(0xffd4cc14, scsi-tag-age-limit) (not found) obp_proplen(0xffd4caa4, scsi-tag-age-limit) (not found) obp_proplen(0xffd4527c, scsi-tag-age-limit) (not found) obp_proplen(0xffd455e4, scsi-tag-age-limit) (not found) obp_proplen(0xffd4fac0, scsi-watchdog-tick) (not found) obp_proplen(0xffd4cfec, scsi-watchdog-tick) (not found) obp_proplen(0xffd4cc14, scsi-watchdog-tick) (not found) obp_proplen(0xffd4caa4, scsi-watchdog-tick) (not found) obp_proplen(0xffd4527c, scsi-watchdog-tick) (not found) obp_proplen(0xffd455e4, scsi-watchdog-tick) (not found) obp_proplen(0xffd4fac0, scsi-options) (not found) obp_proplen(0xffd4cfec, scsi-options) (not found) obp_proplen(0xffd4cc14, scsi-options) (not found) obp_proplen(0xffd4caa4, scsi-options) (not found) obp_proplen(0xffd4527c, scsi-options) (not found) obp_proplen(0xffd455e4, scsi-options) (not found) obp_proplen(0xffd4fac0, scsi-selection-timeout) (not found) obp_proplen(0xffd4cfec, scsi-selection-timeout) (not found) obp_proplen(0xffd4cc14, scsi-selection-timeout) (not found) obp_proplen(0xffd4caa4, scsi-selection-timeout) (not found) obp_proplen(0xffd4527c, scsi-selection-timeout) (not found) obp_proplen(0xffd455e4, scsi-selection-timeout) (not found) obp_proplen(0xffd4fac0, target0-scsi-options) (not found) obp_proplen(0xffd4cfec, target0-scsi-options) (not found) obp_proplen(0xffd4cc14, target0-scsi-options) (not found) obp_proplen(0xffd4caa4, 
target0-scsi-options) (not found) obp_proplen(0xffd4527c, target0-scsi-options) (not found) obp_proplen(0xffd455e4, target0-scsi-options) (not found) obp_proplen(0xffd4fac0, target1-scsi-options) (not found) obp_proplen(0xffd4cfec, target1-scsi-options) (not found) obp_proplen(0xffd4cc14, target1-scsi-options) (not found) obp_proplen(0xffd4caa4, target1-scsi-options) (not found) obp_proplen(0xffd4527c, target1-scsi-options) (not found) obp_proplen(0xffd455e4, target1-scsi-options) (not found) obp_proplen(0xffd4fac0, target2-scsi-options) (not found) obp_proplen(0xffd4cfec, target2-scsi-options) (not found) obp_proplen(0xffd4cc14, target2-scsi-options) (not found) obp_proplen(0xffd4caa4, target2-scsi-options) (not found) obp_proplen(0xffd4527c, target2-scsi-options) (not found) obp_proplen(0xffd455e4, target2-scsi-options) (not found) obp_proplen(0xffd4fac0, target3-scsi-options) (not found) obp_proplen(0xffd4cfec, target3-scsi-options) (not found) obp_proplen(0xffd4cc14, target3-scsi-options) (not found) obp_proplen(0xffd4caa4, target3-scsi-options) (not found) obp_proplen(0xffd4527c, target3-scsi-options) (not found) obp_proplen(0xffd455e4, target3-scsi-options) (not found) obp_proplen(0xffd4fac0, target4-scsi-options) (not found) obp_proplen(0xffd4cfec, target4-scsi-options) (not found) obp_proplen(0xffd4cc14, target4-scsi-options) (not found) obp_proplen(0xffd4caa4, target4-scsi-options) (not found) obp_proplen(0xffd4527c, target4-scsi-options) (not found) obp_proplen(0xffd455e4, target4-scsi-options) (not found) obp_proplen(0xffd4fac0, target5-scsi-options) (not found) obp_proplen(0xffd4cfec, target5-scsi-options) (not found) obp_proplen(0xffd4cc14, target5-scsi-options) (not found) obp_proplen(0xffd4caa4, target5-scsi-options) (not found) obp_proplen(0xffd4527c, target5-scsi-options) (not found) obp_proplen(0xffd455e4, target5-scsi-options) (not found) obp_proplen(0xffd4fac0, target6-scsi-options) (not found) obp_proplen(0xffd4cfec, target6-scsi-options) (not 
found) obp_proplen(0xffd4cc14, target6-scsi-options) (not found) obp_proplen(0xffd4caa4, target6-scsi-options) (not found) obp_proplen(0xffd4527c, target6-scsi-options) (not found) obp_proplen(0xffd455e4, target6-scsi-options) (not found) obp_proplen(0xffd4fac0, target7-scsi-options) (not found) obp_proplen(0xffd4cfec, target7-scsi-options) (not found) obp_proplen(0xffd4cc14, target7-scsi-options) (not found) obp_proplen(0xffd4caa4, target7-scsi-options) (not found) obp_proplen(0xffd4527c, target7-scsi-options) (not found) obp_proplen(0xffd455e4, target7-scsi-options) (not found) obp_proplen(0xffd4fac0, clock-frequency) = 4 obp_proplen(0xffd4fac0, clock-frequency) = 4 obp_getprop(0xffd4fac0, clock-frequency) = 02625a00 ESP: write reg[11]: 0x00 -> 0x00 ESP: write reg[11]: 0x00 -> 0x0a ESP: read reg[11]: 0x0a ESP: write reg[12]: 0x00 -> 0x00 ESP: write reg[12]: 0x00 -> 0x05 ESP: read reg[12]: 0x05 ESP: write reg[11]: 0x0a -> 0x08 ESP: write reg[12]: 0x05 -> 0x00 ESP: write reg[3]: 0x10 -> 0x03 ESP: Bus reset (03) ESP: Raise IRQ ESP: write reg[3]: 0x00 -> 0x02 ESP: Chip reset (02) ESP: write reg[3]: 0x02 -> 0x80 ESP: NOP (80) ESP: write reg[3]: 0x80 -> 0x80 ESP: NOP (80) ESP: write reg[9]: 0x00 -> 0x00 ESP: write reg[5]: 0x00 -> 0xa3 ESP: write reg[6]: 0x00 -> 0x00 ESP: write reg[7]: 0x00 -> 0x00 ESP: read reg[14]: 0x04 ESP: read reg[14]: 0x04 ESP: write reg[8]: 0x00 -> 0x17 ESP: write reg[12]: 0x00 -> 0x01 ESP: write reg[11]: 0x00 -> 0x08 obp_proplen(0xffd4fac0, esp-options) (not found) obp_proplen(0xffd4cfec, esp-options) (not found) obp_proplen(0xffd4cc14, esp-options) (not found) obp_proplen(0xffd4caa4, esp-options) (not found) obp_proplen(0xffd4527c, esp-options) (not found) obp_proplen(0xffd455e4, esp-options) (not found) /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000 (esp0): esp-options=0x46 ESP: read reg[5]: 0x00 esp0 at dma0: SBus slot 5 0x8800000 sparc ipl 4 esp0 is /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000 
obp_proplen(0xffd4fac0, pm-hardware-state) (not found) obp_proplen(0xffd4fac0, pm-want-child-notification?) (not found) obp_fortheval_v2(0 0 f024360c f59a88c8 47 ['] find-device catch if 2drop true else current-device device-end then swap l!) ### Writing to iommu addr: 6, value fc001000 0 --- Current translation for 0xfc000000 is 0 ### IOMMU page flush fc001000 ESP: write reg[4]: 0x00 -> 0x00 ESP: write reg[6]: 0x00 -> 0x00 ESP: write reg[7]: 0x00 -> 0x00 ESP: write reg[12]: 0x01 -> 0x01 ESP: write reg[8]: 0x17 -> 0x07 ESP: write reg[0]: 0x00 -> 0x07 ESP: write reg[1]: 0x00 -> 0x00 ==== Enabling DMA a4240210 ESP: Raise enable ESP: write reg[3]: 0x80 -> 0xc2 ESP: Select with ATN (c2) ESP: get_cmd: len 7 target 0 buf 0x7fff253fed90 ESP: ### No such drive! ESP: ### No such drive pause over! ESP: Raise IRQ qemu: fatal: Trap 0x29 while interrupts disabled, Error state
I've altered the QEMU IOMMU code so that every time a new value is written to one of the control registers, it displays the current translation for 0xfc000000. That address should contain a valid mapping after the IOMMU TLB flush (why else would the kernel want to flush the TLB?).
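For anyone wanting to reproduce this kind of check, here is a rough sketch of what "display the current translation" amounts to. It is not the actual QEMU code: the flat single-level table, the 4 KiB page size, the valid-bit position and all of the `sketch_` names are assumptions made purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12     /* assumption: 4 KiB I/O pages */
#define SKETCH_PTE_VALID  0x2u   /* assumption: position of the "valid" flag */

/* Return the page table entry covering io_addr, or 0 if the address
 * falls outside the table. io_start is the first I/O virtual address
 * the table covers. */
static uint32_t sketch_lookup(const uint32_t *table, size_t entries,
                              uint32_t io_start, uint32_t io_addr)
{
    size_t idx = (size_t)((io_addr - io_start) >> SKETCH_PAGE_SHIFT);

    return idx < entries ? table[idx] : 0;
}

/* Non-zero if the entry has the (assumed) valid flag set. */
static int sketch_pte_is_valid(uint32_t pte)
{
    return (pte & SKETCH_PTE_VALID) != 0;
}
```

An entry of 0, as in the trace above, fails the valid check, which is what makes the later DMA attempt suspicious.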
There is also extra code to show when the DMA_EN bit is set: it prints "==== Enabling DMA" followed by the new value of the control register. Judging by the earlier NetBSD bug reports, this looks like the right behaviour, in that the DMA is only triggered once DMA_EN has been set.
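The DMA_EN trace can be sketched roughly as follows. This is a minimal illustration rather than the real debugging patch: the bit position `SKETCH_DMA_EN` is an assumption (it is at least consistent with the logged value a4240210), and the function names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: bit position of the enable flag in the DMA control
 * register; check the device model source for the real definition. */
#define SKETCH_DMA_EN 0x00000200u

/* Non-zero only when a write turns the enable bit on (0 -> 1). */
static int sketch_dma_enable_rising(uint32_t old_val, uint32_t new_val)
{
    return !(old_val & SKETCH_DMA_EN) && (new_val & SKETCH_DMA_EN) != 0;
}

/* Print a marker line whenever a control register write enables DMA. */
static void sketch_trace_dma_ctrl_write(uint32_t old_val, uint32_t new_val)
{
    if (sketch_dma_enable_rising(old_val, new_val)) {
        printf("==== Enabling DMA %08x\n", (unsigned)new_val);
    }
}
```

Tracing only the 0 -> 1 transition keeps the log quiet when the guest rewrites the register with the bit already set.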
At the moment I'm really struggling to find out why the IOMMU mapping isn't being set up, and since this is managed by the Solaris kernel it's hard to see how OpenBIOS is having an effect on it (unless of course it is influenced indirectly by various OBP properties). Does anyone have any more ideas?
ATB,
Mark.
On 19/03/11 12:11, Mark Cave-Ayland wrote:
At the moment I'm really struggling to find out why the IOMMU mapping isn't being set up, and since this is managed by the Solaris kernel it's hard to see how OpenBIOS is having an effect on it (unless of course it is influenced indirectly by various OBP properties). Does anyone have any more ideas?
Arrrgh - okay, I've finally managed to figure this one out after a *lot* of hours. It seems that the romvec memory allocation routines have a physical alignment requirement. The value being written to the IOMMU base register was incorrectly aligned, and hence it was being altered by IOMMU_BASE_MASK in QEMU's hw/sun4m_iommu.c. Thus the mapping was happening, but because the IOMMU page table base address was wrong, the mapping was being made at the wrong physical address in the IOMMU page table.
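To make the failure mode concrete, here is a small sketch of how a mask on a register write silently changes a misaligned value. The mask value below is an assumption used for illustration (check IOMMU_BASE_MASK in hw/sun4m_iommu.c for the real definition), and the `sketch_` names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: stands in for IOMMU_BASE_MASK; the low bits are cleared,
 * so the base value must be aligned to survive the write intact. */
#define SKETCH_IOMMU_BASE_MASK 0x07fffc00u

/* Value the emulated base register keeps after a guest write. */
static uint32_t sketch_base_stored(uint32_t written)
{
    return written & SKETCH_IOMMU_BASE_MASK;
}

/* Non-zero if the device model dropped bits from the written value. */
static int sketch_base_was_altered(uint32_t written)
{
    return sketch_base_stored(written) != written;
}

/* Debug helper: show what was written and what was actually kept. */
static void sketch_report_base_write(uint32_t written)
{
    printf("base write %08x, stored %08x%s\n",
           (unsigned)written, (unsigned)sketch_base_stored(written),
           sketch_base_was_altered(written) ? " (ALTERED)" : "");
}
```

With this mask, a value such as 0x00123500 would be stored as 0x00123400, so the guest fills in one table while the device model reads another.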
I currently have a "hack" fix to OpenBIOS which involves changes to both arch/sparc32/lib.c and OFMEM, but I think I can rework it so that I can get away with just changing lib.c. I'll try and post a patch later this evening.
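The general idea behind such a fix is to hand out addresses that are already rounded up to the required alignment. The helper below is only a sketch of that rounding step, not the actual lib.c change; the name `sketch_align_up` is hypothetical and the alignment the kernel really needs is not stated here.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align.
 * align must be a non-zero power of two. */
static uint32_t sketch_align_up(uint32_t addr, uint32_t align)
{
    return (addr + align - 1u) & ~(align - 1u);
}
```

An allocator that returns `sketch_align_up(next_free, align)` and advances `next_free` past the block wastes a little memory but never returns a misaligned physical address.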
ATB,
Mark.