On 2010-12-22 7:26 AM, Andreas Färber wrote:
As a general piece of advice I would suggest turning on debugging for the CIF. I would assume that the client simply calls "claim"; if so, something might be wrong with the mapping to ofmem. Or Solaris makes specific assumptions, which Tarl might be able to answer. Maybe you need to claim specific parts of memory during initial setup so that they are exempt.
I'm sure Solaris makes assumptions about memory regions, but they aren't documented anywhere I know of. OBP allocates physical memory from top downwards, and Solaris is supposed to avoid any OBP memory allocations by respecting the /memory and /virtual-memory nodes.
If we're running into a problem with virtual allocations, Solaris does "know" that OBP lives in F000.0000 and above. It expects to allocate specific virtual addresses for itself (and 0x10.0000 does sound right for the initial kernel address), but these should also be kept from conflicting by the /virtual-memory properties. What happens if a specific virtual address that it expects to use is already occupied is an interesting question.
Looking at the OFMEM trace, the claim seems to go just fine (last claim phys was given 0x13.5000, last claim virt took 0xf025.d000); it was after map_page_range mapped that physical memory to virtual address 0xf025.d000 that we got stuck.
OFMEM: ofmem_claim phys=ffffffffffffffff size=00018000 align=00000008
OFMEM: ofmem_claim phys returned 000135000
OFMEM: ofmem_claim_virt virt=f025d000 size=00018000 align=00000000
OFMEM: ofmem_map_page_range f025d000 -> 000135000 00018000 mode 000000bc
That should be mapping f025.d000 through f027.5000 as virtual addresses of physical 13.5000 through 14.d000, which would certainly be suspicious on a piece of Sun hardware.
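For anyone who wants to check the arithmetic against the trace, here is a small sketch in Python. The helper names (dotted, mapped_range) are made up for illustration and are not part of OpenBIOS or OBP; the only inputs are the addresses and size from the OFMEM lines above, plus the F000.0000 base mentioned earlier.

```python
OBP_VIRT_BASE = 0xF0000000  # "OBP lives in F000.0000 and above"


def dotted(addr):
    """Format an address in the dotted hex notation used in this thread."""
    return f"{addr >> 16:x}.{addr & 0xFFFF:04x}"


def mapped_range(start, size):
    """Return (start, end), where end is the first address past the mapping."""
    return start, start + size


# Values copied from the ofmem_map_page_range trace line.
size = 0x18000
virt_start, virt_end = mapped_range(0xF025D000, size)
phys_start, phys_end = mapped_range(0x135000, size)

print("virt", dotted(virt_start), "through", dotted(virt_end))
print("phys", dotted(phys_start), "through", dotted(phys_end))
print("inside OBP's virtual region:", virt_start >= OBP_VIRT_BASE)
```

The end addresses are exclusive, which matches the "through" figures quoted above.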
From 0xf000.0000 through 0xf010.0000 is a copy of the OBP prom binary (starts with "OBMD"), followed by OBP expanding itself into memory. At f020.0000 (again, in our systems) is the uncompressed binary; I recall f020.0000 is the actual trap vector. On current Sun systems, trying to use f025.d000 would overwrite OBP itself, currently in the middle of the fb8 video interface code.
Looking at a real system, I don't see "unoccupied" virtual addresses until about f040.0000. If you can figure out who is coming up with the addresses f025.d000 and (worse, earlier) f008.c000, that might help figure out what's going on.
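To make the comparison concrete, here is a rough sketch that checks the two suspicious addresses against the layout described above. The table is only a transcription of that description: the boundaries are approximate, the extent of the middle "expanding itself" area is an assumption, and the names OBP_LAYOUT and classify are hypothetical.

```python
# Approximate virtual layout as described in this message (not authoritative).
# Each entry is (start, end_exclusive, description).
OBP_LAYOUT = [
    (0xF0000000, 0xF0100000, "copy of the OBP prom binary"),
    (0xF0100000, 0xF0200000, "OBP expanding itself (extent assumed)"),
    (0xF0200000, 0xF0400000, "uncompressed OBP binary"),
]


def classify(addr):
    """Return the description of the region holding addr, or None if outside."""
    for start, end, label in OBP_LAYOUT:
        if start <= addr < end:
            return label
    return None


for addr in (0xF008C000, 0xF025D000, 0xF0400000):
    label = classify(addr)
    print(f"{addr:08x}: {label or 'outside the occupied range described here'}")
```

Both f008.c000 and f025.d000 fall inside regions that a real OBP would already be using, while f040.0000 is the first address this table treats as free.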