Re: [OpenBIOS] r638 - in trunk/openbios-devel/forth: bootstrap device

5 Dec 2009

On Sat, Dec 5, 2009 at 11:22 AM, Blue Swirl blauwirbel@gmail.com wrote:
...
On Sat, Dec 5, 2009 at 2:00 AM, Igor Kovalenko
igor.v.kovalenko@gmail.com wrote:
...
On Sat, Dec 5, 2009 at 1:37 AM, Nick Couchman Nick.Couchman@seakr.com wrote:
...
On 2009/12/04 at 14:43, Igor Kovalenko igor.v.kovalenko@gmail.com wrote:
On Fri, Dec 4, 2009 at 10:28 PM, Blue Swirl blauwirbel@gmail.com wrote:
...
On Fri, Dec 4, 2009 at 7:40 PM, Mark Cave-Ayland
mark.cave-ayland@siriusit.co.uk wrote:
...
Blue Swirl wrote:
>> Yup, that's a bug :-)
>
> Enabling DEBUG_MMU (and fixing the bugs...) confirms the MMU problem:
> DMMU dump:
> [46] VA: 8000000, PA: 0,   8k, user, RW, unlocked, ctx 0
> [53] VA: 8002000, PA: 0,   8k, user, RW, unlocked, ctx 0
> [54] VA: 8004000, PA: 0,   8k, user, RW, unlocked, ctx 0
>
> The physical address is the same (0) for all three VA entries. Why?
Thanks for the independent confirmation (I'll look out for related MMU fixes
in Qemu too).
I emailed Igor off-list for some pointers, and he suggested enabling
debugging in OpenBIOS by setting CONFIG_DEBUG_OFMEM and re-compiling, which
gives the following output for the 3 memory claims for 0x8000000, 0x8002000
and 0x8004000:
OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000
align=0000000000000001
OFMEM: ofmem_map_page_range 0000000000000000 -> 0000000000000000
0000000000002000 mode 0000000000000032
OFMEM: mapping mode altered virt=0000000000000000 old mode=0000000000000036
new mode=0000000000000032
OFMEM: ofmem_claim_virt virt=ffffffffffffffff size=0000000000002000
align=0000000000000001
OFMEM: ofmem_map_page_range 0000000008000000 -> 0000000000000000
0000000000002000 mode 0000000000000032
OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000
align=0000000000000001
OFMEM: ofmem_map_page_range 0000000000002000 -> 0000000000002000
0000000000002000 mode 0000000000000032
OFMEM: mapping mode altered virt=0000000000002000 old mode=0000000000000036
new mode=0000000000000032
OFMEM: ofmem_claim_virt virt=ffffffffffffffff size=0000000000002000
align=0000000000000001
OFMEM: ofmem_map_page_range 0000000008002000 -> 0000200000000000
0000000000002000 mode 0000000000000032
OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000
align=0000000000000001
OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000
align=0000000000000001
OFMEM: ofmem_map_page_range 0000000000004000 -> 0000000000004000
0000000000002000 mode 0000000000000032
OFMEM: mapping mode altered virt=0000000000004000 old mode=0000000000000036
new mode=0000000000000032
OFMEM: ofmem_claim_virt virt=ffffffffffffffff size=0000000000002000
align=0000000000000001
OFMEM: ofmem_map_page_range 0000000008004000 -> 0000400000000000
0000000000002000 mode 0000000000000032
OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000
align=0000000000000001
With no memory option specified for Qemu, the default setting is 128MB of RAM
(which spans addresses 0 - 0x8000000). So looking at the above we can see
the following effects:
i) The base address for the virtual memory allocator is the top of physical
RAM. This doesn't seem too unreasonable.
ii) The first request allocated at 0x8000000 is mapped to physical RAM
location 0x0. This looks wrong, as normally low pages contain special
registers for various things - perhaps we need some kind of base offset
here?
iii) The second request allocated at 0x8002000 is mapped to physical RAM
location 0x200000000000. This is definitely wrong since the maximum physical
RAM location is 0x8000000.
iv) The third request allocated at 0x8004000 is mapped to physical RAM
location 0x400000000000. Again, this is definitely wrong since the maximum
physical RAM location is 0x8000000.
So this looks to me like an OpenBIOS issue related to finding physical
memory slots. I wonder if anything which is attempting to access memory over
the physical RAM limit is simply being forced to zero?
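The check behind points ii) to iv) can be sketched in a few lines of C. This is only an illustration, not OpenBIOS code: RAM_SIZE and phys_range_in_ram are hypothetical names, and the 128MB figure is the Qemu default quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 128MB of RAM starting at physical address 0. */
#define RAM_SIZE 0x8000000ULL

/* Hypothetical helper: true if [phys, phys + size) lies entirely
 * inside physical RAM. Written to avoid overflow in phys + size. */
static bool phys_range_in_ram(uint64_t phys, uint64_t size)
{
    return phys < RAM_SIZE && size <= RAM_SIZE - phys;
}
```

Applied to the three 8k mappings in the log, 0x0 passes, while 0x200000000000 and 0x400000000000 both fail, which matches the observation that only the first physical address is even plausible.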
With this patch, I get a bit further:
entry point is 0x4000
Evaluating FCode...
invalid access of +vd
Can't mount root
byte-load: exception caught!
0 >
It actually stops earlier, before mmapping is performed.
You should not drop the high part of the phys addr (it is passed as 2 cells)
but instead reverse the order in which we assemble the address from the hi/lo
pair. This gets it out of the loop, but there is no real progress beyond that:
Index: openbios-devel/arch/sparc64/lib.c

--- openbios-devel.orig/arch/sparc64/lib.c
+++ openbios-devel/arch/sparc64/lib.c
@@ -260,9 +260,8 @@ mmu_map(void)
     mode = POP();
     size = POP();
     virt = POP();
-    phys = POP();
-    phys <<= 32;
-    phys |= POP();
+    phys = (POP() & 0xffffffff);
+    phys |= (POP() & 0xffffffff) << 32;
 
     ofmem_map(phys, virt, size, mode);
 }
This still yields the following unhandled exception:
entry point is 0x4000
Evaluating FCode...
Unhandled Exception 0x0000000008000000
PC = 0x00000000ffd10e4c NPC = 0x00000000ffd10e50
Stopping execution
(gdb) l *0x00000000ffd10e4c
0xffd10e4c is in cfetch (../include/openbios/stack.h:34).
29      typedef ucell phandle_t;
30
31
32
33
34      static inline void PUSH(ucell value) {
35              dstack[++dstackcnt] = (value);
36      }
37      static inline void PUSH_xt( xt_t xt ) { PUSH( (ucell)xt ); }
38      static inline void PUSH_ih( ihandle_t ih ) { PUSH( (ucell)ih ); }
Right, the exception is still there.
However, I found the real issue: the return value of mem_claim has the
lo/hi cells of the physical address reversed.
The following patch fixes that, but r639 must be reverted. With this
change my HelenOS disk boots as it did before r639 (it obviously does not
use mem_claim, and mmu_map does appear to be OK).
Signed-off-by: igor.v.kovalenko@gmail.com
Index: openbios-devel/arch/sparc64/lib.c
--- openbios-devel.orig/arch/sparc64/lib.c
+++ openbios-devel/arch/sparc64/lib.c
@@ -376,8 +376,8 @@ mem_claim( void )
     ofmem_map(phys, phys, size, -1);
 
-    PUSH(phys >> 32);
     PUSH(phys & 0xffffffffUL);
+    PUSH(phys >> 32);
 }
 
 /* ( phys size --- ) */
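To see why the push order matters, here is a minimal stand-alone sketch. It is not the OpenBIOS source: dstack, PUSH and POP are simplified stand-ins modelled on the stack.h listing above, the push_phys_* and pop_phys_* helpers are hypothetical names, and it assumes a consumer that expects the hi cell on top of the stack.

```c
#include <stdint.h>

typedef uint64_t ucell;            /* one stack cell (assumed 64-bit) */

static ucell dstack[16];           /* toy data stack, stand-in only */
static int dstackcnt = 0;

static void PUSH(ucell value) { dstack[++dstackcnt] = value; }
static ucell POP(void)         { return dstack[dstackcnt--]; }

/* Producer that leaves the hi cell on top of the stack. */
static void push_phys_lo_then_hi(uint64_t phys)
{
    PUSH(phys & 0xffffffffULL);
    PUSH(phys >> 32);
}

/* Producer that leaves the lo cell on top of the stack. */
static void push_phys_hi_then_lo(uint64_t phys)
{
    PUSH(phys >> 32);
    PUSH(phys & 0xffffffffULL);
}

/* Consumer that assumes the hi cell is on top of the stack. */
static uint64_t pop_phys_hi_on_top(void)
{
    uint64_t phys = POP() << 32;
    phys |= POP() & 0xffffffffULL;
    return phys;
}
```

With a matching producer, 0x2000 round-trips unchanged. With the mismatched one, the consumer sees 0x2000 as the hi cell and returns 0x200000000000, and 0x4000 becomes 0x400000000000, while 0x0 is unaffected. Those are the same physical addresses that show up in the ofmem_map_page_range lines of the OFMEM log earlier in the thread.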
Shouldn't we then have something like the attached patch for consistency?
ACK. I missed the mem_release physical address decoding.
...
Anyway I don't see any benefit compared to r639 and we lose the milaX progress.
We get out of the loop compared to r638, I believe, so there is still some progress.
-- 
Kind regards,
Igor V. Kovalenko