Re: [OpenBIOS] r638 - in trunk/openbios-devel/forth: bootstrap device - OpenBIOS


Nick Couchman

3 Dec 2009

10:06 p.m.

On 2009/12/03 at 08:47, svn@openbios.org wrote:

> Author: mcayland
> Date: 2009-12-03 16:47:39 +0100 (Thu, 03 Dec 2009)
> New Revision: 638
>
> Modified:
>    trunk/openbios-devel/forth/bootstrap/bootstrap.fs
>    trunk/openbios-devel/forth/device/fcode.fs
> Log:
> Fix backwards Fcode branches (bbranch and b?branch).
>
> According to the specification, the destination for a backwards Fcode branch must be resolved from the bottom rather than the top of the cstack. The existing version of the code was simply doing a swap, and so nesting any branches within a backward branch would fail since the wrong destination would be resolved from the stack.
>
> This patch adds a new cstack-startdepth variable to keep track of the cstack base location within an execution context (setup-tmp-comp and execute-tmp-comp) and alters the backward branches to make use of it.
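As a rough illustration of the difference the log describes (this is not the Forth from bootstrap.fs or fcode.fs), the C sketch below models a control stack as an array. The type `cstack_t` and the helpers `cstack_push`, `dest_via_swap` and `dest_from_bottom` are hypothetical names chosen for this example; only the idea of recording a start depth and resolving from the base comes from the log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical control stack; not the OpenBIOS data structure. */
typedef struct {
    unsigned long slots[16];
    size_t depth;       /* number of entries currently on the stack */
    size_t startdepth;  /* depth recorded when the compile context began */
} cstack_t;

static void cstack_push(cstack_t *cs, unsigned long v)
{
    assert(cs->depth < 16);
    cs->slots[cs->depth++] = v;
}

/* Old behaviour as described in the log: swap, then use the new top,
 * which is the entry that was just below the top of the stack. */
static unsigned long dest_via_swap(const cstack_t *cs)
{
    assert(cs->depth >= cs->startdepth + 2);
    return cs->slots[cs->depth - 2];
}

/* New behaviour as described in the log: use the entry at the base
 * of the current context, however many entries sit above it. */
static unsigned long dest_from_bottom(const cstack_t *cs)
{
    assert(cs->depth > cs->startdepth);
    return cs->slots[cs->startdepth];
}
```

With two entries on the stack both lookups return the same destination. Once a nested branch leaves a third entry above it, `dest_via_swap` returns the nested entry while `dest_from_bottom` still returns the original destination.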

With this patch in place, Milax under Qemu doesn't crash anymore but sits in an infinite loop reading sectors from the CDROM.

The Solaris Nevada code still gets the unhandled exception in cfetch.

-Nick


Mark Cave-Ayland

4 Dec

12:10 a.m.

Nick Couchman wrote:

> [...]
> The Solaris Nevada code still gets the unhandled exception in cfetch.
>
> -Nick

Continuing to work on Milax, I've found out that the infinite loop bug I'm getting here is because of an MMU problem - two different virtual addresses appear to be mapped to the same physical address. If the memory management is broken then all sorts of random things are likely to break based upon their address, which is why I'm not surprised you're still seeing some failures. It's probably going to take a bit of time to debug :(

ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs

Tarl Neustaedter

12:41 a.m.

Mark Cave-Ayland wrote:

> [...] Continuing to work on Milax, I've found out that the infinite loop bug I'm getting here is because of an MMU problem - two different virtual addresses appear to be mapped to the same physical address.

I believe that's legal. I know that OBP used to do that on sun4s systems.

Mark Cave-Ayland

1:46 a.m.

Tarl Neustaedter wrote:

>> Continuing to work on Milax, I've found out that the infinite loop bug I'm getting here is because of an MMU problem - two different virtual addresses appear to be mapped to the same physical address.
>
> I believe that's legal. I know that OBP used to do that on sun4s systems.

Oh that's interesting to know; however in this case I'm fairly confident it's broken :( What happens is that first the volume descriptor is read from the ISO image into a special buffer, and subsequent reads for file entries should then go into a different buffer. Unfortunately the subsequent reads into the second buffer seem to be mapped to the same memory location as the volume descriptor, and so they overwrite the volume descriptor causing things to get stuck in a tight loop because of incorrect descriptor values :(

I've actually verified this by changing a value in the second buffer and seeing the same change in the volume descriptor buffer...
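The check described above (write through one buffer, look for the change through the other) can be sketched in C roughly as follows. The helper name `buffers_alias` is made up for this example, and on a host system passing the same pointer twice merely stands in for two virtual addresses that share one physical page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if a write through `b` is visible through `a`.
 * The byte written is restored before returning. */
static bool buffers_alias(volatile uint8_t *a, volatile uint8_t *b)
{
    assert(a != NULL && b != NULL);

    uint8_t saved  = *b;
    uint8_t before = *a;

    *b = (uint8_t)~saved;            /* flip every bit so the value must change */
    bool aliased = (*a != before);   /* did the other view change as well? */
    *b = saved;                      /* undo the probe write */

    return aliased;
}
```

A result of `true` for two buffers that are supposed to be independent is the symptom reported here: reads into the second buffer overwrite the volume descriptor.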

ATB,

Mark.

Tarl Neustaedter

1:58 a.m.

Mark Cave-Ayland wrote:

> [...]
> Oh that's interesting to know; however in this case I'm fairly confident it's broken :( What happens is that first the volume descriptor is read from the ISO image into a special buffer, and subsequent reads for file entries should then go into a different buffer. Unfortunately the subsequent reads into the second buffer seem to be mapped to the same memory location

Ah. So the problem isn't that you have two virts->one phys, but that you expect your two virts to have different physical backing, and they don't.

Yup, that's a bug :-)

Blue Swirl

5:14 p.m.

On Fri, Dec 4, 2009 at 2:58 AM, Tarl Neustaedter Tarl.Neustaedter@sun.com wrote:

> [...]
> Ah. So the problem isn't that you have two virts->one phys, but that you expect your two virts to have different physical backing, and they don't.
>
> Yup, that's a bug :-)

Enabling DEBUG_MMU (and fixing the bugs...) confirms the MMU problem:

DMMU dump:
[46] VA: 8000000, PA: 0, 8k, user, RW, unlocked, ctx 0
[53] VA: 8002000, PA: 0, 8k, user, RW, unlocked, ctx 0
[54] VA: 8004000, PA: 0, 8k, user, RW, unlocked, ctx 0

The physical address is the same (0) for all three VA entries. Why?
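A dump like this can be scanned mechanically for distinct virtual pages that share a physical page. The C sketch below is only an illustration: `tlb_entry_t` and `count_shared_pa` are hypothetical names, not OpenBIOS or Qemu structures. Since sharing a physical page can be legitimate, a non-zero count marks candidates to inspect rather than proven bugs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one translation entry. */
typedef struct {
    uint64_t va;  /* virtual address of the page */
    uint64_t pa;  /* physical address it maps to */
} tlb_entry_t;

/* Counts pairs of entries that have different VAs but the same PA. */
static size_t count_shared_pa(const tlb_entry_t *e, size_t n)
{
    assert(e != NULL || n == 0);

    size_t pairs = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (e[i].pa == e[j].pa && e[i].va != e[j].va)
                pairs++;
        }
    }
    return pairs;
}
```

Fed the three entries from the dump (VA 8000000, 8002000 and 8004000, all with PA 0), the function reports three sharing pairs.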

Mark Cave-Ayland

6:40 p.m.

Blue Swirl wrote:

>> [...]
>> Yup, that's a bug :-)
>
> Enabling DEBUG_MMU (and fixing the bugs...) confirms the MMU problem:
>
> DMMU dump:
> [46] VA: 8000000, PA: 0, 8k, user, RW, unlocked, ctx 0
> [53] VA: 8002000, PA: 0, 8k, user, RW, unlocked, ctx 0
> [54] VA: 8004000, PA: 0, 8k, user, RW, unlocked, ctx 0
>
> The physical address is the same (0) for all three VA entries. Why?

Thanks for the independent confirmation (I'll look out for related MMU fixes in Qemu too).

I emailed Igor off-list for some pointers, and he suggested enabling debugging in OpenBIOS by setting CONFIG_DEBUG_OFMEM and re-compiling, which gives the following output for the 3 memory claims for 0x8000000, 0x8002000 and 0x8004000:

OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000 align=0000000000000001
OFMEM: ofmem_map_page_range 0000000000000000 -> 0000000000000000 0000000000002000 mode 0000000000000032
OFMEM: mapping mode altered virt=0000000000000000 old mode=0000000000000036 new mode=0000000000000032
OFMEM: ofmem_claim_virt virt=ffffffffffffffff size=0000000000002000 align=0000000000000001
OFMEM: ofmem_map_page_range 0000000010000000 -> 0000000000000000 0000000000002000 mode 0000000000000032

OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000 align=0000000000000001
OFMEM: ofmem_map_page_range 0000000000002000 -> 0000000000002000 0000000000002000 mode 0000000000000032
OFMEM: mapping mode altered virt=0000000000002000 old mode=0000000000000036 new mode=0000000000000032
OFMEM: ofmem_claim_virt virt=ffffffffffffffff size=0000000000002000 align=0000000000000001
OFMEM: ofmem_map_page_range 0000000008002000 -> 0000200000000000 0000000000002000 mode 0000000000000032
OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000 align=0000000000000001

OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000 align=0000000000000001
OFMEM: ofmem_map_page_range 0000000000004000 -> 0000000000004000 0000000000002000 mode 0000000000000032
OFMEM: mapping mode altered virt=0000000000004000 old mode=0000000000000036 new mode=0000000000000032
OFMEM: ofmem_claim_virt virt=ffffffffffffffff size=0000000000002000 align=0000000000000001
OFMEM: ofmem_map_page_range 0000000008004000 -> 0000400000000000 0000000000002000 mode 0000000000000032
OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000 align=0000000000000001

With no memory option specified for Qemu, the default setting is 128 MB of RAM (which spans addresses 0 - 0x8000000). So looking at the above we can see the following effects:

i) The base address for the virtual memory allocator is the top of physical RAM. This doesn't seem too unreasonable.

ii) The first request allocated at 0x8000000 is mapped to physical RAM location 0x0. This looks wrong, as normally low pages contain special registers for various things - perhaps we need some kind of base offset here?

iii) The second request allocated at 0x8002000 is mapped to physical RAM location 0x200000000000. This is definitely wrong since the maximum physical RAM location is 0x8000000.

iv) The third request allocated at 0x8004000 is mapped to physical RAM location 0x400000000000. Again, this is definitely wrong since the maximum physical RAM location is 0x8000000.

So this looks to me like an OpenBIOS issue related to finding physical memory slots. I wonder if anything which is attempting to access memory over the physical RAM limit is simply being forced to zero?
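The range argument in points iii) and iv) can be written down as a small check. This is a sketch under the assumptions stated in the message (128 MB of RAM starting at physical address 0); `RAM_SIZE` and `phys_range_in_ram` are names invented for the example and do not exist in OpenBIOS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 128 MB, the Qemu default mentioned above; RAM is assumed to start at 0. */
#define RAM_SIZE UINT64_C(0x8000000)

/* True if [phys, phys + size) lies entirely inside [0, ram_size). */
static bool phys_range_in_ram(uint64_t phys, uint64_t size, uint64_t ram_size)
{
    assert(size > 0);
    /* Written without phys + size so the test cannot overflow. */
    return phys < ram_size && size <= ram_size - phys;
}
```

Under these assumptions the first mapping target (0x0, size 0x2000) passes, while 0x200000000000 and 0x400000000000 from the second and third requests fail.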

ATB,

Mark.

Blue Swirl

8:28 p.m.

On Fri, Dec 4, 2009 at 7:40 PM, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:

> [...]
> So this looks to me like an OpenBIOS issue related to finding physical memory slots. I wonder if anything which is attempting to access memory over the physical RAM limit is simply being forced to zero?

With this patch, I get a bit further:

entry point is 0x4000
Evaluating FCode... invalid access of +vd

Can't mount root

byte-load: exception caught!

0 >

Igor Kovalenko

10:43 p.m.

On Fri, Dec 4, 2009 at 10:28 PM, Blue Swirl blauwirbel@gmail.com wrote:

> [...]
> With this patch, I get a bit further:
>
> entry point is 0x4000
> Evaluating FCode... invalid access of +vd
> Can't mount root
> byte-load: exception caught!
> 0 >

It actually stops earlier, before mmapping is performed. You should not drop high part of phys addr (because it is passed as 2 cells) but instead reverse the order in which we assemble address from hi/lo pair.

This makes it out of loop, but still no real progress further than that:

Index: openbios-devel/arch/sparc64/lib.c
===================================================================
--- openbios-devel.orig/arch/sparc64/lib.c
+++ openbios-devel/arch/sparc64/lib.c
@@ -260,9 +260,8 @@ mmu_map(void)
     mode = POP();
     size = POP();
     virt = POP();
-    phys = POP();
-    phys <<= 32;
-    phys |= POP();
+    phys = (POP() & 0xffffffff);
+    phys |= (POP() & 0xffffffff) << 32;
 
     ofmem_map(phys, virt, size, mode);
 }

--
Kind regards,
Igor V. Kovalenko

Nick Couchman

11:37 p.m.

On 2009/12/04 at 14:43, Igor Kovalenko igor.v.kovalenko@gmail.com wrote:

> [...]
> This makes it out of loop, but still no real progress further than that:
>
> Index: openbios-devel/arch/sparc64/lib.c
> [...]

This still yields the following unhandled exception:

entry point is 0x4000
Evaluating FCode...
Unhandled Exception 0x0000000008000000
PC = 0x00000000ffd10e4c NPC = 0x00000000ffd10e50
Stopping execution

(gdb) l *0x00000000ffd10e4c
0xffd10e4c is in cfetch (../include/openbios/stack.h:34).
29      typedef ucell phandle_t;
30
31
32
33
34      static inline void PUSH(ucell value) {
35          dstack[++dstackcnt] = (value);
36      }
37      static inline void PUSH_xt( xt_t xt ) { PUSH( (ucell)xt ); }
38      static inline void PUSH_ih( ihandle_t ih ) { PUSH( (ucell)ih ); }

-Nick

Igor Kovalenko

5 Dec

1 a.m.

On Sat, Dec 5, 2009 at 1:37 AM, Nick Couchman Nick.Couchman@seakr.com wrote:

> [...]
> This still yields the following unhandled exception:
>
> entry point is 0x4000
> Evaluating FCode...
> Unhandled Exception 0x0000000008000000
> PC = 0x00000000ffd10e4c NPC = 0x00000000ffd10e50
> Stopping execution
>
> [...]

Right, the exception is still there. I did find the real issue, though: the return value of mem_claim has the lo/hi cells of the physical address reversed. The following patch fixes that, but r639 must be reverted. With this change my HelenOS disk boots as it did before r639 (it obviously does not use mem_claim, and mmu_map appears to be OK after all).

Signed-off-by: igor.v.kovalenko@gmail.com

     ofmem_map(phys, phys, size, -1);
 
-    PUSH(phys >> 32);
     PUSH(phys & 0xffffffffUL);
+    PUSH(phys >> 32);
 }
 
 /* ( phys size --- ) */
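To see why the push order matters, here is a small self-contained C model of the exchange. It is a sketch, not OpenBIOS code: `PUSH` mirrors the `dstack[++dstackcnt]` form from the stack.h listing quoted earlier in the thread, while `POP`, `push_phys_old`, `push_phys_new` and `pop_phys` are assumptions made for this example. `pop_phys` assembles the address the way the pre-patch mmu_map() lines did, taking the top of the stack as the high cell.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t ucell;            /* 64-bit cell, as assumed for sparc64 */

static ucell dstack[16];
static int dstackcnt = 0;

static void PUSH(ucell v) { assert(dstackcnt < 15); dstack[++dstackcnt] = v; }
static ucell POP(void)    { assert(dstackcnt > 0);  return dstack[dstackcnt--]; }

/* Order before the patch: high cell first, so the low cell ends up on top. */
static void push_phys_old(uint64_t phys)
{
    PUSH(phys >> 32);
    PUSH(phys & 0xffffffffUL);
}

/* Order after the patch: low cell first, high cell on top. */
static void push_phys_new(uint64_t phys)
{
    PUSH(phys & 0xffffffffUL);
    PUSH(phys >> 32);
}

/* Consumer that treats the top of the stack as the high cell. */
static uint64_t pop_phys(void)
{
    uint64_t phys = POP();
    phys <<= 32;
    phys |= POP();
    return phys;
}
```

In this model, pushing 0x2000 in the old order and reading it back yields 0x200000000000, and 0x4000 yields 0x400000000000, which are the values seen in the ofmem_map_page_range lines of the OFMEM log. With the new order the address round-trips unchanged.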

--
Kind regards,
Igor V. Kovalenko

Blue Swirl

9:22 a.m.

New subject: r638 - in trunk/openbios-devel/forth: bootstrap device

On Sat, Dec 5, 2009 at 2:00 AM, Igor Kovalenko igor.v.kovalenko@gmail.com wrote:

...

On Sat, Dec 5, 2009 at 1:37 AM, Nick Couchman Nick.Couchman@seakr.com wrote:

...
...
...
...
On 2009/12/04 at 14:43, Igor Kovalenko igor.v.kovalenko@gmail.com wrote:

On Fri, Dec 4, 2009 at 10:28 PM, Blue Swirl blauwirbel@gmail.com wrote:

...
On Fri, Dec 4, 2009 at 7:40 PM, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:

...
Blue Swirl wrote:

...
> Yup, that's a bug :-)

Enabling DEBUG_MMU (and fixing the bugs...) confirms the MMU problem: DMMU dump: [46] VA: 8000000, PA: 0, 8k, user, RW, unlocked, ctx 0 [53] VA: 8002000, PA: 0, 8k, user, RW, unlocked, ctx 0 [54] VA: 8004000, PA: 0, 8k, user, RW, unlocked, ctx 0

The physical address is the same (0) for all three VA entries. Why?

Thanks for the independent confirmation (I'll look out for related MMU fixes in Qemu too).

I emailed Igor off-list for some pointers, and he suggested enabling debugging in OpenBIOS by setting CONFIG_DEBUG_OFMEM and re-compiling, which gives the following output for the 3 memory claims for 0x8000000, 0x8002000 and 0x8004000:

OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000 align=0000000000000001 OFMEM: ofmem_map_page_range 0000000000000000 -> 0000000000000000 0000000000002000 mode 0000000000000032 OFMEM: mapping mode altered virt=0000000000000000 old mode=0000000000000036 new mode=0000000000000032 OFMEM: ofmem_claim_virt virt=ffffffffffffffff size=0000000000002000 align=0000000000000001 OFMEM: ofmem_map_page_range 0000000010000000 -> 0000000000000000 0000000000002000 mode 0000000000000032

OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000 align=0000000000000001 OFMEM: ofmem_map_page_range 0000000000002000 -> 0000000000002000 0000000000002000 mode 0000000000000032 OFMEM: mapping mode altered virt=0000000000002000 old mode=0000000000000036 new mode=0000000000000032 OFMEM: ofmem_claim_virt virt=ffffffffffffffff size=0000000000002000 align=0000000000000001 OFMEM: ofmem_map_page_range 0000000008002000 -> 0000200000000000 0000000000002000 mode 0000000000000032 OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000 align=0000000000000001

OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000 align=0000000000000001 OFMEM: ofmem_map_page_range 0000000000004000 -> 0000000000004000 0000000000002000 mode 0000000000000032 OFMEM: mapping mode altered virt=0000000000004000 old mode=0000000000000036 new mode=0000000000000032 OFMEM: ofmem_claim_virt virt=ffffffffffffffff size=0000000000002000 align=0000000000000001 OFMEM: ofmem_map_page_range 0000000008004000 -> 0000400000000000 0000000000002000 mode 0000000000000032 OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000002000 align=0000000000000001

With no memory option specified for Qemu, the default setting is 128MB RAM (which spans addresses 0 - 0x8000000). So looking at the above we can see the following effects:

i) The base address for the virtual memory allocator is the top of physical RAM. This doesn't seem too unreasonable.

ii) The first request allocated at 0x8000000 is mapped to physical RAM location 0x0. This looks wrong, as normally low pages contain special registers for various things - perhaps we need some kind of base offset here?

iii) The second request allocated at 0x8002000 is mapped to physical RAM location 0x200000000000. This is definitely wrong since the maximum physical RAM location is 0x8000000.

iv) The third request allocated at 0x8004000 is mapped to physical RAM location 0x400000000000. Again, this is definitely wrong since the maximum physical RAM location is 0x8000000.

So this looks to me like an OpenBIOS issue related to finding physical memory slots. I wonder if anything which is attempting to access memory over the physical RAM limit is simply being forced to zero?
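The range check behind points iii) and iv) can be sketched in C. This is only an illustration: `ram_bytes_from_mib` and `phys_in_ram` are hypothetical helpers, not OpenBIOS functions, and the sketch assumes RAM starts at physical address 0 with the 128MB default mentioned above.

```c
#include <stdint.h>

/* Hypothetical helper: convert a RAM size given in MiB to bytes. */
static uint64_t ram_bytes_from_mib(uint64_t mib)
{
    return mib << 20;
}

/* Hypothetical helper: nonzero if phys lies inside RAM that starts at 0. */
static int phys_in_ram(uint64_t phys, uint64_t ram_bytes)
{
    return phys < ram_bytes;
}
```

With 128MB, `ram_bytes_from_mib(128)` is 0x8000000, so 0x0 passes the check while 0x200000000000 and 0x400000000000 do not.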

With this patch, I get a bit further:

entry point is 0x4000
Evaluating FCode...
invalid access of +vd
Can't mount root
byte-load: exception caught!
0 >

It actually stops earlier, before the mapping is performed. You should not drop the high part of the physical address (it is passed as 2 cells); instead, reverse the order in which we assemble the address from the hi/lo pair.

This gets it out of the loop, but there is still no real progress beyond that:

Index: openbios-devel/arch/sparc64/lib.c

--- openbios-devel.orig/arch/sparc64/lib.c
+++ openbios-devel/arch/sparc64/lib.c
@@ -260,9 +260,8 @@ mmu_map(void)
     mode = POP();
     size = POP();
     virt = POP();
-    phys = POP();
-    phys <<= 32;
-    phys |= POP();
+    phys = (POP() & 0xffffffff);
+    phys |= (POP() & 0xffffffff) << 32;

     ofmem_map(phys, virt, size, mode);
 }
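What the reordering in this hunk changes can be illustrated with a toy model in C. Everything here is an assumption made for the sketch: the `toy_*` stack stands in for the real data stack, the function names are hypothetical, and the caller is assumed to leave the low 32 bits on top of the stack with the high 32 bits underneath.

```c
#include <stdint.h>

typedef uint64_t ucell;             /* 64-bit cell, as on sparc64 */

/* Toy data stack (hypothetical, not the OpenBIOS dstack). */
static ucell toy_stack[16];
static int toy_depth = 0;

static void toy_push(ucell v) { toy_stack[toy_depth++] = v; }
static ucell toy_pop(void)    { return toy_stack[--toy_depth]; }

/* Order from the '+' lines: top of stack is the low half. */
static uint64_t pop_phys_lo_first(void)
{
    uint64_t phys;

    phys  = toy_pop() & 0xffffffffULL;          /* low 32 bits */
    phys |= (toy_pop() & 0xffffffffULL) << 32;  /* high 32 bits */
    return phys;
}

/* Order from the '-' lines: top of stack is the high half. */
static uint64_t pop_phys_hi_first(void)
{
    uint64_t phys;

    phys = toy_pop();
    phys <<= 32;
    phys |= toy_pop();
    return phys;
}

/* Push hi then lo (lo ends up on top), then decode with either order. */
static uint64_t decode(uint64_t hi, uint64_t lo, int lo_first)
{
    toy_push(hi);
    toy_push(lo);
    return lo_first ? pop_phys_lo_first() : pop_phys_hi_first();
}
```

Under these assumptions, `decode(0, 0x2000, 1)` gives 0x2000, while `decode(0, 0x2000, 0)` gives 0x200000000000, the same shape as the bogus physical address in the OFMEM output. Which cell the client really leaves on top is not something this sketch can confirm.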

This still yields the following unhandled exception:

entry point is 0x4000
Evaluating FCode...
Unhandled Exception 0x0000000008000000
PC = 0x00000000ffd10e4c NPC = 0x00000000ffd10e50
Stopping execution

(gdb) l *0x00000000ffd10e4c
0xffd10e4c is in cfetch (../include/openbios/stack.h:34).
29      typedef ucell phandle_t;
30
31
32
33
34      static inline void PUSH(ucell value) {
35          dstack[++dstackcnt] = (value);
36      }
37      static inline void PUSH_xt( xt_t xt ) { PUSH( (ucell)xt ); }
38      static inline void PUSH_ih( ihandle_t ih ) { PUSH( (ucell)ih ); }

Right, the exception is still there. However, I found the real issue: the return value of mem_claim has the lo/hi cells of the physical address reversed. The following patch fixes that, but r639 must be reverted. With this change my HelenOS disk boots as it did before r639 (it obviously does not use mem_claim, and mmu_map does appear to be OK).

Signed-off-by: igor.v.kovalenko@gmail.com

Index: openbios-devel/arch/sparc64/lib.c

--- openbios-devel.orig/arch/sparc64/lib.c
+++ openbios-devel/arch/sparc64/lib.c
@@ -376,8 +376,8 @@ mem_claim( void )

     ofmem_map(phys, phys, size, -1);

-    PUSH(phys >> 32);
     PUSH(phys & 0xffffffffUL);
+    PUSH(phys >> 32);
 }

 /* ( phys size --- ) */
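The effect of swapping the two PUSH calls can be modelled as a round trip in C. This is a sketch under stated assumptions, not OpenBIOS code: the `toy_*` stack and all function names are hypothetical, and the consumer is assumed to treat the top of the stack as the high half of the address.

```c
#include <stdint.h>

typedef uint64_t ucell;             /* 64-bit cell, as on sparc64 */

/* Toy data stack (hypothetical, not the OpenBIOS dstack). */
static ucell toy_stack[16];
static int toy_depth = 0;

static void toy_push(ucell v) { toy_stack[toy_depth++] = v; }
static ucell toy_pop(void)    { return toy_stack[--toy_depth]; }

/* Patched order: low cell first, high cell last (high ends up on top). */
static void push_phys_patched(uint64_t phys)
{
    toy_push(phys & 0xffffffffULL);
    toy_push(phys >> 32);
}

/* Pre-patch order: high cell first, low cell last (low ends up on top). */
static void push_phys_old(uint64_t phys)
{
    toy_push(phys >> 32);
    toy_push(phys & 0xffffffffULL);
}

/* Assumed consumer: the top of the stack is taken as the high half. */
static uint64_t pop_phys_hi_on_top(void)
{
    uint64_t phys;

    phys = toy_pop();
    phys <<= 32;
    phys |= toy_pop();
    return phys;
}

static uint64_t round_trip_patched(uint64_t phys)
{
    push_phys_patched(phys);
    return pop_phys_hi_on_top();
}

static uint64_t round_trip_old(uint64_t phys)
{
    push_phys_old(phys);
    return pop_phys_hi_on_top();
}
```

In this model `round_trip_patched(0x2000)` returns 0x2000 unchanged, whereas `round_trip_old(0x2000)` returns 0x200000000000, because the low cell lands in the upper 32 bits.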

Shouldn't we then have something like the attached patch for consistency?

Anyway I don't see any benefit compared to r639 and we lose the milaX progress.

Igor Kovalenko

10:45 a.m.

New subject: r638 - in trunk/openbios-devel/forth: bootstrap device

On Sat, Dec 5, 2009 at 11:22 AM, Blue Swirl blauwirbel@gmail.com wrote:

...

On Sat, Dec 5, 2009 at 2:00 AM, Igor Kovalenko igor.v.kovalenko@gmail.com wrote:

...
On Sat, Dec 5, 2009 at 1:37 AM, Nick Couchman Nick.Couchman@seakr.com wrote:

On 2009/12/04 at 14:43, Igor Kovalenko igor.v.kovalenko@gmail.com wrote:

On Fri, Dec 4, 2009 at 10:28 PM, Blue Swirl blauwirbel@gmail.com wrote:

...
On Fri, Dec 4, 2009 at 7:40 PM, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:

...
Blue Swirl wrote:

>> Yup, that's a bug :-)
>
> Enabling DEBUG_MMU (and fixing the bugs...) confirms the MMU problem:
> DMMU dump:
> [46] VA: 8000000, PA: 0, 8k, user, RW, unlocked, ctx 0
> [53] VA: 8002000, PA: 0, 8k, user, RW, unlocked, ctx 0
> [54] VA: 8004000, PA: 0, 8k, user, RW, unlocked, ctx 0
>
> The physical address is the same (0) for all three VA entries. Why?


Shouldn't we then have something like the attached patch for consistency?

ACK. I missed mem_release physical address decoding.

...

Anyway I don't see any benefit compared to r639 and we lose the milaX progress.

I believe we get out of the loop compared to r638, so there is still progress.

-- Kind regards, Igor V. Kovalenko

Blue Swirl

11:21 a.m.

New subject: r638 - in trunk/openbios-devel/forth: bootstrap device

On Sat, Dec 5, 2009 at 11:45 AM, Igor Kovalenko igor.v.kovalenko@gmail.com wrote:

...

On Sat, Dec 5, 2009 at 11:22 AM, Blue Swirl blauwirbel@gmail.com wrote:

...
On Sat, Dec 5, 2009 at 2:00 AM, Igor Kovalenko igor.v.kovalenko@gmail.com wrote:

...
On Sat, Dec 5, 2009 at 1:37 AM, Nick Couchman Nick.Couchman@seakr.com wrote:

On 2009/12/04 at 14:43, Igor Kovalenko igor.v.kovalenko@gmail.com wrote:

On Fri, Dec 4, 2009 at 10:28 PM, Blue Swirl blauwirbel@gmail.com wrote:

...

Shouldn't we then have something like the attached patch for consistency?

ACK. I missed mem_release physical address decoding.

...
Anyway I don't see any benefit compared to r639 and we lose the milaX progress.

I believe we get out of the loop compared to r638, so there is still progress.

OK, I applied the patch then.

Tarl Neustaedter

4 Dec

8:33 p.m.

New subject: r638 - in trunk/openbios-devel/forth: bootstrap device

Mark Cave-Ayland wrote:

...

[...] ii) The first request allocated at 0x8000000 is mapped to physical RAM location 0x0. This looks wrong, as normally low pages contain special registers for various things - perhaps we need some kind of base offset here?

Not on SPARC. Memory zero is fine.

...

iii) The second request allocated at 0x8002000 is mapped to physical RAM location 0x200000000000. This is definitely wrong since the maximum physical RAM location is 0x8000000.

iv) The third request allocated at 0x8004000 is mapped to physical RAM location 0x400000000000. Again, this is definitely wrong since the maximum physical RAM location is 0x8000000.

Ahhh. Methinks you have a 32-64 bit problem. The addresses are correct if you look at the upper 32 bits.
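The observation about the upper 32 bits is easy to check with a couple of shifts. A minimal C sketch, with `upper32` and `lower32` as hypothetical helper names:

```c
#include <stdint.h>

/* Hypothetical helper: upper 32 bits of a 64-bit value. */
static uint32_t upper32(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

/* Hypothetical helper: lower 32 bits of a 64-bit value. */
static uint32_t lower32(uint64_t v)
{
    return (uint32_t)(v & 0xffffffffULL);
}
```

For the two addresses from the log, `upper32(0x200000000000)` is 0x2000 and `upper32(0x400000000000)` is 0x4000, with the lower halves both zero. Those are the physical pages one would expect for the second and third claims, only shifted into the wrong half of the 64-bit value.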

Nick Couchman

1:15 a.m.

New subject: r638 - in trunk/openbios-devel/forth: bootstrap device

On 2009/12/03 at 16:10, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:

...

Nick Couchman wrote:

...
The Solaris Nevada code still gets the unhandled exception in cfetch.

-Nick

Continuing to work on Milax, I've found out that the infinite loop bug I'm getting here is because of an MMU problem - two different virtual addresses appear to be mapped to the same physical address. If the memory management is broken then all sorts of random things are likely to break based upon their address, which is why I'm not surprised you're still seeing some failures. It's probably going to take a bit of time to debug :(

Anything I can do to help? Any output that would be helpful from my setup?

-Nick

Mark Cave-Ayland

1:49 a.m.

New subject: r638 - in trunk/openbios-devel/forth: bootstrap device

Nick Couchman wrote:

...

Anything I can do to help? Any output that would be helpful from my setup?

-Nick

Not that I can think of at the moment - the good thing about virtualisation is that everyone is using the same machine and so getting the debug information is not particularly hard. Fortunately I can see what the problem is, it's just finding out how to fix it...

ATB,

Mark.


openbios@openbios.org


participants (5)

Blue Swirl
Igor Kovalenko
Mark Cave-Ayland
Nick Couchman
Tarl Neustaedter