Author: mcayland Date: Sat Apr 10 13:08:57 2010 New Revision: 739 URL: http://tracker.coreboot.org/trac/openbios/changeset/739
Log: Don't try and update the MMU mappings when claiming physical memory. This is probably related to r737 and seems to solve an issue whereby Forth starts doing strange things just after attemping a claim on physical memory resource.
Signed-off-by: Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk
Modified: trunk/openbios-devel/libopenbios/ofmem_common.c
Modified: trunk/openbios-devel/libopenbios/ofmem_common.c ============================================================================== --- trunk/openbios-devel/libopenbios/ofmem_common.c Thu Apr 8 23:31:44 2010 (r738) +++ trunk/openbios-devel/libopenbios/ofmem_common.c Sat Apr 10 13:08:57 2010 (r739) @@ -401,8 +401,6 @@ } add_entry( phys, size, &ofmem->phys_range );
- ofmem_update_translations(); - return phys; }
On Sat, Apr 10, 2010 at 3:08 PM, repository service svn@openbios.org wrote:
Author: mcayland Date: Sat Apr 10 13:08:57 2010 New Revision: 739 URL: http://tracker.coreboot.org/trac/openbios/changeset/739
Log: Don't try and update the MMU mappings when claiming physical memory. This is probably related to r737 and seems to solve an issue whereby Forth starts doing strange things just after attemping a claim on physical memory resource.
Signed-off-by: Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk
Modified: trunk/openbios-devel/libopenbios/ofmem_common.c
Modified: trunk/openbios-devel/libopenbios/ofmem_common.c
--- trunk/openbios-devel/libopenbios/ofmem_common.c Thu Apr 8 23:31:44 2010 (r738) +++ trunk/openbios-devel/libopenbios/ofmem_common.c Sat Apr 10 13:08:57 2010 (r739) @@ -401,8 +401,6 @@ } add_entry( phys, size, &ofmem->phys_range );
- ofmem_update_translations();
return phys; }
This ofmem_update_translations call is supposed to keep available physical memory ranges up-to-date. There are no other places where we do update available phys ranges. Virtual ranges and MMU translations we update at the same time; since virtual allocations and MMU translations are not affected at this call site the resulting lists of virtual ranges and MMU translations are not changed.
The only thing which may change here is actual placement of "translations" property value, so if you keep pointer to previous location it may be stale after update call.
What is the issue you are solving here?
Igor Kovalenko wrote:
This ofmem_update_translations call is supposed to keep available physical memory ranges up-to-date. There are no other places where we do update available phys ranges. Virtual ranges and MMU translations we update at the same time; since virtual allocations and MMU translations are not affected at this call site the resulting lists of virtual ranges and MMU translations are not changed.
The only thing which may change here is actual placement of "translations" property value, so if you keep pointer to previous location it may be stale after update call.
What is the issue you are solving here?
It's an issue with Milax boot. Without r739 applied I get the following when stepping through the OpenSolaris kernel boot:
of_client_interface: call-method 10bb110 ffe483d8 0 8000 4c280000 call-method claim ([5] -- [2])
: call-method ( 4c280000 8000 0 ffe483d8 10bb110 ) 00000000ffe278f0: dup 5 > resume ok ( 4c280000 8000 0 ffe483d8 10bb110 ) 00000000ffe278f0: dup 5 > resume ok ( 4c280000 8000 0 ffe483d8 10bb110 ) 00000000ffe278f0: dup ( 4c280000 8000 0 ffe483d8 10bb110 10bb110 ) 00000000ffe278f8: 0= ( 4c280000 8000 0 ffe483d8 10bb110 0 ) 00000000ffe27900: do?branch ( 4c280000 8000 0 ffe483d8 10bb110 ) 00000000ffe27950: dup ( 4c280000 8000 0 ffe483d8 10bb110 10bb110 ) 00000000ffe27958: >r ( 4c280000 8000 0 ffe483d8 10bb110 ) 00000000ffe27960: dup ( 4c280000 8000 0 ffe483d8 10bb110 10bb110 ) 00000000ffe27968: cstrlen ( 4c280000 8000 0 ffe483d8 10bb110 5 ) 00000000ffe27970: rot ( 4c280000 8000 0 10bb110 5 ffe483d8 ) 00000000ffe27978: ?ihandle ( 4c280000 8000 0 10bb110 5 ffe483d8 ) 00000000ffe27980: (lit) ( 4c280000 8000 0 10bb110 5 ffe483d8 ffe13af0 ) 00000000ffe27990: catch OFMEM: ofmem_claim_virt virt=000000004c280000 size=0000000000008000 align=0000000000000000 ( 4c280000 0 ) 00000000ffe27998: dup ( 4c280000 0 0 ) 00000000ffe279a0: do?branch ( 4c280000 0 ) 00000000ffe27a38: r> ( 4c280000 0 10bb110 ) 00000000ffe27a40: drop ( 4c280000 0 ) 00000000ffe27a48: (semis) [ Finished call-method ] handle_calls return: 0 4c280000 of_client_interface: call-method 10bb0a8 ffe48e58 8 8000 call-method claim ([4] -- [3])
: call-method ( 8000 8 ffe48e58 10bb0a8 ) 00000000ffe278f0: dup 4 > resume ok ( 8000 8 ffe48e58 10bb0a8 ) 00000000ffe278f0: dup ( 8000 8 ffe48e58 10bb0a8 10bb0a8 ) 00000000ffe278f8: 0= ( 8000 8 ffe48e58 10bb0a8 0 ) 00000000ffe27900: do?branch ( 8000 8 ffe48e58 10bb0a8 ) 00000000ffe27950: dup ( 8000 8 ffe48e58 10bb0a8 10bb0a8 ) 00000000ffe27958: >r ( 8000 8 ffe48e58 10bb0a8 ) 00000000ffe27960: dup ( 8000 8 ffe48e58 10bb0a8 10bb0a8 ) 00000000ffe27968: cstrlen ( 8000 8 ffe48e58 10bb0a8 5 ) 00000000ffe27970: rot ( 8000 8 10bb0a8 5 ffe48e58 ) 00000000ffe27978: ?ihandle ( 8000 8 10bb0a8 5 ffe48e58 ) 00000000ffe27980: (lit) ( 8000 8 10bb0a8 5 ffe48e58 ffe13af0 ) 00000000ffe27990: catch OFMEM: ofmem_claim phys=ffffffffffffffff size=0000000000008000 align=0000000000000008 ( 57e2000 0 0 ) 00000000ffe27998: dup ( 57e2000 0 0 0 ) 00000000ffe279a0: do?branch ( 57e2000 0 0 ) 00000000ffe27a38: r> ( 57e2000 0 0 10bb0a8 ) 00000000ffe27a40: drop ( 57e2000 0 0 ) 00000000ffe27a48: (semis) [ Finished call-method ] handle_calls return: 0 0 57e2000 of_client_interface: call-method 10bb108 ffe483d8 ffffffffffffffff 8000 4c280000 0 57e2000 call-method map ([7] -- [1])
: call-method ( 57e2000 0 4c280000 8000 ffffffffffffffff ffe483d8 10bb108 ) 00000000ffe278f0: dup cmdline is missing! ( 57e2000 0 4c280000 8000 ffffffffffffffff ffe483d8 10bb108 )
See how I can drop into the Forth debugger using "f" and return using "resume". I don't have the exact trace to hand, however I have verified that I can drop into the Forth debugger right up until the catch ofmem_claim_phys is invoked. As soon as control returns to the "dup" afterwards, pressing "f" results in a "cmdline is missing!" error shown above.
Similarly, without r739 then the final CIF call to "translate" below dies mysteriously in "catch" - it doesn't even break gdb at mmu_translate :(
However, with r739 applied the debugger continues to work right up until the final exception, and the following CIF call to translate proceeds to work instead of dying mysteriously:
of_client_interface: call-method 10bb150 ffe483d8 180bae0 call-method translate ([3] -- [5]) ### XXX
: call-method ( 180bae0 ffe483d8 10bb150 ) 00000000ffe278f0: dup OFMEM: #### Internal memory allocation at 00000000ffece010, size 104 OFMEM: #### Internal memory allocation at 00000000ffece078, size 104 OFMEM: #### Internal memory allocation at 00000000ffece0e0, size 56
3 > resume ok ( 180bae0 ffe483d8 10bb150 ) 00000000ffe278f0: dup ( 180bae0 ffe483d8 10bb150 10bb150 ) 00000000ffe278f8: 0= ( 180bae0 ffe483d8 10bb150 0 ) 00000000ffe27900: do?branch ( 180bae0 ffe483d8 10bb150 ) 00000000ffe27950: dup ( 180bae0 ffe483d8 10bb150 10bb150 ) 00000000ffe27958: >r ( 180bae0 ffe483d8 10bb150 ) 00000000ffe27960: dup ( 180bae0 ffe483d8 10bb150 10bb150 ) 00000000ffe27968: cstrlen ( 180bae0 ffe483d8 10bb150 9 ) 00000000ffe27970: rot ( 180bae0 10bb150 9 ffe483d8 ) 00000000ffe27978: ?ihandle ( 180bae0 10bb150 9 ffe483d8 ) 00000000ffe27980: (lit) ( 180bae0 10bb150 9 ffe483d8 ffe13af0 ) 00000000ffe27990: catch ( 5c0bae0 0 32 ffffffffffffffff 0 ) 00000000ffe27998: dup OFMEM: #### Internal memory allocation at 00000000ffece118, size 104 OFMEM: #### Internal memory allocation at 00000000ffece180, size 104 OFMEM: #### Internal memory allocation at 00000000ffece1e8, size 56
5 > resume ok ( 5c0bae0 0 32 ffffffffffffffff 0 ) 00000000ffe27998: dup ( 5c0bae0 0 32 ffffffffffffffff 0 0 ) 00000000ffe279a0: do?branch ( 5c0bae0 0 32 ffffffffffffffff 0 ) 00000000ffe27a38: r> ( 5c0bae0 0 32 ffffffffffffffff 0 10bb150 ) 00000000ffe27a40: drop ( 5c0bae0 0 32 ffffffffffffffff 0 ) 00000000ffe27a48: (semis) [ Finished call-method ] handle_calls return: 0 ffffffffffffffff 32 0 5c0bae0 Unhandled Exception 0x0000000000000034 PC = 0x0000000001047444 NPC = 0x0000000001047448 Stopping execution
At least the PC value is now within the OpenSolaris kernel rather than memory owned by OpenBIOS. If you think this is not the correct fix and there is something else wrong instead, please suggest an alternative.
ATB,
Mark.
Igor Kovalenko wrote:
This ofmem_update_translations call is supposed to keep available physical memory ranges up-to-date. There are no other places where we do update available phys ranges. Virtual ranges and MMU translations we update at the same time; since virtual allocations and MMU translations are not affected at this call site the resulting lists of virtual ranges and MMU translations are not changed.
The only thing which may change here is actual placement of "translations" property value, so if you keep pointer to previous location it may be stale after update call.
What is the issue you are solving here?
I've just done some more experimentation with r738 by replacing the call to ofmem_update_translations() with its constituent functions.
If I replace it with the following:
ofmem_update_memory_available(s_phandle_memory, ofmem->phys_range, get_ram_size());
- Everything works okay.
If I replace it with:
ofmem_update_memory_available(s_phandle_memory, ofmem->phys_range, get_ram_size());
ofmem_update_memory_available(s_phandle_mmu, ofmem->virt_range, -1ULL);
- Everything still works okay.
If I replace it with:
ofmem_update_memory_available(s_phandle_memory, ofmem->phys_range, get_ram_size());
ofmem_update_memory_available(s_phandle_mmu, ofmem->virt_range, -1ULL);
ofmem_update_mmu_translations();
- Forth starts misbehaving and the translate call crashes.
This points the finger at ofmem_update_mmu_translations(). Given that this is a physical memory allocation, is there any reason why I shouldn't change r739 to instead of removing ofmem_update_translations() to just replace it with:
ofmem_update_memory_available(s_phandle_memory, ofmem->phys_range, get_ram_size());
This is because for a physical allocation there is no need to update the virt_range nor the MMU translations. And it seems to work so far in testing.
ATB,
Mark.
Mark Cave-Ayland wrote:
This points the finger at ofmem_update_mmu_translations()
In fact, it is possible to narrow down the culprit even further. If I comment out the set_property() call in ofmem_update_mmu_translations() then everything works fine. If I leave it in but change ncells to (ncells - 1) then that works too (but then of course the property array is one cell short).
It almost seems to suggest some kind of off-by-one or alignment error in the property setting code which is clobbering memory in Forth somewhere, but the specification seems to suggest that properties aren't aligned and a quick eye-ball of the property code hasn't yielded anything obvious.
Can anyone else see something I've missed?
Mark.