Hi all,
Thanks to some great work by Igor, the reason for my random debugger crashes has been found. It appears that the default dictionary size is too small on SPARC64 and repeatedly updating the translations property is enough to cause the forth dictionary to overflow the memory allocated for it.
I'd like to propose the attached patch which doubles the dictionary memory from 256k to 512k as currently used by PPC. My concern about applying the patch directly is that I'm not sure whether or not any corresponding changes to entry.S also need to be made. AFAICT the initialisation code handles 19 x 512k pages when setting up the MMU (which should be more than enough) but thought I should get a second opinion first.
ATB,
Mark.
On Tue, Apr 13, 2010 at 12:36 AM, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
Hi all,
Thanks to some great work by Igor, the reason for my random debugger crashes has been found. It appears that the default dictionary size is too small on SPARC64 and repeatedly updating the translations property is enough to cause the forth dictionary to overflow the memory allocated for it.
I'd like to propose the attached patch which doubles the dictionary memory from 256k to 512k as currently used by PPC. My concern about applying the patch directly is that I'm not sure whether or not any corresponding changes to entry.S also need to be made. AFAICT the initialisation code handles 19 x 512k pages when setting up the MMU (which should be more than enough) but thought I should get a second opinion first.
Your changes increase .bss of elf image, and that is handled by entry.S automatically so no changes are needed.
BTW do we need 512K for forth heap? It may be enough to modify openbios.c setting MEMORY_SIZE to 256K and DICTIONARY_SIZE to 512k there. Change to ofmem_sparc64.c does not seem to be helpful.
Igor Kovalenko wrote:
Your changes increase .bss of elf image, and that is handled by entry.S automatically so no changes are needed.
Okay, great :)
BTW do we need 512K for forth heap? It may be enough to modify openbios.c setting MEMORY_SIZE to 256K and DICTIONARY_SIZE to 512k there. Change to ofmem_sparc64.c does not seem to be helpful.
I honestly don't know enough to make an informed choice, although my feeling is that since PPC has 512K (and it will obviously be enough to solve the problem with OpenSolaris) then it can't be too bad an option. Does anyone else have a feeling one way or the other?
The change to ofmem_sparc64.c is required to prevent an out-of-memory panic when starting OpenBIOS with a 512K Forth heap as otherwise the total amount of memory requested is too large for the ofmem code.
ATB,
Mark.
On 4/13/10, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
Igor Kovalenko wrote:
Your changes increase .bss of elf image, and that is handled by entry.S automatically so no changes are needed.
Okay, great :)
BTW do we need 512K for forth heap? It may be enough to modify openbios.c setting MEMORY_SIZE to 256K and DICTIONARY_SIZE to 512k there. Change to ofmem_sparc64.c does not seem to be helpful.
I honestly don't know enough to make an informed choice, although my feeling is that since PPC has 512K (and it will obviously be enough to solve the problem with OpenSolaris) then it can't be too bad an option. Does anyone else have a feeling one way or the other?
One possible problem may be that we overrun the 32 bit address space, because there is only 3M available: 0xffd00000 to 0xffffffff.
Blue Swirl wrote:
I honestly don't know enough to make an informed choice, although my feeling is that since PPC has 512K (and it will obviously be enough to solve the problem with OpenSolaris) then it can't be too bad an option. Does anyone else have a feeling one way or the other?
One possible problem may be that we overrun the 32 bit address space, because there is only 3M available: 0xffd00000 to 0xffffffff.
Hmmm. Would that matter even for a 64-bit architecture like SPARC64? It wouldn't affect any of the other architectures except for PPC which already has a default of 512K rather than 256K. Also even with MEMORY_SIZE and DICTIONARY_SIZE set to 512K, that's still just 1M...
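To sanity-check the "still just 1M" arithmetic, here is a minimal sketch in C. The window bounds are the ones quoted above (0xffd00000 to 0xffffffff). The function and macro names are made up for illustration and are not taken from the OpenBIOS sources. It also ignores everything else that has to share the window (text, data, the rest of .bss, the stack), so it is an upper-bound check only.

```c
#include <assert.h>
#include <stdint.h>

/* Window quoted in the thread: 0xffd00000 up to and including 0xffffffff. */
#define WINDOW_START 0xffd00000ULL
#define WINDOW_END   0x100000000ULL /* one past 0xffffffff */
#define KIB          1024ULL

/* Number of bytes available in the window (hypothetical helper). */
static uint64_t window_size(void)
{
    return WINDOW_END - WINDOW_START;
}

/* Combined size in bytes of the Forth heap and dictionary, given in KiB. */
static uint64_t forth_footprint(uint64_t memory_kib, uint64_t dictionary_kib)
{
    return (memory_kib + dictionary_kib) * KIB;
}

/* Non-zero if heap plus dictionary alone would fit in the window. */
static int fits_in_window(uint64_t memory_kib, uint64_t dictionary_kib)
{
    uint64_t need = forth_footprint(memory_kib, dictionary_kib);
    assert(need >= memory_kib * KIB); /* guard against wrap-around */
    return need <= window_size();
}
```

Calling fits_in_window(512, 512) compares 1M against the 3M window, which is the case discussed here.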
ATB,
Mark.
On Tue, Apr 13, 2010 at 11:16 PM, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
Blue Swirl wrote:
I honestly don't know enough to make an informed choice, although my feeling is that since PPC has 512K (and it will obviously be enough to solve the problem with OpenSolaris) then it can't be too bad an option. Does anyone else have a feeling one way or the other?
One possible problem may be that we overrun the 32 bit address space, because there is only 3M available: 0xffd00000 to 0xffffffff.
Hmmm. Would that matter even for a 64-bit architecture like SPARC64? It wouldn't affect any of the other architectures except for PPC which already has a default of 512K rather than 256K. Also even with MEMORY_SIZE and DICTIONARY_SIZE set to 512K, that's still just 1M...
At the moment there is no issue. In the future we may still consider building the property value in temporary space and using the malloc interface to store it permanently, so that we can reuse memory.
On 4/13/10, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
Blue Swirl wrote:
I honestly don't know enough to make an informed choice, although my feeling is that since PPC has 512K (and it will obviously be enough to solve the problem with OpenSolaris) then it can't be too bad an option. Does anyone else have a feeling one way or the other?
One possible problem may be that we overrun the 32 bit address space, because there is only 3M available: 0xffd00000 to 0xffffffff.
Hmmm. Would that matter even for a 64-bit architecture like SPARC64? It wouldn't affect any of the other architectures except for PPC which already has a default of 512K rather than 256K. Also even with MEMORY_SIZE and DICTIONARY_SIZE set to 512K, that's still just 1M...
On PPC OpenBIOS is mapped much lower in memory.
End of bss is now at 0xffeee000 and end of stack at 0xfff12000. Though _iomem is already at 0x100012000, so maybe it's not so harmful.
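For reference, the headroom implied by those addresses can be worked out with a few lines of C. This is only a sketch: the constants are the values quoted above, while the macro and function names are invented for illustration and do not come from the OpenBIOS tree.

```c
#include <stdint.h>

/* Addresses quoted in the thread. */
#define WINDOW_START 0xffd00000ULL  /* start of the 3M region */
#define FOUR_GIB     0x100000000ULL /* first address past 0xffffffff */
#define BSS_END      0xffeee000ULL
#define STACK_END    0xfff12000ULL
#define IOMEM_ADDR   0x100012000ULL /* _iomem */

/* Bytes between the end of .bss and the end of the stack. */
static uint64_t stack_region_size(void)
{
    return STACK_END - BSS_END;
}

/* Bytes used from the start of the window up to the end of the stack. */
static uint64_t used_up_to_stack(void)
{
    return STACK_END - WINDOW_START;
}

/* Bytes left between the end of the stack and the 4G boundary. */
static uint64_t headroom_below_4g(void)
{
    return FOUR_GIB - STACK_END;
}

/* Non-zero if _iomem already lies at or above the 4G boundary. */
static int iomem_above_4g(void)
{
    return IOMEM_ADDR >= FOUR_GIB;
}
```

With these numbers, used_up_to_stack() is 0x212000 and headroom_below_4g() is 0xee000, which together make up the 3M window.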
Blue Swirl wrote:
On PPC OpenBIOS is mapped much lower in memory.
Okay.
End of bss is now at 0xffeee000 and end of stack at 0xfff12000. Though _iomem is already at 0x100012000, so maybe it's not so harmful.
The maths says that it shouldn't be harmful, and the fact that your knowledge of SPARC is much better than mine and you think it may be okay gives me some confidence ;) Let's see if anyone else has any ideas.
Incidentally, since you mention iomem... ;) Have you had a chance to try the Debian lenny netinst ISO for SPARC64 with OpenBIOS SVN yet? The fact that it puts lots of junk on the screen before it eventually crashes trying to run init makes me think that there could be another I/O space mapping issue somewhere. But I thought you/Igor had managed to resolve this, so I'm wondering if it's something else?
ATB,
Mark.
On 4/13/10, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
Blue Swirl wrote:
On PPC OpenBIOS is mapped much lower in memory.
Okay.
End of bss is now at 0xffeee000 and end of stack at 0xfff12000. Though _iomem is already at 0x100012000, so maybe it's not so harmful.
The maths says that it shouldn't be harmful, and the fact that your knowledge of SPARC is much better than mine and you think it may be okay gives me some confidence ;) Let's see if anyone else has any ideas.
Incidentally, since you mention iomem... ;) Have you had a chance to try the Debian lenny netinst ISO for SPARC64 with OpenBIOS SVN yet? The fact that it puts lots of junk on the screen before it eventually crashes trying to run init makes me think that there could be another I/O space mapping issue somewhere. But I thought you/Igor had managed to resolve this, so I'm wondering if it's something else?
I don't see the problem (5.02 businesscard):
Welcome to Debian GNU/Linux lenny!
This is a Debian installation CDROM, built on 20090629-11:14. Keep it once you have installed your system, as you can boot from it to repair the system on your hard disk if that ever becomes necessary.
WARNING: You should completely back up all of your hard disks before proceeding. The installation procedure can completely and irreversibly erase them! If you haven't made backups yet, remove the rescue CD from the drive and press L1-A to get back to the OpenBoot prompt.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent permitted by applicable law.
[ ENTER - Boot install ]
[ Type "expert" - Boot into expert mode ]
[ Type "rescue" - Boot into rescue mode ]
boot:
Allocated 8 Megs of memory at 0x40000000 for kernel
Loaded kernel version 2.6.26
Loading initial ramdisk (4300466 bytes at 0xC00000 phys, 0x40C00000 virt)... \
[ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.10.24 1999/01/01 01:01'
[ 0.000000] PROMLIB: Root node compatible: sun4u
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 2.6.26-2-sparc64 (Debian 2.6.26-17) (dannf@debian.org) (gcc version 4.1.3 20080704 (prerelease) (Debian 4.1.2-25)) #1 Sun Jun 21 04:31:33 UTC 2009
[ 0.000000] console [earlyprom0] enabled
[ 0.000000] ARCH: SUN4U
[ 0.000000] Ethernet address: 52:54:00:12:34:56
[ 0.000000] Kernel: Using 1 locked TLB entries for main kernel image.
[ 0.000000] Remapping the kernel... done.
[ 0.000000] OF stdout device is: /pci@1fe,0/pci@1/pci@1,1/ebus@3/su@1fe
[ 0.000000] PROM: Built device tree with 32942 bytes of memory.
[ 0.000000] Top of RAM: 0x7e80000, Total RAM: 0x7e80000
[ 0.000000] Memory hole size: 0MB
[ 0.000000] [0000000200000000-fffff80000800000] page_structs=131072 node=0 entry=0/0
[ 0.000000] [0000000200000000-fffff80001400000] page_structs=131072 node=0 entry=1/0
[ 0.000000] Zone PFN ranges:
[ 0.000000] Normal 0 -> 16192
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[1] active PFN ranges
[ 0.000000] 0: 0 -> 16192
[ 0.000000] Booting Linux...
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 16081
[ 0.000000] Kernel command line:
[ 0.000000] PID hash table entries: 512 (order: 9, 4096 bytes)
[ 0.000000] clocksource: mult[a0000] shift[16]
[ 0.000000] clockevent: mult[19999999] shift[32]
[ 40.504885] Console: colour dummy device 80x25
[ 40.737695] console handover: boot [earlyprom0] -> real [tty0]
after this nothing happens. Same with graphical console.
Blue Swirl wrote:
[...]
after this nothing happens. Same with graphical console.
Hmmm that's weird. I get the following:
boot:
Allocated 8 Megs of memory at 0x40000000 for kernel
Loaded kernel version 2.6.26
Loading initial ramdisk (4312781 bytes at 0xC00000 phys, 0x40C00000 virt)... |
[ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.10.24 1999/01/01 01:01'
[ 0.000000] PROMLIB: Root node compatible: sun4u
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 2.6.26-2-sparc64 (Debian 2.6.26-21) (dannf@debian.org) (gcc version 4.1.3 20080704 (prerelease) (Debian 4.1.2-25)) #1 Tue Jan 12 22:16:05 UTC 2010
[ 0.000000] console [earlyprom0] enabled
[ 0.000000] ARCH: SUN4U
[ 0.000000] Ethernet address: 52:54:00:12:34:56
[ 0.000000] Kernel: Using 1 locked TLB entries for main kernel image.
[ 0.000000] Remapping the kernel... done.
[ 0.000000] OF stdout device is: /pci@1fe,0/pci@1/pci@1,1/ebus@3/su@1fe
[ 0.000000] PROM: Built device tree with 31881 bytes of memory.
[ 0.000000] Top of RAM: 0x7e80000, Total RAM: 0x7e80000
[ 0.000000] Memory hole size: 0MB
[ 0.000000] [0000000200000000-fffff80000800000] page_structs=131072 node=0 entry=0/0
[ 0.000000] [0000000200000000-fffff80001400000] page_structs=131072 node=0 entry=1/0
[ 0.000000] Zone PFN ranges:
[ 0.000000] Normal 0 -> 16192
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[1] active PFN ranges
[ 0.000000] 0: 0 -> 16192
[ 0.000000] Booting Linux...
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 16081
[ 0.000000] Kernel command line:
[ 0.000000] PID hash table entries: 512 (order: 9, 4096 bytes)
[ 0.000000] clocksource: mult[a0000] shift[16]
[ 0.000000] clockevent: mult[19999999] shift[32]
[ 7.357542] Console: colour dummy device 80x25
[ 7.400290] console handover: boot [earlyprom0] -> real [tty0]
And then if I leave it another 20-30s then I get lots of console output followed by eventually:
[ 29.264039] Console: switching to mono PROM 128x96
[ 42.284272] [drm] Initialized drm 1.1.0 20060810
[ 42.349671] su: probe of ffe2d760 failed with error -12
[ 42.452645] brd: module loaded
[ 42.514544] loop: module loaded
[ 42.563405] Uniform Multi-Platform E-IDE driver
[ 42.624159] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
[ 42.724442] mice: PS/2 mouse device common for all mice
[ 42.798597] usbcore: registered new interface driver usbhid
[ 42.869024] usbhid: v2.6:USB HID core driver
[ 42.938149] TCP cubic registered
[ 42.987376] NET: Registered protocol family 17
[ 43.050789] registered taskstats version 1
[ 43.109580] drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
Could it be that you're running on quite a slow machine? The fact it takes your kernel 40s to get where mine does in 7s seems a bit strange.
Or perhaps it could be a regression in Qemu as I haven't updated for a while? git log shows my last commit as being:
commit 7a9773563c99a86aec454f9e14f7a19ca1f87659
Author: Edgar E. Iglesias edgar.iglesias@gmail.com
Date: Mon Feb 15 11:47:34 2010 +0100
cris: Add v10 style interrupts.
Signed-off-by: Edgar E. Iglesias edgar.iglesias@gmail.com
I'll try a git pull again and see if I get anything different.
ATB,
Mark.
Mark Cave-Ayland wrote:
I'll try a git pull again and see if I get anything different.
Hmmm just tried that and I still get the same. Maybe try leaving it a little longer?
ATB,
Mark.
On 2010-4-13 3:16 PM, Mark Cave-Ayland wrote:
[...]
One possible problem may be that we overrun the 32 bit address space, because there is only 3M available: 0xffd00000 to 0xffffffff.
Hmmm. Would that matter even for a 64-bit architecture like SPARC64? It wouldn't affect any of the other architectures except for PPC which already has a default of 512K rather than 256K. Also even with MEMORY_SIZE and DICTIONARY_SIZE set to 512K, that's still just 1M...
Unfortunately, it does matter. FCode still expects to run in 32-bit mode and use 32-bit pointers, so the 64-bit address space doesn't buy us anything.
Unfortunately, it does matter. FCode still expects to run in 32-bit mode and use 32-bit pointers, so the 64-bit address space doesn't buy us anything.
FCode uses 64-bit pointers on a 64-bit OF. The only thing that's always 32-bit in FCode are the literals (and of course many badly-written FCode drivers do not work properly on 64-bit OF).
Segher
On 2010-4-13 11:19 PM, Segher Boessenkool wrote:
Unfortunately, it does matter. FCode still expects to run in 32-bit mode and use 32-bit pointers, so the 64-bit address space doesn't buy us anything.
FCode uses 64-bit pointers on a 64-bit OF. The only thing that's always 32-bit in FCode are the literals (and of course many badly-written FCode drivers do not work properly on 64-bit OF).
Right. *Most* existing FCode drivers out there are written under FCode-version2 rules, and expect 32-bit machines.
It's possible to write 64-bit FCode using FCode-version3, but few do. Most make implicit assumptions about truncating arithmetic operations at 32 bits and fitting pointers into 32-bit values (read/written with variants of l@/l!) even among the Fcode-Version3 drivers. Regrettable, but real.
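To make the pointer-truncation problem concrete, here is a small C sketch of what happens when a 64-bit address is squeezed through a 32-bit store and fetch. It only models the effect and is not FCode. It assumes the 32-bit fetch zero-extends its result, and the helper names are invented for illustration.

```c
#include <stdint.h>

/* Model a 32-bit store followed by a 32-bit fetch of a 64-bit cell.
 * Assumption: the fetch zero-extends the 32-bit value. */
static uint64_t store_fetch_32(uint64_t cell)
{
    uint32_t slot = (uint32_t)cell; /* the store keeps only the low 32 bits */
    return (uint64_t)slot;          /* the fetch widens back to a full cell */
}

/* Non-zero if an address survives the 32-bit round trip unchanged. */
static int survives_32bit_roundtrip(uint64_t addr)
{
    return store_fetch_32(addr) == addr;
}
```

An address such as 0xffd00000 survives the round trip, whereas 0x100012000 (the _iomem value mentioned earlier in the thread) comes back as 0x12000.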
I recently had an unpleasant experience porting the MD5 algorithm to 64-bit (needed to run under Forth rather than FCode), and turned the air blue finding all the places where bad assumptions were made.
2010/4/13 Blue Swirl blauwirbel@gmail.com:
On 4/13/10, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
Igor Kovalenko wrote:
Your changes increase .bss of elf image, and that is handled by entry.S automatically so no changes are needed.
Okay, great :)
BTW do we need 512K for forth heap? It may be enough to modify openbios.c setting MEMORY_SIZE to 256K and DICTIONARY_SIZE to 512k there. Change to ofmem_sparc64.c does not seem to be helpful.
I honestly don't know enough to make an informed choice, although my feeling is that since PPC has 512K (and it will obviously be enough to solve the problem with OpenSolaris) then it can't be too bad an option. Does anyone else have a feeling one way or the other?
One possible problem may be that we overrun the 32 bit address space, because there is only 3M available: 0xffd00000 to 0xffffffff.
Oh. I guess that's the amount stored in totmap[0].num_bytes in sparc32/lib.c? Then obp_dumb_memalloc really has too little space for loading the Solaris kernel. Maybe because the memory is not freed.
On 4/14/10, Artyom Tarasenko atar4qemu@googlemail.com wrote:
2010/4/13 Blue Swirl blauwirbel@gmail.com:
On 4/13/10, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
Igor Kovalenko wrote:
Your changes increase .bss of elf image, and that is handled by entry.S automatically so no changes are needed.
Okay, great :)
BTW do we need 512K for forth heap? It may be enough to modify openbios.c setting MEMORY_SIZE to 256K and DICTIONARY_SIZE to 512k there. Change to ofmem_sparc64.c does not seem to be helpful.
I honestly don't know enough to make an informed choice, although my feeling is that since PPC has 512K (and it will obviously be enough to solve the problem with OpenSolaris) then it can't be too bad an option. Does anyone else have a feeling one way or the other?
One possible problem may be that we overrun the 32 bit address space, because there is only 3M available: 0xffd00000 to 0xffffffff.
Oh. I guess that's the amount stored in totmap[0].num_bytes in sparc32/lib.c? Then obp_dumb_memalloc really has too little space for loading the Solaris kernel. Maybe because the memory is not freed.
That's only on Sparc32, but memory could be tight there too, the layout is the same.
There should be quite a few tricks to reduce memory consumption: compile with -Os, remove or make unused code conditional, optimize mixed C/Forth constructs like fword("xx") etc.
2010/4/14 Blue Swirl blauwirbel@gmail.com:
On 4/14/10, Artyom Tarasenko atar4qemu@googlemail.com wrote:
2010/4/13 Blue Swirl blauwirbel@gmail.com:
On 4/13/10, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
Igor Kovalenko wrote:
Your changes increase .bss of elf image, and that is handled by entry.S automatically so no changes are needed.
Okay, great :)
BTW do we need 512K for forth heap? It may be enough to modify openbios.c setting MEMORY_SIZE to 256K and DICTIONARY_SIZE to 512k there. Change to ofmem_sparc64.c does not seem to be helpful.
I honestly don't know enough to make an informed choice, although my feeling is that since PPC has 512K (and it will obviously be enough to solve the problem with OpenSolaris) then it can't be too bad an option. Does anyone else have a feeling one way or the other?
One possible problem may be that we overrun the 32 bit address space, because there is only 3M available: 0xffd00000 to 0xffffffff.
Oh. I guess that's the amount stored in totmap[0].num_bytes in sparc32/lib.c? Then obp_dumb_memalloc really has too little space for loading the Solaris kernel. Maybe because the memory is not freed.
That's only on Sparc32, but memory could be tight there too, the layout is the same.
There should be quite a few tricks to reduce memory consumption: compile with -Os, remove or make unused code conditional, optimize mixed C/Forth constructs like fword("xx") etc.
Right. This also explains why the code compiled with -O0 or debugging enabled doesn't go as far as the optimized code when using Solaris 2.5.1. Basically the memory corruption bug I reported was at least partly produced by the debugging method itself.
Artyom Tarasenko wrote:
One possible problem may be that we overrun the 32 bit address space, because there is only 3M available: 0xffd00000 to 0xffffffff.
Oh. I guess that's the amount stored in totmap[0].num_bytes in sparc32/lib.c? Then obp_dumb_memalloc really has too little space for loading the Solaris kernel. Maybe because the memory is not freed.
That's only on Sparc32, but memory could be tight there too, the layout is the same.
There should be quite a few tricks to reduce memory consumption: compile with -Os, remove or make unused code conditional, optimize mixed C/Forth constructs like fword("xx") etc.
Right. This also explains why the code compiled with -O0 or debugging enabled doesn't go as far as the optimized code when using Solaris 2.5.1. Basically the memory corruption bug I reported was at least partly produced by the debugging method itself.
Is this correct? A build of SPARC32 OpenBIOS here with -O0 -g comes to about 560K which is well within the 3M limit.
I think from one of the logs you posted earlier that SPARC32 was invoking the a.out loader. From libopenbios/aout_load.c you can see that any a.out executables are loaded at a fixed address of 0x4000 which seems a little strange given that the start address is given in the executable header.
Perhaps it's worth changing start to N_TXTADDR(ehdr) as the comment in line 117 suggests (or checking to see if it's set to 0x4000 or not) because if the SPARC32 loader is like the SPARC64 loaders I've been looking at, they assume they load at a particular address (the header address) and thus set up their own MMU mappings accordingly. Hence if the executable is being loaded at the wrong address, the physical <-> virtual address mappings for these locations will be undefined which will cause unpredictable results.
This also means that you don't need to worry about any loaded binary program image having to fit within 3M because they are loaded directly into their memory target location, and not into the upper memory location first.
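As a rough illustration of the load-address idea, here is a sketch in C. It is not the code from libopenbios/aout_load.c: the helper names are hypothetical, the header's text address is simply passed in as a number, and falling back to 0x4000 when the header gives no address is my own assumption.

```c
#include <stdint.h>

/* Fixed load address mentioned above. */
#define FIXED_LOAD_ADDR 0x4000UL

/* Hypothetical helper: prefer the text address from the executable header.
 * Assumption: a header address of 0 means "not set", so fall back to the
 * fixed address in that case. */
static unsigned long choose_load_addr(unsigned long header_text_addr)
{
    return header_text_addr != 0 ? header_text_addr : FIXED_LOAD_ADDR;
}

/* Non-zero if the header asks for something other than the fixed address,
 * i.e. the case where loading at 0x4000 would put the image in the wrong
 * place. */
static int load_addr_mismatch(unsigned long header_text_addr)
{
    return choose_load_addr(header_text_addr) != FIXED_LOAD_ADDR;
}
```

A check like load_addr_mismatch() would at least show whether the executables in question really expect 0x4000 before anything is changed.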
ATB,
Mark.