Hi folks,
It's taken me a lot longer to get to grips with moving SPARC32 to OFMEM than expected, mostly due to problems with the increased BSS size of the resulting image causing several large headaches.
Having spent some time looking at where the memory is currently being used, it strikes me that there are 2 main places where we can claim some back:
i) Reduce (remove) the runtime memory allocated to the Forth machine
One interesting aspect of the current design is that we have 2 memory allocation ranges - the Forth machine memory which is used for alloc-mem and free-mem, and also the OFMEM ranges. For example, on SPARC32 the Forth machine memory is set to 256K which, given the large I/O space requirements for the frame buffer, is about a quarter of the total memory.
I'd like to suggest that we unify the memory management by removing the Forth implementations of alloc-mem/free-mem and replacing them with wrappers around the internal ofmem_malloc() and ofmem_free() functions. This would then give us one continuous pool of allocatable memory which could be used for everything.
One slight issue with this is that there are a few references to alloc-mem/free-mem within the internal Forth code. These are mainly for allocating static buffers, and so I believe these could be removed and replaced by functions that simply allocate memory within the dictionary itself.
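A minimal sketch of what the wrappers in (i) might look like, assuming the standard stack effects alloc-mem ( len -- a-addr ) and free-mem ( a-addr len -- ). The data stack, push()/pop() and the forth_* names are hypothetical stand-ins for the kernel's own, and ofmem_malloc()/ofmem_free() are stubbed with libc here so the sketch builds outside the firmware:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for the OFMEM allocator. In this sketch they simply
 * forward to libc; the firmware versions would draw from OFMEM. */
void *ofmem_malloc(size_t size) { return malloc(size); }
void ofmem_free(void *ptr) { free(ptr); }

/* Hypothetical miniature data stack (the Forth kernel has its own). */
typedef uintptr_t cell;
#define DSTACK_DEPTH 32
cell dstack[DSTACK_DEPTH];
int dsp = 0;

void push(cell v) { assert(dsp < DSTACK_DEPTH); dstack[dsp++] = v; }
cell pop(void) { assert(dsp > 0); return dstack[--dsp]; }

/* alloc-mem ( len -- a-addr ) */
void forth_alloc_mem(void)
{
    size_t len = (size_t)pop();
    push((cell)ofmem_malloc(len));
}

/* free-mem ( a-addr len -- ) */
void forth_free_mem(void)
{
    (void)pop();                 /* len: unused, the allocator tracks sizes */
    ofmem_free((void *)pop());   /* a-addr */
}
```

With wrappers of this shape, every alloc-mem from Forth code would land in the same pool as the C-side allocations, which is the point of the proposal.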
ii) Avoid re-allocation of memory for the Forth dictionary
At runtime we currently have two copies of the dictionary held within memory - the first is the actual static dictionary data, while the second is the relocated dictionary image within the BSS. In the case of SPARC32 this is responsible for just over 100K of the BSS image size.
Rather than having to allocate a second copy of the dictionary, would it make sense to embed the fixed-size dictionary directly within the OpenBIOS image? As an example, SPARC32 specifies 256K for the total dictionary size.
I think it should be possible to rewrite the relocation routine to relocate the dictionary 'in place' and then remove the relocation data so that the dictionary can extend normally up to its final limit. The downside of this is that the ROM images will be bigger on disk, but I don't feel that this should be too much of an issue.
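To make the 'in place' idea concrete, here is a rough sketch. It assumes the relocation data is a bitmap with one bit per dictionary cell, where a set bit marks a cell holding a dictionary-relative offset; the function name and this layout are illustrative assumptions rather than the existing relocation code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uintptr_t ucell;

#define BITS_PER_UCELL (sizeof(ucell) * 8)

/* Relocate `ncells` dictionary cells in place. `reloc` is the assumed
 * bitmap: bit i set means dict[i] is an offset from the dictionary
 * start and must have the load address added. */
void relocate_in_place(ucell *dict, size_t ncells, ucell *reloc)
{
    ucell base = (ucell)dict;
    size_t nwords = (ncells + BITS_PER_UCELL - 1) / BITS_PER_UCELL;
    size_t i;

    for (i = 0; i < ncells; i++) {
        ucell mask = (ucell)1 << (i % BITS_PER_UCELL);
        if (reloc[i / BITS_PER_UCELL] & mask)
            dict[i] += base;
    }

    /* The bitmap is dead after this point; clear it so the space can
     * be reused as the dictionary grows towards its final limit. */
    memset(reloc, 0, nwords * sizeof(ucell));
}
```

Since no second buffer is involved, the only runtime cost is the one pass over the bitmap; the price is that the image on disk has to carry the full fixed-size dictionary.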
I realise that these two ideas are probably fairly controversial, but they should both help considerably reduce the size of the OpenBIOS runtime memory footprint. Thoughts/comments/criticisms?
ATB,
Mark.
Hi Mark,
On 21.11.2010 at 22:33, Mark Cave-Ayland wrote:
> It's taken me a lot longer to get to grips with moving SPARC32 to OFMEM than expected, mostly due to problems with the increased BSS size of the resulting image causing several large headaches.
Not reading any replies to this yet, maybe you could explain the problem in more detail for those of us not as familiar with it? 256K sounds like peanuts compared to the ~128M default RAM size for the SS-5. ;)
Until recently the ppc memory layout didn't scale with RAM size either, so I had to reorganize the ppc ofmem code. So what exactly is stored in BSS on sparc32, and can't it be moved to memory below/above OpenBIOS itself if you remember to claim that memory?
Regards, Andreas
Andreas Färber wrote:
>> It's taken me a lot longer to get to grips with moving SPARC32 to OFMEM than expected, mostly due to problems with the increased BSS size of the resulting image causing several large headaches.
> Not reading any replies to this yet, maybe you could explain the problem in more detail for those of us not as familiar with it? 256K sounds like peanuts compared to the ~128M default RAM size for the SS-5. ;)
> Until recently the ppc memory layout didn't scale with RAM size either, so I had to reorganize the ppc ofmem code. So what exactly is stored in BSS on sparc32, and can't it be moved to memory below/above OpenBIOS itself if you remember to claim that memory?
Firstly, the Open Firmware SPARC binding specifies: "When a client program begins execution, an Open Firmware implementation’s use of any virtual address space outside of the ranges 0xffd0.0000-0xffef.ffff and 0xfe00.0000-0xfeff.ffff shall have ceased, except for the virtual address space and associated memory that is allocated for the client program’s code and data, as specified in the client program header."
It seems that the second 0xfe00 range is not recommended for general use, and the specification suggests that any allocations from this region should be made from the top downwards.
OpenBIOS is currently set to reside in the 0xffd0.0000-0xffef.ffff memory range, giving 2MB of space. Currently on SPARC32 the allocation breakdown is roughly this:
  Basic OpenBIOS image (inc. Forth dictionary): 100K
  SPARC32 page tables: 256K (some losses due to alignment)
  Forth (alloc-mem) area: 128K
  Forth internal malloc() area: 256K
  Relocated copy of Forth dictionary: 256K
  IOMEM: 1MB (768K VGA framebuffer + 256K other)
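To show how tight this is, the rough figures above can be added up against the size of the 0xffd0.0000-0xffef.ffff window. The struct and function names below are purely illustrative, and the sizes are the approximate ones from the list, so treat the result as a ballpark:

```c
#include <assert.h>
#include <stddef.h>

struct region {
    const char *name;
    unsigned kib;       /* approximate size in KiB, from the list above */
};

const struct region sparc32_regions[] = {
    { "Basic OpenBIOS image (inc. Forth dictionary)", 100 },
    { "SPARC32 page tables",                          256 },
    { "Forth (alloc-mem) area",                       128 },
    { "Forth internal malloc() area",                 256 },
    { "Relocated copy of Forth dictionary",           256 },
    { "IOMEM",                                       1024 },
};

#define NUM_REGIONS (sizeof(sparc32_regions) / sizeof(sparc32_regions[0]))

/* Size of the 0xffd0.0000-0xffef.ffff window in KiB. */
unsigned window_kib(void)
{
    return (0xffefffffu - 0xffd00000u + 1u) / 1024u;
}

/* Sum of the listed allocations in KiB. */
unsigned used_kib(void)
{
    unsigned total = 0;
    size_t i;

    for (i = 0; i < NUM_REGIONS; i++)
        total += sparc32_regions[i].kib;
    return total;
}
```

With these numbers the window is 2048K and the listed allocations come to 2020K, leaving only about 28K of slack, which is why any growth in BSS causes trouble.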
I see that for PPC in packages/video.c::init_video the framebuffer is allocated using OFMEM and then mapped 1:1. Blue has suggested that if direct access to the framebuffer is not required then we could try a similar dynamic allocation for SPARC - does anyone know how/where OpenBoot stores its VGA framebuffer?
ATB,
Mark.
On 2010-11-30 4:14 PM, Mark Cave-Ayland wrote:
> [...] I see that for PPC in packages/video.c::init_video the framebuffer is allocated using OFMEM and then mapped 1:1. Blue has suggested that if direct access to the framebuffer is not required then we could try a similar dynamic allocation for SPARC - does anyone know how/where OpenBoot stores its VGA framebuffer?
Assuming the AST driver is VGA (I think it is, but I'm not all that familiar with VGA), it does a simple map-in of BAR 10 for size 80.0000, and smaller amounts for BARs 14 and 18. That probably lands it in the 0xfexx.xxxx window.
Tarl Neustaedter wrote:
>> I see that for PPC in packages/video.c::init_video the framebuffer is allocated using OFMEM and then mapped 1:1. Blue has suggested that if direct access to the framebuffer is not required then we could try a similar dynamic allocation for SPARC - does anyone know how/where OpenBoot stores its VGA framebuffer?
> Assuming the AST driver is VGA (I think it is, but I'm not all that familiar with VGA), it does a simple map-in of BAR 10 for size 80.0000, and smaller amounts for BARs 14 and 18. That probably lands it in the 0xfexx.xxxx window.
Looking at the Debian sparc-utils prtconf example output, I see the following for the SS-5:
Node 0xffd41b1c
    address: ffdcd000
    character-set: 'ISO8859-1'
    intr: 00000039.00000000
    reg: 00000003.00000000.01000000
    dblbuf: 00000000
    vmsize: 00000001
    depth: 00000008
    height: 00000384
    awidth: 00000480
    linebytes: 00000480
    width: 00000480
    emulation: 'cgsix'
    montype: 00000004
    boardrev: 000000a1
    pixfreq: 066ff300
    hfreq: 00011880
    vfreq: 0000004c
    hbporch: 000000c0
    hsync: 00000080
    hfporch: 00000020
    vbporch: 00000021
    vsync: 00000008
    vfporch: 00000002
    fbmapped: 00100000
    global-data: ffef8f00
    oscillators: '84375000,64125000,108000000,94500000'
    chiprev: 0000000b
    device_type: 'display'
    model: 'SUNW,501-2325'
    name: 'cgsix'
This seems to suggest memory within the 0xff rather than the 0xfe range. Interestingly enough, the screen device looks like this:
screen: '/iommu@0,10000000/sbus@0,10001000/cgsix@3,0'
Perhaps it's hiding on the SBUS and so it doesn't need a direct mapping? Does the IOMMU allow paged access into SBUS space to avoid having to map it in its entirety?
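For reference, the hex properties in the prtconf output can be checked with a few lines of C. This assumes that "address" is the virtual address of the mapped framebuffer and that "fbmapped" is the size of that mapping; both are my reading of the output rather than something documented here, and the macro and function names are made up for the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the prtconf output (hexadecimal there). */
#define FB_ADDRESS   0xffdcd000u   /* address   */
#define FB_MAPPED    0x00100000u   /* fbmapped  */
#define FB_HEIGHT    0x384u        /* height    = 900  */
#define FB_LINEBYTES 0x480u        /* linebytes = 1152 */

/* True if addr lies within the inclusive range [lo, hi]. */
bool in_range(uint32_t addr, uint32_t lo, uint32_t hi)
{
    return addr >= lo && addr <= hi;
}

/* Bytes needed for one full 8-bit frame. */
uint32_t fb_bytes(void)
{
    return FB_LINEBYTES * FB_HEIGHT;
}
```

A 1152x900 frame at 8 bits per pixel needs 1036800 bytes, which fits in the 0x100000 (1MB) given by fbmapped, and both 0xffdcd000 and the end of that 1MB lie inside 0xffd0.0000-0xffef.ffff rather than the 0xfe window.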
ATB,
Mark.
On 2010-12-2 6:21 AM, Mark Cave-Ayland wrote:
> [...] This seems to suggest memory within the 0xff rather than the 0xfe range. Interestingly enough, the screen device looks like this:
> screen: '/iommu@0,10000000/sbus@0,10001000/cgsix@3,0'
> Perhaps it's hiding on the SBUS and so it doesn't need a direct mapping? Does the IOMMU allow paged access into SBUS space to avoid having to map it in its entirety?
Not sure I understand your question - but yes, the IOMMU (IO memory management unit) allows paged access into SBUS space. When "map-in" is called, it maps only the specific pages into virtual memory without bringing the rest of SBUS in. We do the same on PCI without having a separate iommu node.
I'll admit that SBUS is at the very edge of my knowledge - my first project at Sun was PCI for SPARC (1994), so SBUS was already on its way out.