On Tue, Apr 10, 2012 at 14:30, Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
On 07/04/12 12:09, Blue Swirl wrote:
This patchset aims to tidy up various parts of OFMEM, with the ultimate aim of reducing the image sizes. The first 4 patches perform the reorganisation, while the remainder of the patches ensure that video memory and the Forth machine memory are no longer contained within the image.
Patches look OK. Coding style could be adjusted while moving the code and 'extern' is not useful in C for function prototypes.
Okay great :) I know you've mentioned the extern in the past, however I'd like to focus on making the API more consistent first and then do a prototype clear-up once all of that is in place.
Note: the resulting SPARC64 image is now approximately half of its original size - BUT because we require 512K alignment for various parts of the image because of the 512K MMU page size, our ability to claim this space back directly is limited. Any further ideas on this would be welcomed.
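To illustrate why shrinking a region does not automatically shrink the image, here is a minimal sketch of the rounding involved. The helper names and the example sizes in the usage note are hypothetical and are not taken from the OpenBIOS sources; only the 512K page size comes from the message above.

```c
#include <assert.h>
#include <stddef.h>

/* 512K MMU page size, as stated in the message above. */
#define PAGE_512K ((size_t)512 * 1024)

/* Round size up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Bytes actually given back when a 512K-aligned region shrinks (hypothetical helper). */
size_t reclaimed_bytes(size_t old_size, size_t new_size)
{
    return align_up(old_size, PAGE_512K) - align_up(new_size, PAGE_512K);
}
```

With these helpers, a region that shrinks from 600K to 520K reclaims nothing, because both sizes round up to 1024K. Space is only returned once a size drops across a 512K boundary.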
At least FSYS_BUF in grubfs could be dynamically allocated and/or shrunk, maybe some others also from this list:

$ nm --size --size-sort obj-sparc64/openbios-builtin.elf.nostrip | egrep -v ' [tTrR] ' | tail
0000000000000100 b dp2.1303
0000000000000100 b file_descriptors
0000000000000100 b obio_cmdline
0000000000000118 d main_ctx
0000000000001000 B dstack
0000000000001000 B rstack
0000000000004000 b image_stack
0000000000008000 B FSYS_BUF
0000000000080000 d forth_dictionary
00000000000c0000 b s_ofmem_data
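The suggestion to allocate FSYS_BUF dynamically could look roughly like the sketch below. This is only an illustration: `FSYS_BUF_SIZE`, `fsys_buf_get` and `fsys_buf_release` are hypothetical names, the size is the 0x8000 shown in the nm listing, and `malloc`/`free` stand in for whatever allocator the firmware would really use.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical constant; 0x8000 is the FSYS_BUF size from the nm listing. */
#define FSYS_BUF_SIZE 0x8000

/* Only a pointer lives in .bss now, instead of the whole 32K array. */
static unsigned char *fsys_buf;

/* Return the buffer, allocating it on first use; NULL if allocation fails. */
unsigned char *fsys_buf_get(void)
{
    if (fsys_buf == NULL)
        fsys_buf = malloc(FSYS_BUF_SIZE); /* stand-in for the firmware allocator */
    return fsys_buf;
}

/* Give the memory back once the filesystem code no longer needs it. */
void fsys_buf_release(void)
{
    free(fsys_buf);
    fsys_buf = NULL;
}
```

The trade-off is that callers have to cope with a NULL return, which a static array never required.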
It wouldn't help here, but sections .text and .rodata could be merged, since GCC already assumes that .text is readable. This would save one TLB entry, since the combined size is still < 512K.
With my patchset applied, SPARC64 now looks like this:
$ nm --size --size-sort obj-sparc64/openbios-builtin.elf.nostrip | egrep -v ' [tTrR] ' | tail
0000000000000100 b dp2.1302
0000000000000100 b file_descriptors
0000000000000100 b obio_cmdline
0000000000000118 d main_ctx
0000000000001000 B dstack
0000000000001000 B rstack
0000000000004000 b image_stack
0000000000008000 B FSYS_BUF
0000000000020000 b s_ofmem_data
0000000000080000 d forth_dictionary
So by reducing s_ofmem_data, the main component is now the size of the Forth dictionary. Hmmm I've had an idea though - we could compress the dictionary, and then decompress/relocate into another set of OFMEM-allocated memory in a similar way so that the entire dictionary doesn't have to be part of the image.
For example:
$ ls -l obj-sparc64/openbios-sparc64.dict
-rw-r--r-- 1 build build 181560 Apr 10 15:08 obj-sparc64/openbios-sparc64.dict
Now we need to allocate 512K in total within the image in order to allow for new entries created by bootloaders etc. If we compress the image as well, e.g. using gzip:
$ ls -l obj-sparc64/openbios-sparc64.dict.gz
-rw-r--r-- 1 build build 37933 Apr 10 15:08 obj-sparc64/openbios-sparc64.dict.gz
Then we can dramatically reduce the dictionary usage within the file too. I wonder if we could incorporate a simple decompressor based upon something like pdlzip (http://lzip.nongnu.org/pdlzip.html) to perform LZMA decompression/relocation on the dictionary early within the startup process?
ROM image size is not very interesting, but CPU cycles wasted at startup are. I think run time memory needs would not be changed by compression, since the dictionary needs to be uncompressed to memory anyway before it can be used.
But we could avoid copying the dictionary when copying ROM to RAM initially. Then the dictionary could be copied to OFMEM allocated area before Forth start. Some large arrays (FSYS_BUF, dstack, rstack etc.) could also be allocated before use. Would this help?
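A minimal sketch of copying the dictionary into a separately allocated area before Forth starts might look like this. The struct and function names are hypothetical, `calloc` stands in for an OFMEM allocation, and the sketch assumes the dictionary can be used after a plain copy; any pointer fix-ups a real relocation might need are not shown.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor for the relocated dictionary. */
struct dict_region {
    unsigned char *base; /* start of the writable copy */
    size_t used;         /* bytes occupied by the prebuilt dictionary */
    size_t capacity;     /* total space, including headroom for new words */
};

/*
 * Copy a read-only dictionary image into freshly allocated, zeroed memory.
 * Returns 0 on success, -1 if the image does not fit or allocation fails.
 * On failure the descriptor is left untouched.
 */
int dict_relocate(struct dict_region *r, const unsigned char *image,
                  size_t image_len, size_t capacity)
{
    unsigned char *mem;

    if (image_len > capacity)
        return -1;
    mem = calloc(1, capacity); /* stand-in for an OFMEM allocation */
    if (mem == NULL)
        return -1;
    memcpy(mem, image, image_len);
    r->base = mem;
    r->used = image_len;
    r->capacity = capacity;
    return 0;
}
```

With the sizes quoted earlier, the call would pass an image length of 181560 and a capacity of 512K, so only the 181560 bytes of prebuilt dictionary would need to live in the image while the headroom is allocated at run time.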
ATB,
Mark.