On 07/04/12 12:09, Blue Swirl wrote:
This patchset aims to tidy up various parts of OFMEM, with the ultimate aim of reducing the image sizes. The first 4 patches perform the reorganisation, while the remainder of the patches ensure that video memory and the Forth machine memory are no longer contained within the image.
Patches look OK. Coding style could be adjusted while moving the code and 'extern' is not useful in C for function prototypes.
Okay great :) I know you've mentioned the extern in the past, however I'd like to focus on making the API more consistent first and then do a prototype clear-up once all of that is in place.
Note: the resulting SPARC64 image is now approximately half of its original size, BUT since the 512K MMU page size forces 512K alignment for various parts of the image, our ability to claim this space back directly is limited. Any further ideas on this would be welcomed.
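To illustrate the alignment cost described above, here is a minimal sketch in C. The names `PAGE_512K`, `align_up` and `padding_for` are made up for this illustration and are not OpenBIOS identifiers; it only assumes that each separately mapped region is rounded up to a whole 512K page.

```c
#include <assert.h>
#include <stddef.h>

/* 512K MMU page size, as mentioned in the thread. */
#define PAGE_512K ((size_t)512 * 1024)

/* Round size up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Bytes lost to padding when a region of `size` bytes occupies whole pages. */
static size_t padding_for(size_t size, size_t align)
{
    return align_up(size, align) - size;
}
```

Under that assumption, a region shrunk to 0x20000 bytes still occupies a full 0x80000-byte page when it has to be mapped on its own, so `padding_for(0x20000, PAGE_512K)` comes to 0x60000 bytes that cannot be reclaimed directly.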
At least FSYS_BUF in grubfs could be dynamically allocated and/or shrunk, maybe some others also from this list:
$ nm --size --size-sort obj-sparc64/openbios-builtin.elf.nostrip | egrep -v ' [tTrR] ' | tail
0000000000000100 b dp2.1303
0000000000000100 b file_descriptors
0000000000000100 b obio_cmdline
0000000000000118 d main_ctx
0000000000001000 B dstack
0000000000001000 B rstack
0000000000004000 b image_stack
0000000000008000 B FSYS_BUF
0000000000080000 d forth_dictionary
00000000000c0000 b s_ofmem_data
It wouldn't help here, but sections .text and .rodata could be merged; GCC already assumes that .text is readable. This would save one TLB entry since the combined size is still < 512K.
With my patchset applied, SPARC64 now looks like this:
$ nm --size --size-sort obj-sparc64/openbios-builtin.elf.nostrip | egrep -v ' [tTrR] ' | tail
0000000000000100 b dp2.1302
0000000000000100 b file_descriptors
0000000000000100 b obio_cmdline
0000000000000118 d main_ctx
0000000000001000 B dstack
0000000000001000 B rstack
0000000000004000 b image_stack
0000000000008000 B FSYS_BUF
0000000000020000 b s_ofmem_data
0000000000080000 d forth_dictionary
So by reducing s_ofmem_data, the main component is now the size of the Forth dictionary. Hmmm I've had an idea though - we could compress the dictionary, and then decompress/relocate into another set of OFMEM-allocated memory in a similar way so that the entire dictionary doesn't have to be part of the image.
For example:
$ ls -l obj-sparc64/openbios-sparc64.dict
-rw-r--r-- 1 build build 181560 Apr 10 15:08 obj-sparc64/openbios-sparc64.dict
Now we need to allocate 512K in total within the image in order to allow for new entries created by bootloaders etc. If we compress the image as well, e.g. using gzip:
$ ls -l obj-sparc64/openbios-sparc64.dict.gz
-rw-r--r-- 1 build build 37933 Apr 10 15:08 obj-sparc64/openbios-sparc64.dict.gz
Then we can dramatically reduce the dictionary usage within the file too. I wonder if we could incorporate a simple decompressor based upon something like pdlzip (http://lzip.nongnu.org/pdlzip.html) to perform LZMA decompression/relocation on the dictionary early within the startup process?
ATB,
Mark.
On 10/04/12 15:30, Mark Cave-Ayland wrote:
Then we can dramatically reduce the dictionary usage within the file too. I wonder if we could incorporate a simple decompressor based upon something like pdlzip (http://lzip.nongnu.org/pdlzip.html) to perform LZMA decompression/relocation on the dictionary early within the startup process?
Actually, it seems as if zlib contains a miniature inflator called puff which seems license-friendly and designed for embedded applications:
/* puff.h
  Copyright (C) 2002-2010 Mark Adler, all rights reserved
  version 2.2, 25 Apr 2010

  This software is provided 'as-is', without any express or implied warranty. In no event will the author be held liable for any damages arising from the use of this software.

  Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions:

  1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.
  3. This notice may not be removed or altered from any source distribution.

  Mark Adler    madler@alumni.caltech.edu
 */
Does anyone see any reason why we couldn't use this code within OpenBIOS to decompress the Forth dictionary in OFMEM-allocated memory at runtime? I think with a little bit of extra work we could simply use gzip on the command line in order to compress the dictionary at build time.
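One piece of the "extra work" is that puff inflates a raw deflate stream, whereas the gzip command wraps that stream in a header and trailer (RFC 1952). The following is a rough sketch, not tested against OpenBIOS, of locating the raw stream inside a gzip file held in memory. `gzip_find_payload` is a hypothetical helper name; the `puff()` call in the comment follows the prototype in puff.h as I understand it.

```c
#include <assert.h>
#include <stddef.h>

/* gzip FLG bits, per RFC 1952 */
#define GZ_FHCRC    0x02
#define GZ_FEXTRA   0x04
#define GZ_FNAME    0x08
#define GZ_FCOMMENT 0x10

/*
 * Locate the raw deflate stream inside a single gzip member in buf[0..len).
 * On success returns 0 and sets *off and *paylen to the payload offset and
 * length, and *isize to the uncompressed size (mod 2^32) from the trailer.
 * Returns -1 if the buffer does not look like a gzip member.
 *
 * Intended use with puff (assumption, see puff.h):
 *   unsigned long dlen = isize, slen = paylen;
 *   ret = puff(dest, &dlen, buf + off, &slen);    (0 means success)
 */
static int gzip_find_payload(const unsigned char *buf, size_t len,
                             size_t *off, size_t *paylen, unsigned long *isize)
{
    size_t pos = 10;                    /* fixed part of the header */
    unsigned flags;

    assert(buf && off && paylen && isize);

    /* 10-byte header + 8-byte trailer is the minimum; CM 8 means deflate */
    if (len < 18 || buf[0] != 0x1f || buf[1] != 0x8b || buf[2] != 8)
        return -1;
    flags = buf[3];
    if (flags & 0xe0)                   /* reserved bits must be zero */
        return -1;

    if (flags & GZ_FEXTRA) {            /* 2-byte little-endian length + data */
        if (pos + 2 > len)
            return -1;
        pos += 2 + ((size_t)buf[pos] | ((size_t)buf[pos + 1] << 8));
    }
    if (flags & GZ_FNAME) {             /* zero-terminated file name */
        while (pos < len && buf[pos] != 0)
            pos++;
        pos++;
    }
    if (flags & GZ_FCOMMENT) {          /* zero-terminated comment */
        while (pos < len && buf[pos] != 0)
            pos++;
        pos++;
    }
    if (flags & GZ_FHCRC)               /* 2-byte header CRC */
        pos += 2;

    if (pos > len || len - pos < 8)     /* must leave room for CRC32 + ISIZE */
        return -1;

    *off = pos;
    *paylen = len - pos - 8;
    *isize = (unsigned long)buf[len - 4]
           | ((unsigned long)buf[len - 3] << 8)
           | ((unsigned long)buf[len - 2] << 16)
           | ((unsigned long)buf[len - 1] << 24);
    return 0;
}
```

The ISIZE field could be used to size the OFMEM allocation before inflating. Running `gzip -n` at build time would keep the file name and timestamp out of the header, although the parser above copes with either form.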
ATB,
Mark.
On Tue, Apr 10, 2012 at 14:30, Mark Cave-Ayland mark.cave-ayland@ilande.co.uk wrote:
Then we can dramatically reduce the dictionary usage within the file too. I wonder if we could incorporate a simple decompressor based upon something like pdlzip (http://lzip.nongnu.org/pdlzip.html) to perform LZMA decompression/relocation on the dictionary early within the startup process?
ROM image size is not very interesting, but CPU cycles wasted at startup are. I think run-time memory needs would not be changed by compression, since the dictionary needs to be uncompressed into memory anyway before it can be used.
But we could avoid copying the dictionary when copying ROM to RAM initially. Then the dictionary could be copied to OFMEM allocated area before Forth start. Some large arrays (FSYS_BUF, dstack, rstack etc.) could also be allocated before use. Would this help?
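As a rough sketch of the copy-before-Forth-start idea, assuming some allocator is available at that point: `region_alloc_fn`, `dict_relocate` and `demo_alloc` are hypothetical names, and the malloc-based allocator only stands in for whatever OFMEM call would really be used. Fixing up any pointers stored inside the dictionary is out of scope here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator hook; a real build would wrap an OFMEM call. */
typedef void *(*region_alloc_fn)(size_t size, size_t align);

/* Hosted stand-in used only to exercise the sketch; it ignores align. */
static void *demo_alloc(size_t size, size_t align)
{
    (void)align;
    return malloc(size);
}

/*
 * Copy the `used` bytes of the ROM-resident dictionary into a freshly
 * allocated region of `total` bytes and zero the remainder, leaving room
 * for words defined later (e.g. by bootloaders).
 * Returns the new base address, or NULL on failure.
 */
static unsigned char *dict_relocate(const unsigned char *rom_dict, size_t used,
                                    size_t total, size_t align,
                                    region_alloc_fn alloc)
{
    unsigned char *ram;

    assert(rom_dict && alloc);

    if (used > total)
        return NULL;

    ram = alloc(total, align);
    if (!ram)
        return NULL;

    memcpy(ram, rom_dict, used);            /* the only copy of the dictionary */
    memset(ram + used, 0, total - used);    /* growth space starts out empty */
    return ram;
}
```

With this shape only the used part of the dictionary (about 180K in the listing quoted earlier) would need to live in the image, while the full 512K would exist only in allocated RAM. The same hook could in principle serve FSYS_BUF, dstack and rstack.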
ATB,
Mark.