On 19/08/08 19:26 +0200, Stefan Reinauer wrote:
-- coresystems GmbH • Brahmsstr. 16 • D-79104 Freiburg i. Br. Tel.: +49 761 7668825 • Fax: +49 761 7664613 Email: info@coresystems.de • http://www.coresystems.de/ Registergericht: Amtsgericht Freiburg • HRB 7656 Geschäftsführer: Stefan Reinauer • Ust-IdNr.: DE245674866
Add memalign, get aligned memory.
Signed-off-by: Patrick Georgi patrick.georgi@coresystems.de
This is a terrible amount of complexity for a simple aligned memory function. I realize you are trying to fit within the 16k boundaries, but come on - addr = ((addr + align - 1) / align) * align) is really all you need. Yes, I know that ends up wasting memory.
So more importantly then this individual patch, it seems pretty clear that we are getting ready to push the boundaries of the static 16k heap. The time has come to replace it a dynamic engine. The dynamic engine can take more advantage of the system memory, and we won't care so much if we end up losing 13 bytes with a 16 byte alignment.
The questions now are - where should we put the heap? What sort of implementation should we go for? Is the current mode scalable (probably not)? I probably favor a linked list implementation. What restrictions should we have on the size of the heap or the number of allocations?
Jordan
Index: include/libpayload.h
--- include/libpayload.h (revision 3519) +++ include/libpayload.h (working copy) @@ -158,6 +158,7 @@ void *malloc(size_t size); void *calloc(size_t nmemb, size_t size); void *realloc(void *ptr, size_t size); +void *memalign(size_t align, size_t size);
/* libc/exec.c */ int exec(long addr, int argc, char **argv); Index: libc/malloc.c =================================================================== --- libc/malloc.c (revision 3519) +++ libc/malloc.c (working copy) @@ -63,6 +63,7 @@ #define IS_FREE(_h) (((_h) & (MAGIC | FLAG_FREE)) == (MAGIC | FLAG_FREE)) #define HAS_MAGIC(_h) (((_h) & MAGIC) == MAGIC)
+static int free_aligned(void* addr); void print_malloc_map(void);
static void setup(void) @@ -164,6 +165,8 @@ { hdrtype_t hdr;
if (free_aligned(ptr)) return;
ptr -= HDRSIZE;
/* Sanity check. */
@@ -237,6 +240,109 @@ return ret; }
+struct align_region_t +{
- int alignment;
- /* start in memory, and size in bytes */
- void* start;
- int size;
- /* layout within a region:
- num_elements bytes, 0: free, 1: used, 2: used, combines with next
- padding to alignment
- data section
- waste space
start_data points to the start of the data section
- */
- void* start_data;
- /* number of free blocks sized "alignment" */
- int free;
- struct align_region_t *next;
+};
+static struct align_region_t* align_regions = 0;
+static struct align_region_t *allocate_region(struct align_region_t *old_first, int alignment, int num_elements) +{
- struct align_region_t *new_region = malloc(sizeof(struct align_region_t));
- new_region->alignment = alignment;
- new_region->start = malloc((num_elements+1) * alignment + num_elements);
- new_region->start_data = (void*)((u32)(new_region->start + num_elements + alignment - 1) & (~(alignment-1)));
- new_region->size = num_elements * alignment;
- new_region->free = num_elements;
- new_region->next = old_first;
- memset(new_region->start, 0, num_elements);
- return new_region;
+}
+static int free_aligned(void* addr) +{
- struct align_region_t *reg = align_regions;
- while (reg != 0)
- {
if ((addr >= reg->start_data) && (addr < reg->start_data + reg->size))
{
int i = (reg->start_data-addr)/reg->alignment;
while (((u8*)reg->start)[i]==2)
{
((u8*)reg->start)[i++]=0;
reg->free++;
}
((u8*)reg->start)[i]=0;
reg->free++;
return 1;
}
reg = reg->next;
- }
- return 0;
+}
+void *memalign(size_t align, size_t size) +{
- if (align_regions == 0) {
align_regions = malloc(sizeof(struct align_region_t));
memset(align_regions, 0, sizeof(struct align_region_t));
- }
- struct align_region_t *reg = align_regions;
+look_further:
- while (reg != 0)
- {
if ((reg->alignment == align) && (reg->free >= (size + align - 1)/align))
{
break;
}
reg = reg->next;
- }
- if (reg == 0)
- {
align_regions = allocate_region(align_regions, align, (size/align<99)?100:((size/align)+1));
reg = align_regions;
- }
- int i = 0, count, target = (size+align-1)/align;
- for (count = 0; i < (reg->size/align); i++)
- {
if (((u8*)reg->start)[i] == 0)
{
count++;
if (count == target) {
count = i+1-count;
for (i=0; i<target-1; i++)
{
((u8*)reg->start)[count+i]=2;
}
((u8*)reg->start)[target]=1;
reg->free -= target;
return reg->start_data+(align*count);
}
} else
{
count = 0;
}
- }
- goto look_further; // end condition is once a new region is allocated - it always has enough space
+}
/* This is for debugging purposes. */ #ifdef TEST void print_malloc_map(void)
-- coreboot mailing list coreboot@coreboot.org http://www.coreboot.org/mailman/listinfo/coreboot
Jordan Crouse wrote:
On 19/08/08 19:26 +0200, Stefan Reinauer wrote:
-- coresystems GmbH • Brahmsstr. 16 • D-79104 Freiburg i. Br. Tel.: +49 761 7668825 • Fax: +49 761 7664613 Email: info@coresystems.de • http://www.coresystems.de/ Registergericht: Amtsgericht Freiburg • HRB 7656 Geschäftsführer: Stefan Reinauer • Ust-IdNr.: DE245674866
Add memalign, get aligned memory.
Signed-off-by: Patrick Georgi patrick.georgi@coresystems.de
This is a terrible amount of complexity for a simple aligned memory function. I realize you are trying to fit within the 16k boundaries, but come on - addr = ((addr + align - 1) / align) * align) is really all you need. Yes, I know that ends up wasting memory.
I remember from Patrick's explanation was that the memory should be freeable with the normal free().
So a bit of complexity is also in that.
Patrick can explain this better.
Sure, it can be done simpler and less efficient. But from previous discussions I got the impression that this is not necessarily what we want.
So more importantly then this individual patch, it seems pretty clear that we are getting ready to push the boundaries of the static 16k heap. The time has come to replace it a dynamic engine. The dynamic engine can take more advantage of the system memory, and we won't care so much if we end up losing 13 bytes with a 16 byte alignment.
The questions now are - where should we put the heap? What sort of implementation should we go for? Is the current mode scalable (probably not)? I probably favor a linked list implementation. What restrictions should we have on the size of the heap or the number of allocations?
With regard to the fact that quite some payloads might want to load an OS at an arbitrary address, limiting the available memory to something considerably less than the maximum amount of memory makes sense. Does it?
If we agree on that, the "dynamic" part would merely be to determine a good _heap and _eheap address during runtime.
I'm not convinced yet, that the current scheme needs to be changed at all. It's working quite nicely for us. But then again, if you're going to make things more dynamic, I am all for it.
Stefan
On 19/08/08 20:24 +0200, Stefan Reinauer wrote:
Jordan Crouse wrote:
On 19/08/08 19:26 +0200, Stefan Reinauer wrote:
-- coresystems GmbH • Brahmsstr. 16 • D-79104 Freiburg i. Br. Tel.: +49 761 7668825 • Fax: +49 761 7664613 Email: info@coresystems.de • http://www.coresystems.de/ Registergericht: Amtsgericht Freiburg • HRB 7656 Geschäftsführer: Stefan Reinauer • Ust-IdNr.: DE245674866
Add memalign, get aligned memory.
Signed-off-by: Patrick Georgi patrick.georgi@coresystems.de
This is a terrible amount of complexity for a simple aligned memory function. I realize you are trying to fit within the 16k boundaries, but come on - addr = ((addr + align - 1) / align) * align) is really all you need. Yes, I know that ends up wasting memory.
I remember from Patrick's explanation was that the memory should be freeable with the normal free().
So a bit of complexity is also in that.
Patrick can explain this better.
Sure, it can be done simpler and less efficient. But from previous discussions I got the impression that this is not necessarily what we want.
So more importantly then this individual patch, it seems pretty clear that we are getting ready to push the boundaries of the static 16k heap. The time has come to replace it a dynamic engine. The dynamic engine can take more advantage of the system memory, and we won't care so much if we end up losing 13 bytes with a 16 byte alignment.
The questions now are - where should we put the heap? What sort of implementation should we go for? Is the current mode scalable (probably not)? I probably favor a linked list implementation. What restrictions should we have on the size of the heap or the number of allocations?
With regard to the fact that quite some payloads might want to load an OS at an arbitrary address, limiting the available memory to something considerably less than the maximum amount of memory makes sense. Does it?
If we agree on that, the "dynamic" part would merely be to determine a good _heap and _eheap address during runtime.
I'm not convinced yet, that the current scheme needs to be changed at all. It's working quite nicely for us. But then again, if you're going to make things more dynamic, I am all for it.
If it was working nicely for us, then memalign() would have been at most an extra line or two.
Jordan