On Friday, March 14, 2014 12:23:03 PM Aaron Durbin wrote:
I found out that keeping track of the last map() pointer allows you to free the cache during unmap(), as currently, operations come in map/unap pairs. Sadly, this is just an accident of how CBFS core works. This design also places the burden of cache management on the backend, which also has to deal with whatever device it deals with. A pretty bad abstraction model.
Those map/unmap pairs are only when walking. The following functions offer no ability to free the used resources in the common case usage of CBFS_DEFAULT_MEDIA:
struct cbfs_file *cbfs_get_file(struct cbfs_media *media, const char *name); void *cbfs_get_file_content(struct cbfs_media *media, const char *name, int type, size_t *sz); const struct cbfs_header *cbfs_get_header(struct cbfs_media *media);
If one is using cbfs with those functions resources are leaked.
So CBFS design is really b0rk3d then. I don't think the excuse that coreboot runs only very briefly is valid in this case.
I don't think the cache management is a bad abstraction. It allows for the backing store to optimize, but I can see the downsides as well. All this talk is very cbfs-centric, but if something is needing multiple pieces of data at once from that then those buffers need to be managed. Either that is done explicitly by the caller or internally to the storage supplier. The issue is finding a solution that targets the least common denominator: not everything has a large amount of memory to play with at the time of cbfs accesses.
map(...) { void *buf = buffer_gen(size); read (offset, buf); return buf }
unmap(void * trilulilu) { buffer_free(trilulilu); }
And no, you can't use the simple_buffer concept to do this elegantly.
Another idea is to abstract this into a buffered_cbfs_reader class, which provides the CBFS backend with buffer management, then calls read(), and only read() on the backend media.
static blablabla buffered_reader_memory_map(blablabla) { void *buf = chuck_norris_buffer_create(size); media->context->read(offset, buf); return buf; }
static blablabla buffered_reader_unmap(void *blablabla) { chuck_norris_buffer_release(blablabla); }
int init_default_cbfs_media(struct cbfs_media *media) { static struct non_mmapped_backed default_backend;
default_mmapped_backed_init(&default_backend);
media->open = default_backend.open; media->close = default_backend.close; media->map = buffered_reader_memory_map; media->unmap = buffered_reader_unmap; media->read = default_backend.read; media->context = &default_backend; }
This is of course just a shim which connects CBFS, Chuck Norris buffer management, and a non-mmapped media backend. The buffer management is not necessarily an easy task, but this is as simple as it gets, given the current CBFS API. But, the difficulty now lies in making chuck_norris_buffer efficient, which is a smaller problem than the initial.
OK, this doesn't solve the issue of resource leakage. Here, the CBFS core needs to release its resources. It's a cleanup problem, however, not a trivial one.
The CBFS core doing map() + memcpy() doubles the resource requirements with no real gain.
I agree, but depending on where those memcpy()'s are happening it is somewhat minor (which I think is the current reality). I will point out that cbfs_meda has a read() function. You could have addressed that with a patch, no?
No. The design of the CBFS core generally assumes mmapped storage is used. That's the same reason you guys had a hard time refactoring the CBFS core back in 2011 for ARM. It's the same reason you guys didn't do the best job at it (no bashing/disrespect meant). It's the same reason we have this awkward caching need, and the same reason I can't fix this with just a simple patch. I estimate more than a dozen patches to do "get romstage from CBFS with only one read()" The Right Way.
I actually looked at reducing usage of map(). It's nowhere near as straightforward as it seems. You either break/create inefficient usage on MM, or non-MM media. This is an apple and orange problem,
Lastly, the big reason for large map() requests is because we don't have a pipelined lzma path.
Compression is "don't care" for the CBFS core until the stage is decompressed. By the time you realize you don't need to decompress, you've already map()ed the whole of romstage, which you now need to memcpy() to its destination.
Agreed, but see my list of functions above which are the cause of that. e.g. cbfs_load_stage() calls cbfs_get_file_content(). That's your big mapping. However, the compression case I brought up is still not possible to pipeline because of the lzma infrastructure we have. That still needs to be addressed.
Remember that "coreboot runs only very briefly"? Once you have RAM up, this excuse becomes valid. So anything in from late romstage onwards can malloc() whatever it pleases, and forget about free()ing it. Since we don't bother with compressed files until ramstage decompression, I don't generally care about such inefficiencies at this point.
Short story: There's no use case for optimizing lzma paths. So, if we can somehow figure out how to read(), instead of map() romstage while we're still in the bootblock...
I only initialize the CBFS mirror once. If the code finds a valid CBFS signature in RAM, it skips the whole re-read-this step.
Duh. I missed this piece:
if (CBFS_HEADER_MAGIC == ntohl(header->magic)) { printk(BIOS_ERR, "Already have CBFS master header at %p\n", header); return 0; }
I figured optimizations can't hurt :).
Alex
Just to step back here.
The flash is 8M. The system has 4G. What do I care if I leak memory? What's the issue you are solving?
ron
On Fri, Mar 14, 2014 at 1:54 PM, mrnuke mr.nuke.me@gmail.com wrote:
On Friday, March 14, 2014 12:23:03 PM Aaron Durbin wrote:
I found out that keeping track of the last map() pointer allows you to free the cache during unmap(), as currently, operations come in map/unap pairs. Sadly, this is just an accident of how CBFS core works. This design also places the burden of cache management on the backend, which also has to deal with whatever device it deals with. A pretty bad abstraction model.
Those map/unmap pairs are only when walking. The following functions offer no ability to free the used resources in the common case usage of CBFS_DEFAULT_MEDIA:
struct cbfs_file *cbfs_get_file(struct cbfs_media *media, const char *name); void *cbfs_get_file_content(struct cbfs_media *media, const char *name, int type, size_t *sz); const struct cbfs_header *cbfs_get_header(struct cbfs_media *media);
If one is using cbfs with those functions resources are leaked.
So CBFS design is really b0rk3d then. I don't think the excuse that coreboot runs only very briefly is valid in this case.
I don't think the cache management is a bad abstraction. It allows for the backing store to optimize, but I can see the downsides as well. All this talk is very cbfs-centric, but if something is needing multiple pieces of data at once from that then those buffers need to be managed. Either that is done explicitly by the caller or internally to the storage supplier. The issue is finding a solution that targets the least common denominator: not everything has a large amount of memory to play with at the time of cbfs accesses.
map(...) { void *buf = buffer_gen(size); read (offset, buf); return buf }
unmap(void * trilulilu) { buffer_free(trilulilu); }
And no, you can't use the simple_buffer concept to do this elegantly.
Another idea is to abstract this into a buffered_cbfs_reader class, which provides the CBFS backend with buffer management, then calls read(), and only read() on the backend media.
static blablabla buffered_reader_memory_map(blablabla) { void *buf = chuck_norris_buffer_create(size); media->context->read(offset, buf); return buf; }
static blablabla buffered_reader_unmap(void *blablabla) { chuck_norris_buffer_release(blablabla); }
int init_default_cbfs_media(struct cbfs_media *media) { static struct non_mmapped_backed default_backend;
default_mmapped_backed_init(&default_backend); media->open = default_backend.open; media->close = default_backend.close; media->map = buffered_reader_memory_map; media->unmap = buffered_reader_unmap; media->read = default_backend.read; media->context = &default_backend;
}
This is of course just a shim which connects CBFS, Chuck Norris buffer management, and a non-mmapped media backend. The buffer management is not necessarily an easy task, but this is as simple as it gets, given the current CBFS API. But, the difficulty now lies in making chuck_norris_buffer efficient, which is a smaller problem than the initial.
OK, this doesn't solve the issue of resource leakage. Here, the CBFS core needs to release its resources. It's a cleanup problem, however, not a trivial one.
The CBFS core doing map() + memcpy() doubles the resource requirements with no real gain.
I agree, but depending on where those memcpy()'s are happening it is somewhat minor (which I think is the current reality). I will point out that cbfs_meda has a read() function. You could have addressed that with a patch, no?
No. The design of the CBFS core generally assumes mmapped storage is used. That's the same reason you guys had a hard time refactoring the CBFS core back in 2011 for ARM. It's the same reason you guys didn't do the best job at it (no bashing/disrespect meant). It's the same reason we have this awkward caching need, and the same reason I can't fix this with just a simple patch. I estimate more than a dozen patches to do "get romstage from CBFS with only one read()" The Right Way.
I actually looked at reducing usage of map(). It's nowhere near as straightforward as it seems. You either break/create inefficient usage on MM, or non-MM media. This is an apple and orange problem,
Lastly, the big reason for large map() requests is because we don't have a pipelined lzma path.
Compression is "don't care" for the CBFS core until the stage is decompressed. By the time you realize you don't need to decompress, you've already map()ed the whole of romstage, which you now need to memcpy() to its destination.
Agreed, but see my list of functions above which are the cause of that. e.g. cbfs_load_stage() calls cbfs_get_file_content(). That's your big mapping. However, the compression case I brought up is still not possible to pipeline because of the lzma infrastructure we have. That still needs to be addressed.
Remember that "coreboot runs only very briefly"? Once you have RAM up, this excuse becomes valid. So anything in from late romstage onwards can malloc() whatever it pleases, and forget about free()ing it. Since we don't bother with compressed files until ramstage decompression, I don't generally care about such inefficiencies at this point.
I don't recall the "coreboot runs only very briefly." However, once you do have ram the question is where does one get the ram? I think you are trivializing the problem -- not impossible, but it's definitely not just malloc().
Short story: There's no use case for optimizing lzma paths. So, if we can somehow figure out how to read(), instead of map() romstage while we're still in the bootblock...
I only initialize the CBFS mirror once. If the code finds a valid CBFS signature in RAM, it skips the whole re-read-this step.
Duh. I missed this piece:
if (CBFS_HEADER_MAGIC == ntohl(header->magic)) { printk(BIOS_ERR, "Already have CBFS master header at %p\n", header); return 0; }
I figured optimizations can't hurt :).
Alex