Hi Melissa,
On 28.09.2017 07:59, Melissa Yi wrote:
Hi All, Does anyone know how to implement variable read/write in coreboot? E.g. how to create an NVRAM block to be accessed by a UEFI payload?
I don't know any quick solution. Though, here are my thoughts on the matter: AFAIK, coreboot currently doesn't have any specified format for in-flash, mutable configuration variables. Tianocore (or UEFI in general) might already have a format for that, but I would rather introduce something more native to coreboot that can easily be used in coreboot and all its payloads.
I'll try to describe my ideas for a complete solution (and where we already have parts implemented and where not) in some layers from the bottom up:
Flash-chip access
=================
You need read/write access to the firmware flash chip. It's safe to assume that coreboot at least knows how to read from its flash. And I would argue that writing variables from coreboot itself can be considered optional (or should even be discouraged). Though, when it comes to payloads, flash support is scarce.

One option, and my preferred one, is to use libflashrom, which can currently be compiled and linked with libpayload. I guess it's not feasible to link the latter together with Tianocore (which seems to have its own implementation of the C library), so porting libflashrom to Tianocore is probably the way to go. (One alternative would be to only integrate drivers for the specific chipsets / chips one needs. But I can only advise against it. We should focus on one sophisticated code base for the matter, and that is flashrom.)
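For illustration, here is a hedged read sketch against libflashrom's public API (entry points as of this writing; error handling is abbreviated, and "internal" is assumed to select the on-board programmer):

  #include <stddef.h>
  #include <stdlib.h>
  #include <libflashrom.h>

  /* Read the full flash image through libflashrom. Returns a malloc'd
   * buffer (caller frees) or NULL on failure. */
  static void *read_flash_image(size_t *size)
  {
      struct flashrom_programmer *prog;
      struct flashrom_flashctx *flash;
      void *buf = NULL;

      if (flashrom_init(1) ||
          flashrom_programmer_init(&prog, "internal", NULL))
          return NULL;
      if (flashrom_flash_probe(&flash, prog, NULL))
          goto out;

      *size = flashrom_flash_getsize(flash);
      buf = malloc(*size);
      if (buf && flashrom_image_read(flash, buf, *size)) {
          free(buf); /* read failed */
          buf = NULL;
      }
      flashrom_flash_release(flash);
  out:
      flashrom_programmer_shutdown(prog);
      return buf;
  }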
Flash-chip partitioning
=======================
coreboot uses its native FMAP to describe the layout of the flash chip. libpayload has some code for it. Should be easy to find a place and add a new entry for variables.
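For illustration, a sketch of what such an entry could look like in a flashmap descriptor (.fmd) file; all names, offsets and sizes below are made up:

  FLASH@0x0 0x800000 {
      RW_VARS@0x0 0x10000
      FMAP@0x10000 0x200
      COREBOOT(CBFS)@0x10200 0x7efe00
  }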
In-flash variable format
========================
I like to keep things simple and assume that a key=value approach suffices (for this layer). Though, as we certainly have to deal with NOR flash chips (and their erase blocks), we can't just store things in a table that would require an erase cycle for every change of a variable. Here is a random proposal:
  VARIABLE MAGIC  (4 bytes, for sanity checks)
  COMPLETE LENGTH (4 bytes, length in bytes of this entry)
  KEY LENGTH      (4 bytes, length of the key name; if 0 or 0xffffffff,
                   the entry is invalidated)
  KEY NAME        (KEY LENGTH bytes)
  VALUE           (COMPLETE LENGTH - KEY LENGTH - 12 bytes)
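A hedged C rendering of that layout (field names and the magic value are illustrative; this is a sketch of the proposal, not a settled spec):

  #include <stdint.h>

  #define VAR_MAGIC 0x56415221 /* made-up magic for sanity checks */

  struct var_entry {
      uint32_t magic;     /* VAR_MAGIC */
      uint32_t total_len; /* length in bytes of this whole entry */
      uint32_t key_len;   /* 0 or 0xffffffff: entry is invalidated */
      char data[];        /* key_len bytes of key name, followed by
                             total_len - key_len - 12 bytes of value */
  };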
Read strategy: Either cache and index the whole thing, or read until the first valid entry with matching key.
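For illustration, a minimal sketch of the scan-on-read variant, assuming the variable region is memory-mapped and using the struct sketched above:

  #include <stddef.h>
  #include <stdint.h>
  #include <string.h>

  /* Scan until the first valid entry with a matching key. Invalidated
   * entries (key_len 0 or 0xffffffff) can never match a real key, so
   * they are skipped implicitly. */
  static const struct var_entry *var_find(const void *region, size_t size,
                                          const char *key)
  {
      const uint8_t *p = region;
      size_t off = 0, key_len = strlen(key);

      while (off + sizeof(struct var_entry) <= size) {
          const struct var_entry *e = (const void *)(p + off);

          if (e->magic != VAR_MAGIC)
              break; /* erased flash: end of used space */
          if (e->key_len == key_len && !memcmp(e->data, key, key_len))
              return e; /* first valid match wins */
          off += e->total_len;
      }
      return NULL;
  }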
Write strategy: Invalidate any entry with a matching key, append after the last entry if there is enough space in the current erase block. If not, finalize the current erase block with an invalid entry spanning the free space of the block and write the variable into the next erase block. If we run out of erase blocks, defragment the whole thing (needs one complete erase block size to cache things, afaics)?
Application layer
=================
Up to the use case. Could be anything from simple flags through UEFI runtime variables to full programs (e.g. the tint easter egg), I suppose.
Hope there is something helpful in this mail, Nico
Nico Huber wrote:
In-flash variable format
I like to keep things simple and assume that a key=value approach suffices (for this layer). Though, as we certainly have to deal with NOR flash chips (and their erase blocks), we can't just store things in a table that would require an erase cycle for every change of a variable. Here is a random proposal:
We already have a simple coreboot-native key-value store: CBFS
Yes, there is some overhead with every CBFS entry, but IIRC the minimum alignment is 16 bytes.
Melissa, how many variables do you think are actually needed?
//Peter
On 30.09.2017 16:12, Peter Stuge wrote:
Nico Huber wrote:
In-flash variable format
I like to keep things simple and assume that a key=value approach suffices (for this layer). Though, as we certainly have to deal with NOR flash chips (and their erase blocks), we can't just store things in a table that would require an erase cycle for every change of a variable. Here is a random proposal:
We already have a simple coreboot-native key-value store: CBFS
Well, I never looked closely, but everybody kept telling me for years that CBFS is not designed for updates.
Nico
Yes, there is some overhead with every CBFS entry, but IIRC the minimum alignment is 16 bytes.
Melissa, how many variables do you think are actually needed?
//Peter
Nico Huber wrote:
We already have a simple coreboot-native key-value store: CBFS
Well, I never looked closely, but everybody kept telling me for years that CBFS is not designed for updates.
But it is!?
Not in-place updates, because CBFS isn't tied to eraseblock sizes, but a CBFS entry could be overwritten with 0x00, which makes readers scan for the next entry.
If we haven't already, we can declare entries with zero-length names as deleted. Then the length remains, and no scan is necessary. That would keep read speed optimal.
//Peter
On 30.09.2017 19:28, Peter Stuge wrote:
Nico Huber wrote:
We already have a simple coreboot-native key-value store: CBFS
Well, I had a look at it now. And indeed it is a coreboot-native key-value store and could be used in the same way I imagined my proposed storage format. But it's not simple.
Well, I never looked closely, but everybody kept telling me for years that CBFS is not designed for updates.
But it is!?
Could be.
Not in-place updates, because CBFS isn't tied to eraseblock sizes, but a CBFS entry could be overwritten with 0x00, which makes readers scan for the next entry.
We have the CBFS_COMPONENT_DELETED (0) type. It basically already works like that.
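For illustration, a hedged sketch of that mechanism; the header mirrors coreboot's struct cbfs_file as of this writing (fields are big-endian), and flash_write() is an assumed helper:

  #include <stddef.h>
  #include <stdint.h>

  struct cbfs_file {
      char magic[8];               /* "LARCHIVE" */
      uint32_t len;
      uint32_t type;               /* 0 == CBFS_COMPONENT_DELETED */
      uint32_t attributes_offset;
      uint32_t offset;             /* data offset from header start */
  } __attribute__((packed));

  int flash_write(uint32_t addr, const void *data, size_t len); /* assumed */

  /* Deleting means clearing the type field to 0. Any value can reach 0
   * through 1 -> 0 transitions alone, so this works in place on NOR
   * flash without an erase cycle. */
  static int cbfs_mark_deleted(uint32_t header_offset)
  {
      const uint32_t deleted = 0;

      return flash_write(header_offset + offsetof(struct cbfs_file, type),
                         &deleted, sizeof(deleted));
  }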
If we haven't already, we can declare entries with zero-length names as deleted. Then the length remains, and no scan is necessary. That would keep read speed optimal.
You are implying that read speed is already optimal?
What is really bothering me is that CBFS does two layers at once (key-value store and attribution), and it has to for its current purpose: CBFS can be sparse and has to be, to allow entries at specific offsets or with specific alignment. We probably don't want to mix that with erasures. So I suppose we would want another FMAP entry for the erasable key-value store anyway, and it's unlikely that we'd need any of the fancy CBFS features there. But if we account for full CBFS compatibility in the defragmentation scheme (or whatever we'll call it), I fear things will get way more complex than they need to be.
Nico
Hi!
Write strategy: Invalidate any entry with a matching key, append after the last entry if there is enough space in the current erase block. If not, finalize the current erase block with an invalid entry spanning the free space of the block and write the variable into the next erase block. If we run out of erase blocks, defragment the whole thing (needs one complete erase block size to cache things, afaics)?
What happens when the power gets cut right between invalidating the old entry and writing the new entry? One of the design goals should be to always have a consistent state in the flash chip. So for the case that there is enough free space in the SPI NOR flash, I'd write the new entry and invalidate the old entry after that.

For the defragmentation at least one erase block needs to be reserved so that the new entries can be written before the old ones are invalidated.

The easiest method of wear leveling would be to use the blocks as a ring buffer; sure this doesn't minimize the number of erase cycles, but since NOR flash is quite robust and new entries probably won't get written that often, I'd probably prefer this less complex approach to increase the code readability and maintainability.
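For illustration, a sketch of that ordering; var_find_mutable(), var_append() and var_invalidate() are assumed helpers:

  /* Crash-safe update order: the new entry is completely written before
   * the old one is invalidated, so a power cut never leaves the store
   * without a valid copy of the variable. */
  static int var_update(const char *key, const void *value, size_t len)
  {
      struct var_entry *old = var_find_mutable(key);

      if (var_append(key, value, len) < 0) /* 1. write new entry */
          return -1;
      if (old)
          var_invalidate(old); /* 2. clear key_len (1 -> 0 bits only) */
      return 0;
      /* A power cut between 1 and 2 leaves both entries valid; the
       * reader then has to resolve the duplicate deterministically. */
  }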
Read strategy: Either cache and index the whole thing, or read until the first valid entry with matching key.
The question I have here is what to do if there is more than one valid entry. Just using the first one introduces a bit of undetermined behavior (block order in the flash might be different after defragmentation). So either use some counter (beware of overflows!) on either the flash erase block (small overhead and rather efficient) or the entry (full region has to be either cached or read every time and needs more bytes than the counter per erase block) or accept the possibility of undetermined behavior (IMHO not a good idea). Also a good idea would be to invalidate everything except the newest entry when reading in the case that there is more than one valid entry.
For boolean options you can also reserve a word of 32 or 64 bits and flip one bit for every toggle and test if the number of flipped bits is even or odd, so you don't have to write a new entry every time the option gets toggled. Also works for n-bit values, but I'm not sure if it's worth the more complex code.
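A small sketch of that trick, assuming a 32-bit toggle word that starts out erased (all ones, since NOR flash can only flip bits 1 -> 0):

  #include <stdbool.h>
  #include <stdint.h>

  /* The boolean value is the parity of the number of cleared bits. */
  static bool toggle_value(uint32_t word)
  {
      unsigned cleared = 32 - __builtin_popcount(word);

      return cleared & 1; /* odd number of toggles: true */
  }

  /* Returns the next word to program; once all bits are used up, a
   * fresh entry has to be written after all. */
  static uint32_t toggle_next(uint32_t word)
  {
      return word ? word & (word - 1) : 0; /* clear lowest set bit */
  }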
Having some sort of key-value store that can be written to during runtime would also be great for replacing the CMOS user option table thing.
Regards Felix
Hi Felix,
On 30.09.2017 20:28, Felix Held wrote:
Hi!
Write strategy: Invalidate any entry with a matching key, append after the last entry if there is enough space in the current erase block. If not, finalize the current erase block with an invalid entry spanning the free space of the block and write the variable into the next erase block. If we run out of erase blocks, defragment the whole thing (needs one complete erase block size to cache things, afaics)?
What happens when the power gets cut right between invalidating the old entry and writing the new entry? One of the design goals should be to always have a consistent state in the flash chip. So for the case that there is enough free space in the SPI NOR flash, I'd write the new entry and invalidate the old entry after that.
Ack.
For the defragmentation at least one erase block needs to be reserved so that the new entries can be written before the old ones are invalidated.
Ack.
The easiest method of wear leveling would be to use the blocks as a ring buffer; sure this doesn't minimize the number of erase cycles, but since NOR flash is quite robust and new entries probably won't get written that often, I'd probably prefer this less complex approach to increase the code readability and maintainability.
I agree that we don't want any complex wear leveling. n+1 blocks erased for n blocks of data should be good enough. I also had some kind of ring buffer in mind. I will describe my current sketch in more detail below.
Read strategy: Either cache and index the whole thing, or read until the first valid entry with matching key.
The question I have here is what to do if there is more than one valid entry. Just using the first one introduces a bit of undetermined behavior (block order in the flash might be different after defragmentation).
No, always choosing the first would be determined behavior. In case we keep things in order, it's also the better choice when it comes to searching for the current, valid entry (we'd have to always search the whole used space for a newer version otherwise). I'd treat it like this: If the old version wasn't invalidated, the update didn't finish. So we can ignore the new version.
So either use some counter (beware of overflows!) on either the flash erase block (small overhead and rather efficient) or the entry (full region has to be either cached or read every time and needs more bytes than the counter per erase block) or accept the possibility of undetermined behavior (IMHO not a good idea).
I'm not sure how to use a per-erase-block counter? When would it update? We could only do that on every erase. A per-variable counter might be useful, but I still hope it will work without one.
Also a good idea would be to invalidate everything except the newest entry when reading in the case that there is more than one valid entry.
Might be done in a sanitation phase before each write. But if we define the first entry to be the valid one, we probably don't need it. Update: I didn't get around to this in my scheme below... maybe we should make the update in three steps: tag for update, insert new value, invalidate. This way we would only have to search for duplicates for valid entries tagged for update.
For boolean options you can also reserve a word of 32 or 64 bits and flip one bit for every toggle and test if the number of flipped bits is even or odd, so you don't have to write a new entry every time the option gets toggled. Also works for n-bit values, but I'm not sure if it's worth the more complex code.
How about optimizing every update (without a change in length)? If it's doable in place, just write the new value. Your bit flip could then be implemented in the next layer (keeping this one more focused).
Having some sort of key-value store that can be written to during runtime would also be great for replacing the CMOS user option table thing.
I think both could live side-by-side, the user could choose where to store settings.
Updated scheme: With n erase blocks, the first n-1 blocks would be numbered 0..n-2. The last block would be the "spare block". The spare block has a header (or footer, with size <= the size of an entry's header) that contains the number of another block (or 0xffff).
Updated write: Before each write, check if the spare block is marked with another block's number. If it is, find the first block with free space and continue the defragmentation process at this point.
Defragmentation:
  Find duplicate entries and invalidate the ones added later
  Find the first block m with invalid entries
  Write m into the spare block's footer
  Copy all valid entries from m into the spare block
      (it will fit, because there was at least one invalidated entry in m)
  For i in m .. n-3
      Erase i
      For j in m+1 .. n-3
          Copy all valid entries from j that fit into i, invalidating each!
      End For
  End For
  Add back all entries from the spare block, invalidating each
  Erase the spare block
Updated read: If the spare block is marked 0xffff, read starting from block 0 until the first valid matching entry. If the spare block is marked with block m, search from block 0 to m-1, then in the spare block, then in blocks m to n-2.
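For illustration, a hedged sketch of this read order; find_in_block(), spare_footer(), N_BLOCKS and struct var_entry are assumed from the sketches above:

  static const struct var_entry *var_lookup(const char *key)
  {
      const struct var_entry *e;
      uint16_t m = spare_footer(); /* 0xffff: no defragmentation pending */
      int last = N_BLOCKS - 2;     /* last regular block; spare is N_BLOCKS - 1 */

      if (m == 0xffff)
          m = last + 1; /* degenerates to a plain scan of blocks 0..n-2 */

      for (int b = 0; b < (int)m && b <= last; b++)
          if ((e = find_in_block(b, key)))
              return e;
      if ((int)m <= last && (e = find_in_block(N_BLOCKS - 1, key)))
          return e; /* spare block, holding block m's surviving entries */
      for (int b = m; b <= last; b++)
          if ((e = find_in_block(b, key)))
              return e;
      return NULL;
  }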
Might work? It's too late here to judge ;) Nico
Nico Huber wrote:
wear leveling
ring buffer
sanitation phase
optimizing every update
Defragmentation
..
Might work?
Please don't invent a new format, don't reinvent JFFS2 and/or UBI..
Let's make CBFS work for this, please. Reusing the existing data structure is incredibly valuable, even if it means some overhead.
//Peter
Hi Nico!
The question I have here is what to do if there is more than one valid entry. Just using the first one introduces a bit of undetermined behavior (block order in the flash might be different after defragmentation).
No, always choosing the first would be determined behavior. In case we keep things in order, it's also the better choice when it comes to searching for the current, valid entry (we'd have to always search the whole used space for a newer version otherwise). I'd treat it like this: If the old version wasn't invalidated, the update didn't finish. So we can ignore the new version.
I was assuming the ring buffer case here where the first entry in the raw flash address space doesn't have to be the oldest one.
So either use some counter (beware of overflows!) on either the flash erase block (small overhead and rather efficient) or the entry (full region has to be either cached or read every time and needs more bytes than the counter per erase block) or accept the possibility of undetermined behavior (IMHO not a good idea).
I'm not sure how to use a per erase-block counter? When would it update? we could only do that on every erase. A per variable counter might be useful, but I still hope it will work without.
I was thinking of putting a number in the first word of a block. Every time a new block is allocated, this value is written once. The value is the highest number already used plus 1. Also beware of wraparounds there.
Might be done in a sanitation phase before each write. But if we define the first entry to be the valid one, we probably don't need it. Update: I didn't get around to this in my scheme below... maybe we should make the update in three steps: tag for update, insert new value, invalidate. This way we would only have to search for duplicates for valid entries tagged for update.
Good idea.
Defragmentation:
Haven't really looked at and thought about your defragmentation phase idea; maybe I'll comment on that next week.
Regards Felix
Thank you very much for all your advice, especially to Nico, Felix and Peter.
Thanks.
Regards, Melissa Yi
BIOS Lead Engineer Celestica(Shanghai) R&D Center, China
www.celestica.com Solid Partners, Flexible Solutions
Hi Nico!
I'm not sure how to use a per-erase-block counter? When would it update? We could only do that on every erase. A per-variable counter might be useful, but I still hope it will work without one.
I was thinking of putting a number in the first word of a block. Every time a new block is allocated, this value is written once. The value is the highest number already used plus 1. Also beware of wraparounds there.
Maybe a better option would be to have a, say, 64-bit field in the header of every erase block that gets erased with the whole block; every time blocks are erased, an additional bit would get written in the other erase blocks that contain valid data. Your proposal to first mark an entry for update, then write the new entry and after that invalidate the old entry would solve this problem better, though.
Updated scheme: With n erase blocks, the first n-1 blocks would be numbered 0..n-2. The last block would be the "spare block". The spare block has a header (or footer, with size <= the size of an entry's header) that contains the number of another block (or 0xffff).
I wouldn't use a dedicated spare block, since that would be the block that wears out first, because the same block is used every time. Just use one of the blocks in the ring buffer for that; that also eliminates a special case. Just make sure that there is always at least one erase block not used for data.
Updated write: Before each write, check if the spare block is marked with another block's number. If it is, find the first block with free space and continue the defragmentation process at this point.
Defragmentation:
Find duplicate entries and invalidate the ones added later
Hm, by using the older of the two entries, we make sure that we won't use an entry that wasn't written completely. Good idea.
  Find the first block m with invalid entries
  Write m into the spare block's footer
  Copy all valid entries from m into the spare block
      (it will fit, because there was at least one invalidated entry in m)
  For i in m .. n-3
      Erase i
      For j in m+1 .. n-3
          Copy all valid entries from j that fit into i, invalidating each!
      End For
  End For
  Add back all entries from the spare block, invalidating each
  Erase the spare block
My proposal would be to do the normal mark, update, invalidate routine until only one erase block is left unused after the current write. If less than one full erase block would be left free after a write, a garbage collection/defragmentation run has to be done. In this run, the non-invalidated entries from the first/oldest block get written to the last empty block. Then the valid data of the next block gets written to the space that might be left in the block that received the contents of the previously processed block; if there isn't enough space for an entry, the next erased block will be used. With this, the whole ring buffer will be defragmented.

Sure, with another strategy the number of block erases could be further decreased, but the code should be kept short and understandable; finding and fixing bugs that don't occur often isn't that fun.
To keep things simple, I'd also introduce the limitation that an entry mustn't be longer than the size of an erase-block.
Regards Felix