Hi,
Regarding block protection scheme, I am very glad to say I have had a breakthrough observation. After having sifted through around 90 datasheets, I was able to spot a pattern that majority of chips follow. GigaDevice and Winbond , along with some AMIC, Eon, Fudan and Macronix chips follow this pattern - around 50 in number. With this pattern, we can now dynamically generate the entire block protection ranges on-the-go. We don't need to hard-code the tables for these chips anymore and presence (or absence) of SEC, TB, or CMP bits is also handles generically! The exact algorithm is slightly convoluted, but I'll try my best to explain it. Feel free to skip past #5, you will have nothing much to lose. For all practical purposes, the BP4 and BP3 bits are identical to SEC and TB bits respectively. We'll call tuple of (BP3, BP2, BP1) as BP, and X is don't care. 1. For BP=(0,0,0): No protection, SEC=X, TB=X
2. For BP=(1,1,1): All protection, SEC=X, TB=X
3. For (SEC,TB)=(0,0): We introduce a parameter f, such that f = chip_size(in kB) / 64 for chips_size < 4096 kB; and f = 64 for chips_size >= 4096 kB. 3.1 For chip_size >= 4096 kB, we have the following correspondence of fraction of chip_size with BP: (0,0,1) - 1/64 (0,1,0) - 1/32 (0,1,1) - 1/16 ... (1,1,0) - 1/2 3.2 For chip_size = 2048 kB, we have the following correspondence of fraction of chip_size with BP: (0,0,1) - 1/32 (0,1,0) - 1/16 (0,1,1) - 1/8 ... (1,1,0) - 1 3.3 For chip_size = 1024 kB, we have the following correspondence of fraction of chip_size with BP: (0,0,1) - 1/16 (0,1,0) - 1/8 (0,1,1) - 1/4 ... (1,1,0) - 1 3.4 For chip_size = 512 kB, we have the following correspondence of fraction of chip_size with BP: (0,0,1) - 1/8 (0,1,0) - 1/4 ... (1,1,0) - 1 3.5 For chip_size = 256 kB, we have the following correspondence of fraction of chip_size with BP: (0,0,1) - 1/4 (0,1,0) - 1/2 (0,1,1) - 1 (1,0,0) - 0 (1,0,1) - 1/4 (1,1,0) - 1/2 3.6 For chip_size = 128 kB, we have the following correspondence of fraction of chip_size with BP: (0,0,1) - 1/2 (0,1,0) - 1 (0,1,1) - 1 (1,0,0) - 0 (1,0,1) - 1/2 (1,1,0) - 1 (SEC,TB)=(0,0) represents the upper portion of the chip (which means range addresses always end at top boundary). E.g. For a 4096 kB chip, (0,1,0) 1/32 means upper 256 kB - 0x3fff00-0x3fffff
4. For (SEC,TB)=(0,1): This represents the lower portion of the chip (which means range addresses always start at lower boundary). The same mapping defined earlier for fraction of chip_size holds. E.g. For a 512 kB chip, (0,1,0) means lower 128 kB - 0x000000-0x01ffff
5. For (SEC,TB)=(1,X): (0,0,1) - 4 kB (0,1,0) - 8 kB (0,1,1) - 16 kB (1,0,0) - 32 kB (1,0,1) - 32 kB *(1,1,0) - 32 kB We have upper for TB=0, lower for TB=1 E.g. For a 1024 kB chip TB=1, (1,0,0) means lower 32 kB - 0x000000-0x007fff *Some chips have 64 kB block instead of 32 kB.
(You will not lose much if you skip up to here)
6. For chips with CMP bit: For all chips that I have seen (and that is well over a 100 SPI chips), the block protection range either starts at lower boundary of chip or ends at upper boundary of chip. So, given the range and size of the chip, we can compute the complement range.
Please find attached pattern.c. It is an excerpt from the unmerged code I am working on and contains the implementation of the discussed model (cf. sec_block_pattern_range_generic). There are other functions and structs to show the models impact on the infrastructure. IMO, this is an immense improvement from what we have in ChromiumOS fork and what we have been discussing. Here are the key takeaway points - - Having the status register layout member in struct flashchip is immensely beneficial. A lot of chips can be supported with the generic code, perhaps with only slight adaptation. - Based on David's suggestion, shifting to function pointers was a good choice - it allows flexibility. - Presence (or absence) of CMP, TB (BP3), SEC (BP4) bit is automatically taken of, and the corresponding range is also dynamically generated. - For chips whose ranges cannot be dynamically generated, we have the option of specifying range manually. I think it is best to stick to range for such representation, instead of range of sectors/blocks or portions of chip (as suggested by David in the previous mail). Sticking with range is safe because range is independent of sector/block and will yield the proper result. - Based on Stefan's suggestion some time ago, having pointers to struct status_register and struct wp will allow greater re-use (which is happening a lot!)
Please have a look at the source file. The output of the source is here (http://paste.flashrom.org/view.php?id=2940)
Further work - - In the current implementation, for chips whose range is dynamically generated we have a pointer to the range and an ID variable to associate the range with the chip (this is so that range is computed once and reused for subsequent calls). If we have another chip then the ID and pointer gets overwritten (like is happening in main() in the source). So when we want to return to the previous chip, we have to again spend the computing cost. This can be enhanced if we use list data structures and append newly computed range tables. That way, we can always revisit a previously computed table at a nominal computing cost. - The computation of BP bitfield from a given range can be enhanced if we factor in CMP bit (if present) - there could be a scenario where CMP=0 and the given range is not possible, but if CMP=1 the range could be possible. - Another way we could enhance it is through suggesting nearest neighbours to the range. (I find this idea theoretically interesting, but practically I cannot come up with use cases)
I hope you enjoyed going through the source. It took me a while to sift through so many datasheets and try out varying models, but I was elated with joy when the above algorithm was ready. I am looking forward to your feedback and comments on this.
Thanks for your time and patience.
Bye, Hatim
On Fri, Jun 17, 2016 at 6:32 PM, Hatim Kanchwala hatim@hatimak.me wrote:
Hi,
Regarding block protection scheme, I am very glad to say I have had a breakthrough observation. After having sifted through around 90 datasheets, I was able to spot a pattern that majority of chips follow. GigaDevice and Winbond , along with some AMIC, Eon, Fudan and Macronix chips follow this pattern - around 50 in number. With this pattern, we can now dynamically generate the entire block protection ranges on-the-go. We don't need to hard-code the tables for these chips anymore and presence (or absence) of SEC, TB, or CMP bits is also handles generically! The exact algorithm is slightly convoluted, but I'll try my best to explain it. Feel free to skip past #5, you will have nothing much to lose. For all practical purposes, the BP4 and BP3 bits are identical to SEC and TB bits respectively. We'll call tuple of (BP3, BP2, BP1) as BP, and X is don't care.
For BP=(0,0,0): No protection, SEC=X, TB=X
For BP=(1,1,1): All protection, SEC=X, TB=X
For (SEC,TB)=(0,0): We introduce a parameter f, such that f = chip_size(in kB) / 64 for
chips_size < 4096 kB; and f = 64 for chips_size >= 4096 kB. 3.1 For chip_size >= 4096 kB, we have the following correspondence of fraction of chip_size with BP: (0,0,1) - 1/64 (0,1,0) - 1/32 (0,1,1) - 1/16 ... (1,1,0) - 1/2 3.2 For chip_size = 2048 kB, we have the following correspondence of fraction of chip_size with BP: (0,0,1) - 1/32 (0,1,0) - 1/16 (0,1,1) - 1/8 ... (1,1,0) - 1 3.3 For chip_size = 1024 kB, we have the following correspondence of fraction of chip_size with BP: (0,0,1) - 1/16 (0,1,0) - 1/8 (0,1,1) - 1/4 ... (1,1,0) - 1 3.4 For chip_size = 512 kB, we have the following correspondence of fraction of chip_size with BP: (0,0,1) - 1/8 (0,1,0) - 1/4 ... (1,1,0) - 1 3.5 For chip_size = 256 kB, we have the following correspondence of fraction of chip_size with BP: (0,0,1) - 1/4 (0,1,0) - 1/2 (0,1,1) - 1 (1,0,0) - 0 (1,0,1) - 1/4 (1,1,0) - 1/2 3.6 For chip_size = 128 kB, we have the following correspondence of fraction of chip_size with BP: (0,0,1) - 1/2 (0,1,0) - 1 (0,1,1) - 1 (1,0,0) - 0 (1,0,1) - 1/2 (1,1,0) - 1 (SEC,TB)=(0,0) represents the upper portion of the chip (which means range addresses always end at top boundary). E.g. For a 4096 kB chip, (0,1,0) 1/32 means upper 256 kB - 0x3fff00-0x3fffff
- For (SEC,TB)=(0,1): This represents the lower portion of the chip (which means range
addresses always start at lower boundary). The same mapping defined earlier for fraction of chip_size holds. E.g. For a 512 kB chip, (0,1,0) means lower 128 kB - 0x000000-0x01ffff
- For (SEC,TB)=(1,X): (0,0,1) - 4 kB (0,1,0) - 8 kB (0,1,1) - 16 kB (1,0,0) - 32 kB (1,0,1) - 32 kB *(1,1,0) - 32 kB
We have upper for TB=0, lower for TB=1 E.g. For a 1024 kB chip TB=1, (1,0,0) means lower 32 kB - 0x000000-0x007fff *Some chips have 64 kB block instead of 32 kB.
(You will not lose much if you skip up to here)
- For chips with CMP bit: For all chips that I have seen (and that is well over a 100 SPI
chips), the block protection range either starts at lower boundary of chip or ends at upper boundary of chip. So, given the range and size of the chip, we can compute the complement range.
Please find attached pattern.c. It is an excerpt from the unmerged code I am working on and contains the implementation of the discussed model (cf. sec_block_pattern_range_generic). There are other functions and structs to show the models impact on the infrastructure. IMO, this is an immense improvement from what we have in ChromiumOS fork and what we have been discussing. Here are the key takeaway points -
- Having the status register layout member in struct flashchip is
immensely beneficial. A lot of chips can be supported with the generic code, perhaps with only slight adaptation.
- Based on David's suggestion, shifting to function pointers was a good
choice - it allows flexibility.
- Presence (or absence) of CMP, TB (BP3), SEC (BP4) bit is automatically
taken of, and the corresponding range is also dynamically generated.
- For chips whose ranges cannot be dynamically generated, we have the
option of specifying range manually. I think it is best to stick to range for such representation, instead of range of sectors/blocks or portions of chip (as suggested by David in the previous mail). Sticking with range is safe because range is independent of sector/block and will yield the proper result.
- Based on Stefan's suggestion some time ago, having pointers to struct
status_register and struct wp will allow greater re-use (which is happening a lot!)
Sounds pretty good so far. Indeed, many chips are very similar in their block protect schema so generating generic tables is a reasonable approach, provided we can override the generic method when an exception to the rule is encountered.
Most users will probably be satisfied using generated tables, even if chips support more combinations, for example due to "don't care" bits. So I imagine that although the generated tables might not be complete, they should be sufficient. The user needs to run "--wp-list" to know what ranges are supported, and if the generated tables are not adequate then they need to add a table and override it.
Please have a look at the source file. The output of the source is here ( http://paste.flashrom.org/view.php?id=2940)
Looks a lot better than before!
A few more suggestions:
- You should probably just add a num_ranges member to the wp struct rather than having LEN_RANGES. Large chips tend have many more supported ranges than smaller ones - A 128Mbit chip with 4 BP bits, TB, SEC, and CMP could exceed 100 possible ranges. - I don't see why we want get_cmp() and set_cmp() function pointers in the wp struct. It seems to be handled already after SEC and TB are processed in sec_block_pattern_range_generic()? - The computation, especially, sec_block_multiples, seems a bit awkward... Or maybe I just need time to study it more. I think it would be more straight-forward to describe this in terms of the minimum granularity (or "density" in some datasheets), for example, 64KB * 2^0 if BP3-0 is 0 0 0 1, 64KB * 2^1 if BP3-0 is 0 0 1 0, etc. So it works out to something like granularity * 2^(bp_val - 1).
Further work -
- In the current implementation, for chips whose range is dynamically
generated we have a pointer to the range and an ID variable to associate the range with the chip (this is so that range is computed once and reused for subsequent calls). If we have another chip then the ID and pointer gets overwritten (like is happening in main() in the source). So when we want to return to the previous chip, we have to again spend the computing cost. This can be enhanced if we use list data structures and append newly computed range tables. That way, we can always revisit a previously computed table at a nominal computing cost.
I'm not sure if this is worth the effort, but if it's not too much extra work then perhaps it can be useful. I'm having difficulty thinking of use cases where we'll need a computed table to be used multiple times in a single invocation of flashrom.
- The computation of BP bitfield from a given range can be enhanced if we
factor in CMP bit (if present) - there could be a scenario where CMP=0 and the given range is not possible, but if CMP=1 the range could be possible.
Yep.
- Another way we could enhance it is through suggesting nearest neighbours
to the range. (I find this idea theoretically interesting, but practically I cannot come up with use cases)
Interesting, but as you point out this is probably unnecessary.
I hope you enjoyed going through the source. It took me a while to sift through so many datasheets and try out varying models, but I was elated with joy when the above algorithm was ready. I am looking forward to your feedback and comments on this.
Thanks for your time and patience.
Bye, Hatim