Hi Hatim,

On Tue, May 31, 2016 at 3:25 AM, Hatim Kanchwala <hatim@hatimak.me> wrote:
On Sunday 29 May 2016 02:40 PM, David Hendricks wrote:
Hi Hatim,
Interesting approach. It seems to work well for pretty printing, though
I am curious how this will translate into ranges. Do you have an example
for translating status_register_layout structs to a range of bytes
protected, for example 0x000000-0x1fffff?

Hi,

I have been researching and developing ideas on how to go about block protection. A lot of this is based on Chromium OS' implementation. Here are the two models I have.

Before diving into the models themselves, here's idea that both models share. We define a struct range (just like Chromium OS) and then define an array struct range range[32]. We use the BP bitfield as indexes into the array - so a combination of BP4=0, BP3=0, BP2=1, BP1=0, BP0=1 gives a bitfield 0x05 and corresponding range is given by range[0x05]={start, len}. (32 because we have seen a maximum of 5 block protect bits so that is 2^5 = 32 combinations.) The obvious advantage is that status_to_range operations will happen in O(1) time.

Interesting approach, but what about SEC/CMP/TB bits? Since those modify the start and length, we do not have a 1:1 mapping between block protect bits and range. Also it complicates the "don't care" cases.

Overall I don't think this optimization buys us much.
 
1. Block Protection Table
The driving idea behind this model is to have a representation of block protection table in flashrom very similar to what is given in datasheets (just like Chromium OS currently has). But if we are to use array of ranges indexed by bitfield as per above idea, then we need to process this representation first. I came up with the following algorithm for translating the table into the range array -

translate_table_to_range(table):
proc = []
done = []
for row in table
    proc.append(row)
while proc is not empty
    tmp = proc.pop(0)
    for i = 0:5
        if tmp.bp[i] is X
            new = tmp
            new.bp.replace(i, 0)
            proc.append(new)
            new.bp.replace(i, 1)
            proc.append(new)
            break
    if i is 5
        done.append(tmp)
for row in done
    index = row.bp4 << 4 | row.bp3 << 3 | row.bp1 << 1 |row.bp0
    range[index] = row.range

These were the few questions came up immediately -
- When do we call such a function in the code?
    Flashrom will always probe before performing any operation. So once the chip is found, we fetch the BP table from the struct flashchip, call this function, store the array of ranges, and then perform all future operations for that run using the array.
- Do we call it for all the chips?
    IMO, calling this for all chips is a bad idea and the above answer makes this redundant.
- How to handle corner cases?
    This needs to be dealt with still.

2. Array of Range
This model is very simple compared to #1 - instead of any such processing, we simply represent the BP table in the form of an array of ranges in the first place. This might make representing the table in flashrom more cumbersome.

Attached is a prototype of the two models (both implementations are combined in the same file). I have extensively annotated the code to better convey the models defined above. The output is available here http://paste.flashrom.org/view.php?id=2921. Chromium implementation defines two separate structs from CMP=0 and CMP=1. This case can be handled automatically.

Yeah, the different structs for CMP=0 and CMP=1 are annoying, I'll be happy to be rid of them.
 
IMO, #2 is better than #1 and we should go forward with #2. The additional processing step that needs to be taken in #1 before array of ranges is available is what made me favour #2. I would like to hear your take on all this, particularly use cases (I could come up with only one or two, so please tell me more). There's a lot of comments in the code, so please go through those as well.

I agree - #1 seems over-designed for this... I think #2 will be much simpler and will get the job done just as well.

A couple more ideas:
- Instead of describing the range in terms of bytes, maybe a range of blocks would be more appropriate. This will require computing the byte ranges as needed, but will allow you to more easily handle presence of a SEC bit.

- I'm still not convinced that keeping a member with the status register layout is necessary, and recommend keeping track of things like CMP, SEC, and TB bits in another form (sort of like what I started doing with the generic_modifier_bits), and rely on function pointers to get/set them.

- The ascending order of the bit_state members of the wp struct is counter-intuitive (and probably error-prone for the poor souls transcribing them) since datasheets describe them in descending order. I see why you did it, though, since it fits nicely with the way you describe the BP bits using the bit_state enum. That's why I used a single int value ;-) However, my approach did not work well for "don't care" bits.

If you agree that descending order is more intuitive for the user, then another approach you might try is use an array of bit_states (enum bit_state bp[MAX_BP_BITS]) and specify the number of block protect bits somewhere else in the struct (num_bp_bits). The user enters the BP bits as an array where we can use descending order as the convention - The code simply iterates from bp[0] to bp[num_bp_bits - 1].
 
I am also wondering if it is possible to get the best of both #1 and #2 - is there a way to pre-process the BP Table into the array of ranges representation at the time of compilation itself for all chips? This would eliminate the additional step at runtime, and would make representation in the code easier.

Personally, I had a lot of fun implementing #1 mainly because it was a direct materialization of the algorithm and data structures theory I had in my course. I had started out developing my own data structures but then stopped mid-way and realized I should be looking for standard solutions. Then I found sys/queue.h. So, that was a nice learning experience!

That's what makes GSoC great :-)



On Tue, May 31, 2016 at 3:25 AM, Hatim Kanchwala <hatim@hatimak.me> wrote:
On Sunday 29 May 2016 02:40 PM, David Hendricks wrote:
Hi Hatim,
Interesting approach. It seems to work well for pretty printing, though
I am curious how this will translate into ranges. Do you have an example
for translating status_register_layout structs to a range of bytes
protected, for example 0x000000-0x1fffff?

Hi,

I have been researching and developing ideas on how to go about block protection. A lot of this is based on Chromium OS' implementation. Here are the two models I have.

Before diving into the models themselves, here's idea that both models share. We define a struct range (just like Chromium OS) and then define an array struct range range[32]. We use the BP bitfield as indexes into the array - so a combination of BP4=0, BP3=0, BP2=1, BP1=0, BP0=1 gives a bitfield 0x05 and corresponding range is given by range[0x05]={start, len}. (32 because we have seen a maximum of 5 block protect bits so that is 2^5 = 32 combinations.) The obvious advantage is that status_to_range operations will happen in O(1) time. IMO, the use cases of reading the status register and translating BP bitfields into range are more than vice-versa. So IMHO having an array of ranges indexed by bitfield will certainly be more helpful. You will see that the two approaches differ in the implementation of this idea.

1. Block Protection Table
The driving idea behind this model is to have a representation of block protection table in flashrom very similar to what is given in datasheets (just like Chromium OS currently has). But if we are to use array of ranges indexed by bitfield as per above idea, then we need to process this representation first. I came up with the following algorithm for translating the table into the range array -

translate_table_to_range(table):
proc = []
done = []
for row in table
    proc.append(row)
while proc is not empty
    tmp = proc.pop(0)
    for i = 0:5
        if tmp.bp[i] is X
            new = tmp
            new.bp.replace(i, 0)
            proc.append(new)
            new.bp.replace(i, 1)
            proc.append(new)
            break
    if i is 5
        done.append(tmp)
for row in done
    index = row.bp4 << 4 | row.bp3 << 3 | row.bp1 << 1 |row.bp0
    range[index] = row.range

These were the few questions came up immediately -
- When do we call such a function in the code?
    Flashrom will always probe before performing any operation. So once the chip is found, we fetch the BP table from the struct flashchip, call this function, store the array of ranges, and then perform all future operations for that run using the array.
- Do we call it for all the chips?
    IMO, calling this for all chips is a bad idea and the above answer makes this redundant.
- How to handle corner cases?
    This needs to be dealt with still.

2. Array of Range
This model is very simple compared to #1 - instead of any such processing, we simply represent the BP table in the form of an array of ranges in the first place. This might make representing the table in flashrom more cumbersome.

Attached is a prototype of the two models (both implementations are combined in the same file). I have extensively annotated the code to better convey the models defined above. The output is available here http://paste.flashrom.org/view.php?id=2921. Chromium implementation defines two separate structs from CMP=0 and CMP=1. This case can be handled automatically.

IMO, #2 is better than #1 and we should go forward with #2. The additional processing step that needs to be taken in #1 before array of ranges is available is what made me favour #2. I would like to hear your take on all this, particularly use cases (I could come up with only one or two, so please tell me more). There's a lot of comments in the code, so please go through those as well.

I am also wondering if it is possible to get the best of both #1 and #2 - is there a way to pre-process the BP Table into the array of ranges representation at the time of compilation itself for all chips? This would eliminate the additional step at runtime, and would make representation in the code easier.

Personally, I had a lot of fun implementing #1 mainly because it was a direct materialization of the algorithm and data structures theory I had in my course. I had started out developing my own data structures but then stopped mid-way and realized I should be looking for standard solutions. Then I found sys/queue.h. So, that was a nice learning experience!

Looking forward to your comments and feedback. Thanks for your time and patience.

Bye,
Hatim