Nico Huber has posted comments on this change. ( https://review.coreboot.org/23840 )
Change subject: Add "gingerly" flashing mode for the unreliable ISP environments ......................................................................
Patch Set 2:
Um, bigger things like this should actually be discussed on the mailing list first, e.g. to avoid spending time on something that might not get merged. Doesn't matter now, you already wrote it.
Regarding your original problem, I guess it could be solved with some tuning of your hardware setup. Did you try to get the SoC into a better state? (reset preferred; though, as you already work with two masters on the SPI bus, did you try to boot the SoC up and halt the OS?).
About this implementation, when I read "gingerly" I expected something more sophisticated (e.g. some sort of collision avoidance, listening on the bus instead of jamming it). Though, I have nothing against the way you do it, I would call it brute-force. Also, if I call `flashrom --gingerly` I would expect something that is safer to use, but it kind of is the opposite, isn't it?
I don't like to clutter up spi_read_chunked() and spi_write_chunked(), most changes to them seem to be there only to avoid malloc() in your spi_rw_gingerly()? I would rather move it there.
And... this might be the biggest issue: the possible endless loop in spi_rw_gingerly(). For a mergeable solution, you'd have to put some kind of timeout there. And that is where the problems start... if it can fail, we have to handle the failure correctly, otherwise flashrom (in its current implementation) would probably fall back to erase the whole chip and make things worse.
Last but not least, why do it at the SPI level? Retries due to unre- liable connections should be handled at a higher level, IMO.