Attention is currently required from: Aarya, Anastasia Klimchuk, Carly Zlabek.
Vincent Fazio has posted comments on this change. ( https://review.coreboot.org/c/flashrom/+/79354?usp=email )
Change subject: erasure_layout: Remove redundant `verify_range` call from `erase_write` ......................................................................
Patch Set 1:
(1 comment)
Patchset:
PS1:
Carly, thank you so much for the patch! You are right, there is a redundant verification. […]
Anastasia,
Thanks for taking the time to look this over.
I will discuss this with Carly later today and try to schedule time to collect this data. Our company leverages flashrom pretty heavily and we're grateful for the project so are happy to contribute patches as we can.
I think I agree that it makes sense to drop that `check_erased_range` call as it should be duplicating the smaller range checks in 341-344 but I would like us to make sure that we're not missing verification coverage there due to some weird loop quirk. The other option is to just call `check_erased_range` once for the entire range instead of multiple times on smaller chunks to avoid the overhead of the function calls, the multiple mallocs, and the flash reads.. though it does require more contiguous memory to do it just once which may put pressure on memory constrained systems. We can try benchmarking both.
--
That said, I would like to make a case for why these proposed changes should still occur independent of the orphaned `check_erased_range` change.
When programming an empty chip (scenario 3 in the testing info), it's not the orphaned `check_erased_range` that causes problems as the chip is blank and we never enter that loop, it's the `verity_range` over the entire layout after the writes are performed. In our case, this causes a ~40% reduction in programming performance (140sec -> 200sec) which would affect our board yields compared to flashrom 1.2.
As mentioned in the commit message, `flashrom_flash_write` [already verifies writes when instructed](https://review.coreboot.org/plugins/gitiles/flashrom/+/refs/heads/main/flash...) and does so based on flags set in `flashctx` so the call here is both unavoidable (because it doesn't respect caller set flags) and redundant (because it duplicates what's already being done by `flashrom_flash_write`).
While it's possible to keep the `verify_range` check in `erase_write`, it doesn't seem ideal as `flashrom_flash_write` already has a more "global" view and appropriately handles both full chip verification and verification by layout. If we kept it here in `erase_write`, it would need to be guarded by `flashctx` flags and not overlap with any checks in `flashrom_flash_write`. It would also make the verification logic disjointed and more difficult to maintain as they would no longer be colocated in `flashrom_flash_write`.
--
As a side note, it could be argued that the erase path should also respect the verify flag(s), but I think that's a larger discussion to happen for a separate patchset and would be a change in behavior. [This FIXME here](https://review.coreboot.org/plugins/gitiles/flashrom/+/refs/heads/main/cli_c...) seems to imply there's not much benefit in announcing erase failures, so performing verification seems like additional overhead for that operation, but I think it make senses to make any potentially destructive APIs, such as erase and write, consistent with respect to verification. So either they respect the verify flags in `flashctx` or API users, including the CLI, should independently invoke verification after calling erase/write.
Currently (without this patch), for testing scenario 1 (overwriting an image with a new image) where we're erasing _and_ writing, I think we're verifying data 4 times:
1) `check_erased_range` unconditionally on small chunks 2) `check_erased_range` unconditionally on the entire erase block 3) `verify_range` unconditionally in `erase_write` after writing data 4) the verification flagged to be performed in `flashrom_flash_write`