Anastasia Klimchuk has submitted this change. ( https://review.coreboot.org/c/flashrom/+/80807?usp=email )
Change subject: flashrom: Don't throw around "delay 1 second" so lightly ......................................................................
flashrom: Don't throw around "delay 1 second" so lightly
Waiting a full second is a very long time, especially when default_delay() chooses to busy-loop. This code has been around for a decade, with vague references to user reports:
commit 8ab49e72af8465d4527de2ec37b22cd44f7a1169 Date: Wed Aug 19 13:55:34 2009 +0000 Disallow erase/write for known bad chips so people won't try without a clear understanding
Still, this logic does not belong in the high-level library logic, used by all programmers and all chips. If there is a timing issue, it should either be encoded in the appropriate programmer or flashchip timing. However, we don't really know what chips were in use, as the above commit doesn't have any links to reports. So in a feeble attempt at avoiding breaking users here, we also surmise that...
* SPI chips weren't all that common in 2009; * I'm mostly motivated by flashrom performance on Chromebooks, were SPI chips (and linux_mtd / BUS_PROG) are the rule; and * SPI chips have precise timing requirements and an appropriate BUSY status. So we guess that this "calm down" magic delay wouldn't be necessary there.
Thus, we allow this magic delay only on non-SPI (and non-BUG_PROG, used by linux_mtd for one) buses as a compromise.
Now, this change has some (hopefully [1] tiny) chance of regression, so we have the following considerations:
1. emergency_help_message() already provides documentation on how to contact support, in case we need to handle any user-reported regressions. 2. If there is any regression here, it's only in the --verify code; so we can always provide workarounds for testing this, to determine whether this change may have been at fault. For example, something like:
flashrom --write /my/new/image.bin --noverify sleep 1 flashrom --read /tmp/bar.bin cmp /my/new/image.bin /tmp/bar.bin
If such problems occur, we can collect system/programmer/chip info to try to encode a more targeted delay into the appropriate chip/programmer implementation, and avoid penalizing the entire project like this. 3. We already have (embedded in erase_write()) erase verification that performs no such delay. So depending on the type of timing error that this delay was attempting to cover, we may have some proof that this delay is no longer necessary (or at least, that whatever systems were needing this delay in the first place are no longer caring about flashrom). 4. We've retained the delay for buses that were likely common in 2009 (per the above "feeble attempt").
NB: I avoid using the BUS_NONSPI macro, because I want to exclude any future buses from this workaround, even in the event that the BUS_NONSPI category grows in the future.
[1] Famous last words.
BUG=b:325539928 TEST=`flashrom_tester --flashrom_binary=$(which flashrom) \ internal Erase_and_Write Fail_to_verify`, TEST=`vpd -i RW_VPD -s foo=bar; vpd -i RW_VPD -l; \ vpd -i RW_VPD -d foo; vpd -i RW_VPD -l` TEST=`elogtool list; elogtool add 0xa7; elogtool list`
on (at least) 2 systems: #1: Kukui/Kakadu rev2 - MTD programmer / kernel 5.10.215-24542-g0515a679eb42 / CrOS ~ 15857.0.0 #2: Zork/Dirinboz rev2 - chip name: vendor="Winbond" name="W25Q128.JW.DTR" / BIOS: Google_Dirinboz.13434.688.0 / kernel 5.4.267-21940-g67f70e251a74 / CrOS ~ 15753.43.0
Change-Id: Ie09651fede3f9f03425244c94a2da8bae00315fc Signed-off-by: Brian Norris briannorris@chromium.org Reviewed-on: https://review.coreboot.org/c/flashrom/+/80807 Reviewed-by: Peter Marheine pmarheine@chromium.org Reviewed-by: Anastasia Klimchuk aklm@chromium.org Tested-by: build bot (Jenkins) no-reply@coreboot.org --- M flashrom.c 1 file changed, 25 insertions(+), 2 deletions(-)
Approvals: build bot (Jenkins): Verified Anastasia Klimchuk: Looks good to me, approved Peter Marheine: Looks good to me, but someone else must approve
diff --git a/flashrom.c b/flashrom.c index 630c69d..6b15ee5 100644 --- a/flashrom.c +++ b/flashrom.c @@ -2330,8 +2330,31 @@ if (verify && !all_skipped) { msg_cinfo("Verifying flash... ");
- /* Work around chips which need some time to calm down. */ - programmer_delay(flashctx, 1000*1000); + /* + * Work around chips which "need some time to calm down." + * + * Frankly, it's not 100% clear why this delay is here at all, + * except for a terse message from 2009 of "a few reports where + * verify directly after erase had unpleasant side effects like + * corrupting flash or at least getting incorrect verify + * results". Ideally, if there were a few known problematic + * chips or programmers, we could add quirks flags for those + * specific implementations without penalizing all other + * flashrom users. But alas, we don't know which systems + * experienced those issues. + * + * Out of an extreme abundance of caution, we retain this + * delay, but only for a few non-SPI bus types that were the + * likely prevalent targets at the time. This is a complete + * guess, which conveniently avoids wasting time on common + * BUS_SPI and BUS_PROG systems. + * + * Background thread: + * Subject: RFC: removing 1 second verification delay + * https://mail.coreboot.org/hyperkitty/list/flashrom@flashrom.org/thread/SFV3O... + */ + if (flashctx->chip->bustype & (BUS_PARALLEL | BUS_LPC | BUS_FWH)) + programmer_delay(flashctx, 1000*1000);
if (verify_all) combine_image_by_layout(flashctx, newcontents, oldcontents);