Angel Pons has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/43484 )
Change subject: Documentation: Revise "24 hours wait period" rules ......................................................................
Patch Set 2:
(6 comments)
https://review.coreboot.org/c/coreboot/+/43484/1/Documentation/getting_start... File Documentation/getting_started/gerrit_guidelines.md:
https://review.coreboot.org/c/coreboot/+/43484/1/Documentation/getting_start... PS1, Line 64: TIMELESS
Possible, yes. Lots of work though. Not in scope anytime soon.
One could store the artifacts' SHA256 hashes for each commit somewhere, then compare. The problem is that the amount of stored hashes could grow up really fast.
As we build-test each submitted change now, maybe we could start off by assessing whether each submitted commit is reproducible? More or less like the 9elements automated testing bot, we could print out which boards are reproducible and which are not.
https://review.coreboot.org/c/coreboot/+/43484/1/Documentation/getting_start... PS1, Line 66: * Do not +2 patches that you authored or own
Unrelated to this patch, but I think this section could be removed now. […]
I'd still keep it: it's the reason why self-approval is ignored by gerrit.
https://review.coreboot.org/c/coreboot/+/43484/2/Documentation/getting_start... File Documentation/getting_started/gerrit_guidelines.md:
https://review.coreboot.org/c/coreboot/+/43484/2/Documentation/getting_start... PS2, Line 53: recently introduced works as an adjective: recently-introduced
https://review.coreboot.org/c/coreboot/+/43484/2/Documentation/getting_start... PS2, Line 56: The change itself should be as limited in scope : and impact as possible to make it simple to assess the impact. I just thought about it for a while, and I noticed a certain pattern: fixup commits work best for build breakage, whereas reverts are better-suited for boot issues. I think it's worth mentioning this idea here. Sorry for the wall of text.
# Build failures
Since we build-test each patchset only once before submitting it, build breakage mainly happens when patch trains are submitted out-of-order, or when change A renames something and change B uses it on a completely different file, but aren't in the same patch train.
When building the tree breaks, pretty much *all* changes will fail to build. This ends up stalling everyone's workflow, which is really bad. However, it's easy to check if a change works to solve the problem: if it builds, it's good enough. Furthermore, the commits which introduced the problem may be large, so reverting them is often regarded as too invasive of a solution. So, making a small fixup change is the usual course of action.
# Boot failures
On the other hand, boot failures are (as of now) meaningless on Gerrit. That is, if e.g. Pineview boards stop booting for some reason, there's no way one can tell unless they build an affected coreboot version, flash it onto hardware and end up with a brick. Moreover, if the breakage happened because of a change to common code, it's nearly impossible to know if other platforms are affected or not, and if so to which extent.
That boot-testing isn't readily available makes fixups hard to justify. Will all platforms be able to boot with said fixup? It could happen that a fixup restores booting on a certain platform, but breaks another platform which was not initially affected. It might be unlikely, but it could happen. And since hardware is made out of cursed sand, no one really knows!
Despite the uncertain nature of boot failures, it's reasonable to assume the following: if a given platform stopped booting after a certain change was submitted, and reverting said change makes the affected platform able to boot again, the other platforms will either be affected in the same way (broke and got fixed) or not affected at all. Creating proper commits is important: when each commit does one thing, reverting the bad commit is easy.
# Examples
For example, when my huge SPDX commits cracked a jackpot and broke building master, Nico proposed CB:40140 to quickly fix things, and after thinking for a while CB:40240 was accepted as a definite solution. Reverting those SPDX changes was out of the question, as they were good to have, and it was easier to fix the `dead_code` macro.
When the allocator v4 changes broke master not too long after the 4.12 release, CB:41363 was proposed as a fixup. It fixed some platforms, but not all; this prompted for some more fixes like CB:41364 and similar. While the platform-specific fixups were eventually submitted, the first fixup was abandoned in favor of reverting everything and re-adding allocator v4 alongside v3.
Also, CB:42219 broke several things, and then CB:42434 was intended to be a fixup for the problem. I did not like that a fixup without much explanation was being pushed in using this rule, and I created CB:42166 to propose a revert. It went in much more easily.
https://review.coreboot.org/c/coreboot/+/43484/2/Documentation/getting_start... PS2, Line 61: * Trivial changes that deal with minor issues like inconsistencies in Mention that such changes need to be XS? (or maybe S, but that would be a bit of a stretch)
https://review.coreboot.org/c/coreboot/+/43484/2/Documentation/getting_start... PS2, Line 63: should point out in their commit : messages that they're trivial I've got mixed feelings about this. If a change is trivial because it meets the other criteria, but the author didn't state it is trivial, then it needs to wait 24h?