Hi list,
We (Angel) sense that this thread might potentially drift towards an unpleasant direction and we think the outcome wouldn't be forward-useful, i.e. it would make things worse. How are all of you doing? Is anyone feeling hurt?
On Sat, Oct 15, 2022 at 11:53 AM Nico Huber nico.h@gmx.de wrote:
Hi all,
On 14.10.22 03:53, Anastasia Klimchuk wrote:
A quick answer first: We are closer to the release, and the state of the master is better than a half year ago.
hmmm, I may not have made myself clear with my question. I didn't mean to ask for an overall state but wrt. stability (iow. absence of code flaws) in particular. And if/how we can trust that there are no or few enough flaws.
In order to make a release, stability (iow. absence of code flaws) is more important than having more features. So, we (Angel) would suggest prioritizing bug fixes over new features, at least to get the release done. Another option would be to branch off current master, radically deal with existing bugs (e.g. delete the problematic sb600spi code, revert the progress state patches) and use this branch for the flashrom v1.3 release. Then we can decide how to proceed for future releases without the pressure of not having done a release in a long time.
More details. In general, the progress is visible here https://ticket.coreboot.org/issues/353
Yes, this is the progress of tickets. But how to put this... I know the state as far as we measure it. What I'm concerned about is what we don't measure.
Do you have any examples of things that we do not measure? Hmmm, looks like they're explained below.
Maybe this helps to understand the idea of the release process better: In the Development Guidelines[1] it says
"Branching for a new release can happen at any point in time when a commit (branch point) on master seems to be in good shape and was reasonably tested after previous invasive changes."
The last part tries to protect us from very new regressions / issues that nobody found (measured) yet. And this was written in a time when flashrom builds from the master branch were much more tested. We could tell people that it's safe to use the master branch (which it isn't anymore). And even the people who used releases implicitly tested a state of the code for us that was much closer to master.
In other words, it's best to submit new features and potentially breaking changes after a release. Otherwise, things in a release don't get tested properly. On the coreboot side, we had to make the 4.8.1 release because 4.8 was broken: a small patch got submitted just before the 4.8 release that broke loading several payloads, and people who tried the 4.8 release quickly noticed that it did not boot...
Also, what led to the long list in #353 was a development process that created issues faster than they were fixed. Many of the issues were only made visible by a re-review. Can we say that the ratio of creating and fixing issues is more balanced now? How can we trust that if we'd re-review again, we wouldn't end up with an even longer list?
We (Angel) found several patches with bugs after a re-review or a post-submit review in the past. Some of these bugs could have been oversights, but other bugs were too conspicuous to be accidental oversights. It felt as if those reviews (mainly those for CrOS flashrom upstreaming patches) were heavily expedited for some reason that we (Angel, and possibly the upstream flashrom community) don't know about. If anyone knows, we'd appreciate if they could help understand why things happened this way, especially if they can shed some light regarding what happened inside Google at that time.
There is one issue that came up around May https://ticket.coreboot.org/issues/390 . This one is new. It does not prevent from using any functionality, and also if the user does not specify `--progress` they won’t even notice.
Can the latter be proved? I'd say yes, but it seems like more effort then re-writing the whole patch. I never fully reviewed the patch but in the few places I looked into I saw wrong calculations and overflows. And those run independently from the `--progress` switch. And please remember we are talking about C code. To say that it doesn't affect existing functionality would require a proof of defined behavior. Because if there is undefined behavior, flashrom could just crash.
Also worth to remember: flashrom often runs with root privileges. So every coding mistake is a security issue unless we can prove that it isn't.
We fully agree with Nico's view. We hadn't thought about the safety and security implications of such changes. Hmmm, maybe we should rewrite flashrom in at-least-Ada then? :D
Lots of code improvements: for example we have way less global state than it used to be 1/2yr ago, and we have more unit tests of all types.
The former is in the category of "invasive changes" that I'd like to avoid before a release. I see a lot of refactorings in the name of less global state. IMO that's a nice long-term goal, but shouldn't stall the project. Just very recently, there was a patch train that unnecessarily introduced a lot of NULL pointers (I commented how to do it without and that we'd have to be careful otherwise). Things were merged (without being careful), reverted, fixed up (I warned about NULL pointers again), merged again (again without looking for NULL pointers left), fixed up again. How can we know now that it's settled? How to trust such a development process?
(This case might actually be why I started to wonder how the measured progress relates to the actual progress on the master branch.)
Ah yes, that was not good... https://review.coreboot.org/66659 caused issues, https://review.coreboot.org/67621 reverted CB:66659, then https://review.coreboot.org/67094 and https://review.coreboot.org/67649 tried to do the same as CB:66659 but better, but still ended up causing problems that https://review.coreboot.org/67752 had to fix. We (Angel) are not convinced that there won't be any more issues with NULL pointers. The reason? The way https://review.coreboot.org/q/topic:programmer_init_param_globals did things is particularly error-prone. https://review.coreboot.org/66654 adds NULL pointers in a bunch of places, and it's very easy to lose track of them. Nico had already suggested an approach to avoid having to deal with NULL pointers: make the patches introducing `programmer_cfg` in reverse order. Roughly, this would mean starting with https://review.coreboot.org/67649 and ending with https://review.coreboot.org/66654 instead, so that `programmer_cfg` pointers are valid from the very beginning.
Can we (everyone reading this) learn from this case? We (Angel) think so: this can be avoided by reviewing the *design* and/or "plan of attack" (the "enemy" being "anything in flashrom that can be improved" (let's not attack others please)) in advance. This can be done by discussing the plans in flashrom's communication channels (e.g. our mailing list or the #flashrom IRC channel on Libera), or in the form of RFC (Request For Comments) changes (although authors of RFC changes *need* to be OK with having to completely redesign the implementation).
If you ask me to pick what’s very important to fix I would say https://ticket.coreboot.org/issues/370 I remember several patches on that, from different people. Potentially, the bug is [technically] fixed, but we need to coordinate who is rebasing on the top of whom, and complete the review.
The story is much more complex. Every time somebody looks into it, more even older issues pop up. I'd say revert to the state of the last release, and then after a new release, fix remaining issues and only then add new things and the refactorings again. IMO this driver was patched to a dead end, or at least a state where the easiest way for- ward is to go a few steps back first. Last time Edward drew attention to it, I really tried to find a solution. But even comparing the state right after the problematic patch to the current state was too frus- trating: Due to all the refactorings I couldn't see (within reasonable time) what changed in terms of functionality.
At least to us (Angel), what needs to be done to address #370 is not clear at all. Yes, sb600spi is buggy, but how should we proceed? We (Angel) believe that the most straightforward to deal with the sb600spi mess would be to revert the buggy features, sanitize the codebase (make sure refactoring patches didn't introduce any other regressions, fix any issues with the original codebase), and only then consider reintroducing the new features.
Maybe we should have a rule for this? "If there are any known issues within a file or area of the code, no refactoring must take place until they are fixed."?
For potentially harmful issues, like the risk of soft bricking machines with >16 MiB flash, definitely.
Nico
[1] https://www.flashrom.org/Development_Guidelines#Release_branches_(e.g._1.0.x) _______________________________________________ flashrom mailing list -- flashrom@flashrom.org To unsubscribe send an email to flashrom-leave@flashrom.org
On Sun, Oct 16, 2022 at 9:41 PM Anastasia Klimchuk aklm@chromium.org wrote:
Nico what was the goal why you started this thread?
I thought for a moment, you have a *quick* question. So I answered.
It could be that the question itself is quick, but its answer is not. Or it might be a way to introduce the question to the readers. We (Angel) introduce some of our questions with the self-deprecating "Stupid question" preamble, even if we're sure that the question itself is not at all stupid. Using "Stupid question" to preface (some of) our questions inspires others to critically assess what we're asking, which can help them put aside any grudges they might have against us. We also use this because we love humor: humor is one of the healthiest ways to cope with the trials and tribulations of life.
Now you seem to be unhappy to learn that we are closer to the release than 1/2 yr ago.
Are we actually closer to making a release? It's interesting that #353 is classified as a feature; we (Angel) would believe that not making a new release in over 2 years is more of a bug than a feature... There's also several refactoring tasks in #353 which seem to be orthogonal to the problem of "release flashrom v1.3 someday". We probably said earlier in this email that it would be a good idea to try to release v1.3 without the more invasive/controversial features and refactorings, not only from a technical perspective (users get to use a newer flashrom version with some important bug fixes, e.g. the emergency help text directs them to the correct IRC server) but also from an emotional perspective (not having to worry about the pressure of not having done any releases in over two years). This doesn't mean that these features will be scrapped; rather, we will be able to work on them much more effectively. It's just another form of "divide and conquer"!
You are saying "But I don't believe you", what was the point of asking then?
Anastasia, the following paragraph is specifically for you. Apologies if it makes you feel attacked in any way: we don't want to hurt you, but us (Angel) having autism makes it harder to understand people outside the spectrum.
You know that, a few days ago, we (Angel) finally managed to understand what really happened w.r.t. flashrom in the last two years or so. The only reason why we were able to do so is because we got rid of all prejudices we had regarding Google people, especially Edward. It was only after clearing those nasty prejudice-clouds that we were able to "see" what actually happened, and how far from reality our prejudices were. Although we do not recommend caustically ranting and raving on Gerrit (like we did...), we'd strongly encourage you to speak out your feelings to others (either publicly or privately) as a means of catharsis: to bring any buried grudges and other unpleasant feelings into your consciousness and release them permanently. In a way, the idea is to do something similar to Marie Kondo's https://en.wikipedia.org/wiki/Marie_Kondo#KonMari_method but applied to "things" in one's mind: quickly and completely discard whatever that doesn't "spark joy", and focus on the things that "spark joy". Even though we (Angel) are not licensed therapists, we are willing to listen to you.
-- Anastasia.
Best regards, Angel