fallback/normal was a fine solution for pre-CBFS images. I'm not convinced it still is.
In that case what you're proposing is moving to fallback-only. That's one more step, right?
A safe update method is the main use case for fallback/normal I've heard about so far.
Another use case is hardware (an FPGA in my case) that might be misconfigured. If it is, the boot fails several times and uses the fallback.
It seems like both Ron and Stefan thought the dual image solution was necessary at one point. I don't remember their rationale.
With CBFS, updating an image is harder than before. Pre-CBFS, you could simply use dd(1) to splice parts of the old and the new image together and get an image with a new normal image and an old fallback image. With CBFS, the data inside the image isn't aligned that cleanly, so you have to extract the pieces and add them to the other image.
That's fine as long as cbfstool works.
The other thing fallback/normal is used for is CMOS-aware code with recovery if CMOS is invalid. But it's flawed as it is: when the "normal image" bit is set correctly and everything else in CMOS is wrong, normal is run and probably hangs (it did for me).
Then if you reboot enough times it goes to fallback as long as the failover code isn't broken.
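To make the retry behaviour concrete, here is a rough sketch in C of a boot-attempt counter that falls back after repeated failures. The byte layout, the retry limit, and the names select_image and mark_boot_ok are all hypothetical and are not taken from the actual failover code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of one CMOS boot byte (assumed, not the real layout). */
#define BOOT_NORMAL_BIT   0x01u  /* set when the normal image is requested */
#define BOOT_COUNT_SHIFT  4      /* upper nibble counts boot attempts */
#define MAX_NORMAL_TRIES  3u     /* assumed retry limit */

enum boot_image { BOOT_FALLBACK, BOOT_NORMAL };

/*
 * Pick an image and record the attempt in *boot_byte.
 * The normal image is expected to call mark_boot_ok() once it is up.
 */
static enum boot_image select_image(uint8_t *boot_byte)
{
    assert(boot_byte != 0);
    unsigned tries = *boot_byte >> BOOT_COUNT_SHIFT;

    /* Normal image not requested: always run fallback. */
    if (!(*boot_byte & BOOT_NORMAL_BIT))
        return BOOT_FALLBACK;

    if (tries >= MAX_NORMAL_TRIES) {
        /* Too many failed attempts: drop the request for normal. */
        *boot_byte &= (uint8_t)~BOOT_NORMAL_BIT;
        return BOOT_FALLBACK;
    }

    /* Count this attempt before handing control to the normal image. */
    *boot_byte = (uint8_t)(((tries + 1) << BOOT_COUNT_SHIFT) |
                           (*boot_byte & 0x0Fu));
    return BOOT_NORMAL;
}

/* Clear the attempt counter after a successful normal boot. */
static void mark_boot_ok(uint8_t *boot_byte)
{
    assert(boot_byte != 0);
    *boot_byte &= 0x0Fu;
}
```

Note that this only helps if the code running select_image is itself intact, which is the caveat above.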
What _should_ happen (in my opinion) is that we use the CMOS checksum, and have the same code use a static field (somewhere in ROM) as a backup if the CMOS values can't be trusted.
That could work by extending the cmos.layout file with defaults that are written to some static C array, which is linked in. get_option (I think that's what the CMOS access code is called) switches between CMOS and this table based on the CMOS checksum. This is more stable than the current approach and gets rid of a second binary
This seems orthogonal to me, but it seems like a good idea.
(probably with different behaviour beyond the config values!)
By design, right?
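For reference, the checksum-based get_option idea discussed above could look roughly like the following C sketch. Everything here is an assumption for illustration: the option names, the one-byte-per-option layout, the additive checksum, and the function name get_option_sketch are made up, and the defaults table stands in for whatever would be generated from cmos.layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option indices; real ones would come from cmos.layout. */
enum option_id { OPT_BAUD_RATE, OPT_DEBUG_LEVEL, OPT_COUNT };

/* Assumed layout: one byte per option plus a 16-bit checksum. */
struct cmos_image {
    uint8_t  values[OPT_COUNT];
    uint16_t checksum;
};

/* Defaults a build step might generate from cmos.layout and link into ROM. */
static const uint8_t rom_defaults[OPT_COUNT] = {
    [OPT_BAUD_RATE]   = 0,
    [OPT_DEBUG_LEVEL] = 6,
};

/* Simple additive checksum over the option bytes (an assumption). */
static uint16_t cmos_checksum(const struct cmos_image *cmos)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < OPT_COUNT; i++)
        sum = (uint16_t)(sum + cmos->values[i]);
    return sum;
}

static int cmos_is_valid(const struct cmos_image *cmos)
{
    return cmos_checksum(cmos) == cmos->checksum;
}

/* Use the CMOS value when the checksum matches, else the ROM default. */
static uint8_t get_option_sketch(const struct cmos_image *cmos,
                                 enum option_id id)
{
    assert(cmos != NULL && id < OPT_COUNT);
    return cmos_is_valid(cmos) ? cmos->values[id] : rom_defaults[id];
}
```

With this shape, callers never need to know which source was used, so a single image can cover both the "CMOS is good" and "CMOS is garbage" cases.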
What other use cases for fallback/normal are there that must be accounted for?
I don't know. Fallback only is definitely simpler.
In general removing features is easier than adding them. Will this make it easier to add it later?
The change can be reverted with little effort: set CONFIG_HAVE_FAILOVER_BOOT to 1, add failover to the Config-abuild.lb again, and it works as it does now. I left the #ifs for FAILOVER etc. in place so it's easy to figure out which parts of the code are "early setup" and which are later code, so we can reuse them, should we want to.
No problem. I think this is the right time to have this discussion.
Thanks, Myles