Hi coreboot folks,
I recently stumbled upon an issue: non-ChromeOS platforms, once they have entered recovery mode, cannot leave it, even after the RW partition has been updated with a correctly signed firmware copy. For example, imagine that RW A (and B) is not valid, so the vboot logic boots into recovery. The flash is then updated with a valid RW A (and B), but the vboot logic does not try to verify the RW partition; instead it stays stuck in recovery mode because of the vboot NVRAM content.
On ChromeOS platforms the recovery reason is cleared in vb2api_kernel_phase2(), but vb2api_kernel_phase2() is probably not used anywhere except depthcharge (or whatever is loading the ChromeOS kernel). So non-ChromeOS platforms using vboot have no way to get out of recovery. If I am missing something, please correct me.
My suggestion would be to add vb2_clear_recovery() to the vb2api exposed to coreboot and let the platform code decide when the recovery request should be cleared. Alternatively, coreboot could attempt to verify the RW partition despite the recovery reason, but that would probably be inefficient and could lead to situations where recovery mode should have been entered but was not.
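To make the stuck state concrete, here is a minimal toy model in C. It is not vboot code: every `toy_` name is made up for illustration, and the real nvdata layout and boot flow are more involved. It only assumes what is described above, namely that a pending recovery request is honoured before the RW slots are looked at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the vboot NVRAM content; not the real layout. */
struct toy_nvdata {
	uint8_t recovery_request;	/* non-zero: boot into recovery */
};

enum toy_boot_mode { TOY_BOOT_RW, TOY_BOOT_RECOVERY };

/* Models the boot decision: a pending request wins before RW is verified. */
static enum toy_boot_mode toy_select_boot(struct toy_nvdata *nv, bool rw_valid)
{
	if (nv->recovery_request)
		return TOY_BOOT_RECOVERY;	/* RW slots are never checked */
	if (!rw_valid) {
		nv->recovery_request = 1;	/* arbitrary non-zero reason */
		return TOY_BOOT_RECOVERY;
	}
	return TOY_BOOT_RW;
}

/* Hypothetical platform-side clearing of the request. */
static void toy_clear_recovery(struct toy_nvdata *nv)
{
	nv->recovery_request = 0;
}
```

In this model, a boot with broken RW sets the request, and every later call to `toy_select_boot()` returns `TOY_BOOT_RECOVERY` even with `rw_valid == true`, until something calls `toy_clear_recovery()`.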
Dear ChromeOS firmware experts, your opinion is highly appreciated.
Best regards,
Yeah, we moved the point where the recovery reason gets cleared to the payload a while ago, because otherwise it got lost when the platform decides to do extra reboots. For example, some Intel SoCs do CSE sync in ramstage, and sometimes when they do that they need to do an extra reboot. When we used to clear the recovery reason in verstage, that extra reboot after that point meant we weren't in recovery mode anymore when we finally reached the payload.
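The reboot problem can be illustrated with a small simulation. This is only a sketch under the assumptions in the paragraph above (the request is latched early in each boot, and ramstage may reset the platform before the payload runs); the `toy_` names are hypothetical and do not correspond to vboot or coreboot symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_nvdata {
	uint8_t recovery_request;	/* persists across resets */
};

enum toy_clear_point { TOY_CLEAR_IN_VERSTAGE, TOY_CLEAR_IN_PAYLOAD };

/*
 * Simulates boots until the payload is reached and returns whether the
 * payload still sees recovery mode. extra_reboots is the number of times
 * ramstage resets the platform first (e.g. for a CSE sync).
 */
static bool toy_payload_sees_recovery(struct toy_nvdata *nv,
				      enum toy_clear_point where,
				      int extra_reboots)
{
	for (;;) {
		/* verstage: latch the request for this boot */
		bool recovery = nv->recovery_request != 0;
		if (where == TOY_CLEAR_IN_VERSTAGE)
			nv->recovery_request = 0;

		/* ramstage: may reset before the payload runs */
		if (extra_reboots > 0) {
			extra_reboots--;
			continue;
		}

		/* payload: reached at last */
		if (where == TOY_CLEAR_IN_PAYLOAD)
			nv->recovery_request = 0;
		return recovery;
	}
}
```

With one extra reboot, clearing in verstage makes the function return `false` (the recovery request was lost on the way), while clearing in the payload returns `true` and still leaves the request cleared afterwards.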
I don't mind renaming vb2_clear_recovery() to vb2api_clear_recovery() and making it available in the public API. Then you can either add code somewhere to the end of coreboot to call it (but with a Kconfig that excludes ChromeOS) or link your payload to vboot and call it from there (which might be useful if your payload also has situations where it may need to do an extra reboot). Happy to review a patch to vboot if you want to send one.
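A rough sketch of the first option (clearing at the end of coreboot, excluded for ChromeOS) might look like the following. Everything here is a stand-in so the snippet compiles on its own: `struct toy_context` replaces the real vboot context, `toy_vb2api_clear_recovery()` stubs the proposed vb2api_clear_recovery() with an assumed signature, and the `chromeos` parameter stands for whatever Kconfig check would be used in practice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the vboot context plus nvdata; the real structures differ. */
struct toy_context {
	uint8_t recovery_request;
};

/* Stub for the proposed vb2api_clear_recovery(); signature is assumed. */
static void toy_vb2api_clear_recovery(struct toy_context *ctx)
{
	ctx->recovery_request = 0;
}

/*
 * Hypothetical hook run once at the end of coreboot, just before the
 * payload is started. Returns true if it cleared the request.
 */
static bool toy_clear_recovery_before_payload(struct toy_context *ctx,
					      bool chromeos)
{
	if (chromeos)
		return false;	/* the payload clears the request later */
	toy_vb2api_clear_recovery(ctx);
	return true;
}
```

Running the hook as late as possible keeps the window for an extra reboot after the clear small. A payload that may itself reboot would still be better served by the second option, calling the function from the payload.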
In general though, I think you really need payload or OS integration if you want to have a useful recovery mode. Recovery mode usually means something is actually broken that needs to be fixed (e.g. RW partition corrupted from a bad update), and coreboot alone can't fix it. So we can clear the recovery condition wherever it works best, but you still need to implement the part that will actually fix the system somewhere.
On Mon, Apr 3, 2023 at 2:48 AM Michał Żygowski michal.zygowski@3mdeb.com wrote:
Michał Żygowski Firmware Engineer GPG: 6B5BA214D21FCEB2 https://3mdeb.com | @3mdeb_com
Hi Julius,
On 6.04.2023 02:26, Julius Werner wrote:
> Yeah, we moved the point where the recovery reason gets cleared to the payload a while ago, because otherwise it got lost when the platform decides to do extra reboots. For example, some Intel SoCs do CSE sync in ramstage, and sometimes when they do that they need to do an extra reboot. When we used to clear the recovery reason in verstage, that extra reboot after that point meant we weren't in recovery mode anymore when we finally reached the payload.
Exactly, the reboots are the main problem with clearing the recovery reason in verstage.
> I don't mind renaming vb2_clear_recovery() to vb2api_clear_recovery() and making it available in the public API. Then you can either add code somewhere to the end of coreboot to call it (but with a Kconfig that excludes ChromeOS) or link your payload to vboot and call it from there (which might be useful if your payload also has situations where it may need to do an extra reboot). Happy to review a patch to vboot if you want to send one.
> In general though, I think you really need payload or OS integration if you want to have a useful recovery mode. Recovery mode usually means something is actually broken that needs to be fixed (e.g. RW partition corrupted from a bad update), and coreboot alone can't fix it. So we can clear the recovery condition wherever it works best, but you still need to implement the part that will actually fix the system somewhere.
Payload and OS integration is one thing. What I still haven't figured out is how to tell vboot to check whether the RW partition is valid. Imagine the platform is in recovery mode and I have flashed RW with a correctly signed image. vboot will not attempt to boot from RW because the recovery reason is non-zero. Thus the only way I see is to clear the recovery reason. How is this solved on ChromeOS systems after updating with correct RW firmware? Is there any flag (in the vboot shared data/workbuf) that tells vboot to attempt the RW check despite recovery?
Best regards,
> Payload and OS integration is one thing. What I still haven't figured out is how to tell vboot to check whether the RW partition is valid. Imagine the platform is in recovery mode and I have flashed RW with a correctly signed image. vboot will not attempt to boot from RW because the recovery reason is non-zero. Thus the only way I see is to clear the recovery reason. How is this solved on ChromeOS systems after updating with correct RW firmware? Is there any flag (in the vboot shared data/workbuf) that tells vboot to attempt the RW check despite recovery?
Well, in our flow we always clear the recovery reason from the payload (as part of running through vb2api_kernel_phase2()). So if you aren't using the kernel verification part but otherwise want the equivalent of that, then the solution would be to make vb2_clear_recovery() available as a top-level API function and call it from your payload.
Alternatively, you could also just have your OS run `crossystem recovery_request=0` after it has reinstalled the RW firmware.