Change in coreboot[master]: soc/intel/common/basecode: Implement CSE update flow - coreboot-gerrit

21 Mar 2020


      Furquan Shaikh has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/35403 )
Change subject: soc/intel/common/basecode: Implement CSE update flow
......................................................................
Patch Set 67:
(46 comments)
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
File src/soc/intel/common/basecode/fw_update/Kconfig:
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 1: INTEL_CSE_UPDATE
I don't think this Kconfig is required. We can basically use existence of ME_RW_FILE as an indication that firmware update is requested.
i.e. if ME_RW_FILE is empty string, then nothing would be added to CBFS. If CBFS does not contain a RW binary then no need to perform RW update.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 6: ADD_ME_RW_BINARY
Same here. This config will not be required.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 13: ME_RW_VER_CBFS
I have no clue what this config means from its name or string description.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 18: ME_RW_FILE
In order to keep consistent naming, I think:
a) These configs should be added to soc/intel/common/block/cse/Kconfig
b) These should be named SOC_INTEL_CSE_RW_FILEPATH and SOC_INTEL_CSE_RW_CBFS_FILENAME. First one would basically be the path to CBFS RW binary file which will be picked up by the build system. Second would be the name that is used when adding this binary to CBFS.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
File src/soc/intel/common/basecode/fw_update/cse_update.c:
PS67: 
What are the reasons behind adding this to basecode instead of block/cse?
In my opinion, basecode/ should contain all the support that is not tied to 1 particular IP block. Here, all the firmware update support that is being added is for CSME IP block. Thus, it would be better to put this under common/block/cse/custom_sku.c since it is applicable only to custom sku.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 33: #define CBFS_SI_ME	"SI_ME"
This is making an assumption that the FMAP name is always the same. I think we should add a Kconfig option for this SOC_INTEL_CSE_FMAP_NAME which defaults to "SI_ME" but can be overriden by mainboard if required.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 36: #define ME_RW_VERSION_SZ	16
Humm... I am working on some proposals to see if we can do this differently. For now, I think it is okay.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 41: #define ME_BP_SIGN_SIZE	4
What exactly does this mean?
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 43: /* ME RW's signature is valid */
             : #define ME_RW_SIGN_VALID	0
             : /* ME RW's Signature is not valid */
             : #define ME_RW_SIGN_INVALID	1
             : 
Are these flags in some field?
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 47: 
             : /* ME RW's status is valid */
             : #define ME_RW_STATE_VALID	0
             : /* ME RW's status is not valid */
             : #define ME_RW_STATE_INVALID	1
Same here? Some context in comments would be helpful.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 68: CONFIG_ME_RW_VER_CBFS
Is this used for storing the CSE RW version as a separate file within CBFS? My understanding based on design doc and the CL to stitch things together was that the RW version was being simply attached to the RW binary at the start. Reference: crrev.com/i/2748773.
.... Okay after having read the function below where this is used, I think this is basically just the same ME RW binary from CBFS. This is a very confusing name. Please see my comments below where get_cbfs_me_rw_ver() is called.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 87: Master
What does Master mean here?
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 88: me_region
This is a region_device and not simply a region. I don't think you intend to write to the entire me region here. It would be better to:
struct region cse_region;
if (fmap_locate_area(CONFIG_SOC_INTEL_CSE_FMAP_NAME, &cse_region) < 0) {
    ...
}
struct region cse_rw_region;
if (cse_get_rw_region(&cse_rw_region) < 0) {
    ...
}
if (region_is_subregion(&cse_rw_region, &cse_region) == 0) {
    ...
}
And then get the region device for cse_rw_region. You don't need to do any math in this function then.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 98: bp2_start_offset + region_device_offset(&me_region);
This looks wrong. Shouldn't it be bp2_start_offset - region_device_offset(&me_region);?
Anyways, this math won't be required. See comment above.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 99: bp2_end_offset + 1
This doesn't look right. It should be end_offset - start_offset + 1;
Anyways, this math won't be required. See comment above.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 122: eturn -1;
This is a bigger problem. If CSE is not able to jump to RO, we probably need to do this by talking to EC or H1. Also, some kind of state needs to be maintained to ensure that we don't get stuck doing reset by H1 if we are unable to talk to CSE.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 130: heci_reset
Why is a heci_reset() required here?
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 138: hfs1.data = me_read_config32(PCI_ME_HFSTS1);
              : 	printk(BIOS_DEBUG, "me_fwu: ME HFSTS1=0x%x\n", hfs1.data);
What is the purpose of reading and printing HFSTS1 here and after jumping to RO?
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 141: cse_get_current_boot_partition
There is no need to read it here. It should just use cse_bp_info and pass that along to other functions.
i.e.
if (cse_boot_to_ro(cse_bp_info) < 0)
     return -1;
...
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 146: (current_bp != BP1)
This check can be moved to cse_boot_to_ro(). It can check if current partition is RO and return early. Makes it easier to read and follow the flow here.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 159: cse_hmrfpo_get_status
Why is the status checked again? This already happens as part of cse_hmrfpo_enable(). It returns 1 only if HMRFPO is enabled.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 178: cse_get_bp_entry_version
Add helper for cse_get_rw_version() and that can call cse_get_bp_entry_version() for BP2.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 213: uint32_t
size_t
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 222: 	uint8_t *cbfs_me_rw_wo_sign = (uint8_t *)cbfs_me_rw + ME_BP_SIGN_SIZE;
Why is the signature split and updated separately? Since the partition is erased, CSE-RW would not pass integrity check at boot up and RW partition wouldn't boot if it is corrupted. Is my understanding wrong here?
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 258: if (rdev_readat(me_rw_part, (void *) &me_bp_sign, 0, ME_BP_SIGN_SIZE)
              : 			!= ME_BP_SIGN_SIZE) 
Why is this check required? If last update was incomplete, what does the CSE report for Boot Partition Status ID -- Isn't it "General Failure"? Can't we just check what the status id is in that case?
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 277: me_rw_part
The naming of variables is kind of confusing overall here. How about using:
read_rdev and write_rdev
read_rdev : Region device corresponding to the CBFS RW file (This is where the CBFS RW binary with version prepended to it will be read from)
write_rdev : Region device corresponding to CSE RW partition in FMAP (This is where the new CSE RW from CBFS will be written to)
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 285: CONFIG_GBB_FLAG_DISABLE_CSE_FW_SYNC
There are no details about this. Can you please raise a separate bug for this?
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 291: 
I have mentioned this on cse.c as well. Just copying here for simplicity to compare:
This function would then be your entry point into the entire CSE FW sync and update path i.e.
a) Ensure this is Custom SKU
b) Get boot partition info.
c) Check boot mode. If recovery, ensure boot partition is RO. Else, switch to RO.
d) In non recovery mode, ensure boot partition for RW is present i.e. status id is not 2. Else, trigger recovery.
e) Check if RW update is required. It can be based on two things:
   (i) Is status ID for RW partition set to 1?
   (ii) If status ID for RW partition is 0, and RW version in boot info does not match RW version in CBFS.
   If either of the above is yes, update is required.
f) call cse_update_rw(): 
   (i) Prepare for update: Switch to RO if required, HMRFPO
   (ii) Perform actual write to SPI flash
g) If e) or f) fails, trigger recovery.
h) If all good, jump to RW if required. Else continue booting.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 292: 	int rec_mode = vboot_recovery_mode_enabled();
At this point, the function should call cse_get_bp_info(&cse_bp_info);
Irrespective of what path the function follows from here on, it is going to need cse_bp_info.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 294: if (rec_mode) {
              : 		printk(BIOS_DEBUG, "me_fwu: Skip FW update in the recovery path\n");
              : 		return;
              : 	}
If it is recovery mode, then this function should ensure that it jumps to RO.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 304: /*
              : 	 * TODO:
              : 	 *  Check for data mismatch through boot partition status, then
              : 	 *	- send DATA_CLEAR command to CSE
              : 	 *	- Switch to RW(BP2)
              : 	 *	- Issue Global Reset
              : 	 */
Please raise bugs for any TODOs that you plan to address later.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 312: get_me_rw_part
For consistency can all functions be named with _cse_ instead of _me_.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 317: if (get_cbfs_me_rw_ver(&me_rw_ver_cbfs) < 0) {
              : 		printk(BIOS_ERR, "me_fwu: Failed to get cbfs ME for update\n");
              : 		goto failed;
              : 	}
              : 
I am very confused why version is being looked for as a separate file in CBFS.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 330: uint32_t
size_t
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 332: rdev_mmap
Okay after getting to this point, I realize that it is the same RW binary in CBFS and not two separate blobs for RW binary and for FW version. The naming is very confusing "me_rw_ver_cbfs". It seems as if it is pointing to version file in CBFS. Can we organize this slightly different to make it easier to follow the flow?
static int cse_get_read_rdev(struct region_device *rdev)
{
...
-------struct cbfsf file;
...
-------/* Skip update if no CBFS RW binary */                                                 
-------if (cbfs_boot_locate(&file, CONFIG_SOC_INTEL_CSE_RW_CBFS_NAME, NULL))                  
------->-------return -1;
...
-------cbfs_file_data(&rdev, &file);                                                          
-------return 0;
}
static int cse_update_rw(struct cse_boot_partition *cse_bp_info)
{
...
-------struct region_device read_rdev, write_rdev;
-------void *cse_cbfs_rw;
...
-------if (cse_get_read_rdev(&read_rdev) < 0)
------->-------return 0;
...
-------cse_cbfs_rw = rdev_mmap_full(&read_rdev);
...
-------if (!cse_update_required(cse_bp_info, cse_cbfs_rw, region_device_sz(&read_rdev))) {
------->-------rdev_munmap(&read_rdev, cse_cbfs_rw);
------->-------return 0;
-------}
...
-------if (cse_get_write_rdev(&write_rdev) < 0) {
------->-------rdev_munmap(&read_rdev, cse_cbfs_rw);
------->-------return -1;
-------}
...
-------ret = cse_write_to_rw(&write_dev, cse_cbfs_rw, region_device_sz(&read_rdev));
-------rdev_munmap(&read_rdev, cse_cbfs_rw);
-------return ret;
}
void cse_fw_sync(void)
{
...
-------struct cse_boot_partition_info cse_bp_info;
...
-------if (!cse_is_hfs3_fw_sku_custom()) {
------->-------// Print error
-------}
...
-------if (cse_get_bp_info(&cse_bp_info) < 0) {
------->-------// Recovery Mode -- How do we want to handle this?
------->-------// Non-recovery mode -- trigger recovery
-------}
...
-------if (!cse_is_rw_info_valid(&cse_bp_info)) {
------->-------// Trigger recovery
-------}
...
-------if (vboot_recovery_mode_enabled()) {
------->-------cse_jump_to_ro(&cse_bp_info);
------->-------return;
-------}
...
-------if (cse_update_rw(&cse_bp_info) < 0) {
------->-------// Trigger recovery
-------}
...
-------if (cse_switch_to_rw(&cse_bp_info) < 0)
------->-------// Trigger recovery
-------}
...
-------// If we are here without any failure, it means that we have successfully jumped to
-------// CSE-RW. Continue booting.
}
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 335: if (cbfs_me_rw_sz > region_device_sz(&me_rw_part)) {
              : 		printk(BIOS_ERR,
              : 		"me_fwu: cbfs me_rw size(0x%x) is more than me_rw partition size(0x%x),"
              : 		"skip update\n", cbfs_me_rw_sz, (uint32_t)region_device_sz(&me_rw_part));
              : 		goto failed;
              : 	}
This will be checked by rdev_writeat using write_rdev. There is no need for an explicit check here.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 346: if (!cse_check_rw_status(&me_rw_part_status)) {
              : 		printk(BIOS_ERR, "me_fwu: me rw status check fail\n");
              : 		goto failed;
              : 	}
I think this check should happen initially as part of cse_get_boot_info() or maybe after that call so that the flow just drops into recovery mode right away.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
PS67, Line 395: printk(BIOS_ERR, "me_fwu: Failed %s\n", __func__);
              : 	do_global_reset();
No where in this path I see recovery being triggered. This is something that will have to be handled correctly.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/basec... 
File src/soc/intel/common/basecode/include/intelbasecode/cse_update.h:
PS67: 
This file won't be required at all. See my comments on cse.c
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/block... 
File src/soc/intel/common/block/cse/cse.c:
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/block... 
PS67, Line 866: cse_enable_fw_sku_custom
As mentioned here: https://review.coreboot.org/c/coreboot/+/39225/4/src/soc/intel/common/block/..., I think this should be organized differently:
1. Move this function definition to custom_sku.c which is added only when SOC_INTEL_CSE_CUSTOM_SKU is selected by mainboard.
2. Move all functions in cse_bp.c to custom_sku.c. Those APIs are really just supported by custom SKU.
3. Rename this function to cse_fw_sync()
4. There is no need to maintain a file-scoped variable(cse_bp_info) in cse_bp.c for boot partition info. This is the only function that really cares about it. So, add `static struct cse_boot_partition_info cse_bp_info` to cse_fw_sync(). And then pass in cse_bp_info to cse_get_boot_partition_info() to read partition info from CSE. You can then pass in that structure to whatever helper function is being used.
This function would then be your entry point into the entire CSE FW sync and update path i.e.
a) Ensure this is Custom SKU
b) Get boot partition info.
c) Check boot mode. If recovery, ensure boot partition is RO. Else, switch to RO.
d) In non recovery mode, ensure boot partition for RW is present i.e. status id is not 2. Else, trigger recovery.
e) Check if RW update is required. It can be based on two things:
   (i) Is status ID for RW partition set to 1?
   (ii) If status ID for RW partition is 0, and RW version in boot info does not match RW version in CBFS.
   If either of the above is yes, update is required.
f) call cse_update_rw(): 
   (i) Prepare for update: Switch to RO if required, HMRFPO
   (ii) Perform actual write to SPI flash
g) If e) or f) fails, trigger recovery.
h) If all good, jump to RW if required. Else continue booting.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/block... 
PS67, Line 868: enum boot_partition_id current_bp;
I would recommend maintaining `static struct cse_boot_partition_info cse_bp_info;` here and passing that into each helper function as required. There is no need to maintain a file-scoped variable for boot partition info.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/block... 
PS67, Line 872: BIOS_DEBUG
Now, this will be an error since custom_sku.c should really be included only for boards that have explicitly selected SOC_INTEL_CSE_CUSTOM_SKU.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/block... 
PS67, Line 885: * TODO: Check to move this to bootblock */
It is not possible to move to bootblock. Decision to go down recovery or non-recovery path happens as part of verstage. In general, I want to have a separate discussion to be able to move this much earlier -- both check and update paths. But, let's discuss that separately. I will raise a bug for that.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/block... 
PS67, Line 887: rec_mode && current_bp != BP1
I mentioned this on https://review.coreboot.org/c/coreboot/+/35402/71/src/soc/intel/common/block....
Instead of using BP1/BP2, I think using RO v/s RW would make the flow much easier to follow.
if (rec_mode) {
    if (!cse_in_ro(&cse_bp_info)) {
        /* This results in a global reset, so control should never return. */
        cse_switch_to_ro();
        printk(BIOS_ERR, "Switch to RO unsuccessful!\n");
        /* TODO: Should there be a request to EC or H1 to perform a cold reset? */
    }
    return;
}
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/block... 
PS67, Line 893: #if !CONFIG(INTEL_CSE_UPDATE)
There is no need to guard this or have a separate call just for updating RW. It should be done as part of this same function.  See cse_update.c for related comments.
https://review.coreboot.org/c/coreboot/+/35403/67/src/soc/intel/common/block... 
PS67, Line 908: failed:
              : 	printk(BIOS_ERR, "HECI: Failed to enable CSE Firmware Custom SKU\n");
              : 	do_global_reset();
In case of failures, we shouldn't be just doing a global reset. This could end up in a reset loop especially if that is triggered in recovery mode. It is important to handle fail conditions differently for recovery and non-recovery paths.
Also, in case of failure in non-recovery mode, we should basically mark a failure and switch to recovery mode.
-- 
To view, visit https://review.coreboot.org/c/coreboot/+/35403
To unsubscribe, or for help writing mail filters, visit https://review.coreboot.org/settings

Gerrit-Project: coreboot
Gerrit-Branch: master
Gerrit-Change-Id: I12f6bba3324069d65edabaccd234006b0840e700
Gerrit-Change-Number: 35403
Gerrit-PatchSet: 67
Gerrit-Owner: Sridhar Siricilla sridhar.siricilla@intel.com
Gerrit-Reviewer: Aamir Bohra aamir.bohra@intel.com
Gerrit-Reviewer: Andrey Petrov anpetrov@fb.com
Gerrit-Reviewer: Balaji Manigandan balaji.manigandan@intel.com
Gerrit-Reviewer: Furquan Shaikh furquan@google.com
Gerrit-Reviewer: Jamie Ryu jamie.m.ryu@intel.com
Gerrit-Reviewer: Martin Roth martinroth@google.com
Gerrit-Reviewer: Patrick Georgi pgeorgi@google.com
Gerrit-Reviewer: Patrick Rudolph siro@das-labor.org
Gerrit-Reviewer: Rizwan Qureshi rizwan.qureshi@intel.com
Gerrit-Reviewer: Sridhar Siricilla sridhar.siricilla@intel.com
Gerrit-Reviewer: Sridhar Siricilla sridhar.siricilla@intel.corp-partner.google.com
Gerrit-Reviewer: Subrata Banik subrata.banik@intel.com
Gerrit-Reviewer: V Sowmya v.sowmya@intel.com
Gerrit-Reviewer: build bot (Jenkins) no-reply@coreboot.org
Gerrit-CC: Maxim Polyakov max.senia.poliak@gmail.com
Gerrit-CC: Patrick Rudolph patrick.rudolph@9elements.com
Gerrit-CC: Paul Menzel paulepanter@users.sourceforge.net
Gerrit-Comment-Date: Sat, 21 Mar 2020 06:02:42 +0000
Gerrit-HasComments: Yes
Gerrit-Has-Labels: No
Gerrit-MessageType: comment