Hello Duncan Laurie, mturney mturney,
I'd like you to do a code review. Please visit
https://review.coreboot.org/c/coreboot/+/40895
to review the following change.
Change subject: libpayload: xhci: Do not memcpy registers ......................................................................
libpayload: xhci: Do not memcpy registers
memcpy() is meant to be used on normal memory and often implemented with architecture-specific optimizations to make that as performant as possible. MMIO registers often have special access restrictions that may be incompatible with whatever memcpy() does. For example, on arm64 it uses the LDP (load pair) to load 16 bytes at a time, which makes 4-byte MMIO registers unhappy.
This patch removes the caching of the XHCI capreg registers and changes it back to a pointer. The CAP_GET() macro is still accessing a full (non-bitfield) uint32_t at the end so this should still generate a 4-byte access (which was the goal of the original change in CB:39838).
Signed-off-by: Julius Werner jwerner@chromium.org Change-Id: Id058c8813087a8e8cb85f570399e07fb8a597108 --- M payloads/libpayload/drivers/usb/xhci.c M payloads/libpayload/drivers/usb/xhci_private.h 2 files changed, 7 insertions(+), 7 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/95/40895/1
diff --git a/payloads/libpayload/drivers/usb/xhci.c b/payloads/libpayload/drivers/usb/xhci.c index 2f61f86..ef1d73f 100644 --- a/payloads/libpayload/drivers/usb/xhci.c +++ b/payloads/libpayload/drivers/usb/xhci.c @@ -185,15 +185,15 @@ goto _free_xhci; }
- memcpy(&xhci->capreg, phys_to_virt(physical_bar), sizeof(xhci->capreg)); + xhci->capreg = phys_to_virt(physical_bar) + sizeof(xhci->capreg); xhci->opreg = phys_to_virt(physical_bar) + CAP_GET(CAPLEN, xhci->capreg); - xhci->hcrreg = phys_to_virt(physical_bar) + xhci->capreg.rtsoff; - xhci->dbreg = phys_to_virt(physical_bar) + xhci->capreg.dboff; + xhci->hcrreg = phys_to_virt(physical_bar) + xhci->capreg->rtsoff; + xhci->dbreg = phys_to_virt(physical_bar) + xhci->capreg->dboff;
xhci_debug("regbase: 0x%"PRIx32"\n", physical_bar); xhci_debug("caplen: 0x%"PRIx32"\n", CAP_GET(CAPLEN, xhci->capreg)); - xhci_debug("rtsoff: 0x%"PRIx32"\n", xhci->capreg.rtsoff); - xhci_debug("dboff: 0x%"PRIx32"\n", xhci->capreg.dboff); + xhci_debug("rtsoff: 0x%"PRIx32"\n", xhci->capreg->rtsoff); + xhci_debug("dboff: 0x%"PRIx32"\n", xhci->capreg->dboff);
xhci_debug("hciversion: %"PRIx8".%"PRIx8"\n", CAP_GET(CAPVER_HI, xhci->capreg), CAP_GET(CAPVER_LO, xhci->capreg)); diff --git a/payloads/libpayload/drivers/usb/xhci_private.h b/payloads/libpayload/drivers/usb/xhci_private.h index 65c3fdd..0264f1f 100644 --- a/payloads/libpayload/drivers/usb/xhci_private.h +++ b/payloads/libpayload/drivers/usb/xhci_private.h @@ -364,7 +364,7 @@ #define CAP_CSZ_LEN 1
#define CAP_MASK(tok) MASK(CAP_##tok##_START, CAP_##tok##_LEN) -#define CAP_GET(tok, cap) (((cap).CAP_##tok##_FIELD & CAP_MASK(tok)) \ +#define CAP_GET(tok, cap) (((cap)->CAP_##tok##_FIELD & CAP_MASK(tok)) \ >> CAP_##tok##_START)
#define CTXSIZE(xhci) (CAP_GET(CSZ, (xhci)->capreg) ? 64 : 32) @@ -378,7 +378,7 @@ u32 hccparams; u32 dboff; u32 rtsoff; - } __packed capreg; + } __packed *capreg;
/* opreg is R/W is most places, so volatile access is necessary. volatile means that the compiler seeks byte writes if possible,
build bot (Jenkins) has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/40895 )
Change subject: libpayload: xhci: Do not memcpy registers ......................................................................
Patch Set 1:
(1 comment)
https://review.coreboot.org/c/coreboot/+/40895/1/payloads/libpayload/drivers... File payloads/libpayload/drivers/usb/xhci_private.h:
https://review.coreboot.org/c/coreboot/+/40895/1/payloads/libpayload/drivers... PS1, Line 381: } __packed *capreg; need consistent spacing around '*' (ctx:WxV)
Julius Werner has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/40895 )
Change subject: libpayload: xhci: Do not memcpy registers ......................................................................
Patch Set 1:
Duncan, I'm somewhat confused why you both removed the bitfields and changed xhci_t to cache capreg in memory in CB:39838? Seems like those are two different solutions to the same problem, either of which should have been enough? I'm taking the caching back out here, I think that should still be good enough for what you were trying to fix, but please let me know if there's something I misunderstand about the problem and this needs a different solution.
Duncan Laurie has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/40895 )
Change subject: libpayload: xhci: Do not memcpy registers ......................................................................
Patch Set 1: Code-Review+2
Sorry about that, I did start with a memcpy approach but I thought I rejected it for the same reason that it shouldn't really be used with real mmio. I must have messed up and not reverted that before starting on the other approach.
Julius Werner has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/40895 )
Change subject: libpayload: xhci: Do not memcpy registers ......................................................................
Patch Set 1:
Hope it's okay if I push this before 24 hours since it's a pretty big bug on all Arm systems (and we're trying to rebase a big patch train on top of it).
Julius Werner has submitted this change. ( https://review.coreboot.org/c/coreboot/+/40895 )
Change subject: libpayload: xhci: Do not memcpy registers ......................................................................
libpayload: xhci: Do not memcpy registers
memcpy() is meant to be used on normal memory and often implemented with architecture-specific optimizations to make that as performant as possible. MMIO registers often have special access restrictions that may be incompatible with whatever memcpy() does. For example, on arm64 it uses the LDP (load pair) to load 16 bytes at a time, which makes 4-byte MMIO registers unhappy.
This patch removes the caching of the XHCI capreg registers and changes it back to a pointer. The CAP_GET() macro is still accessing a full (non-bitfield) uint32_t at the end so this should still generate a 4-byte access (which was the goal of the original change in CB:39838).
Signed-off-by: Julius Werner jwerner@chromium.org Change-Id: Id058c8813087a8e8cb85f570399e07fb8a597108 Reviewed-on: https://review.coreboot.org/c/coreboot/+/40895 Tested-by: build bot (Jenkins) no-reply@coreboot.org Reviewed-by: Duncan Laurie dlaurie@chromium.org --- M payloads/libpayload/drivers/usb/xhci.c M payloads/libpayload/drivers/usb/xhci_private.h 2 files changed, 7 insertions(+), 7 deletions(-)
Approvals: build bot (Jenkins): Verified Duncan Laurie: Looks good to me, approved
diff --git a/payloads/libpayload/drivers/usb/xhci.c b/payloads/libpayload/drivers/usb/xhci.c index 2f61f86..ef1d73f 100644 --- a/payloads/libpayload/drivers/usb/xhci.c +++ b/payloads/libpayload/drivers/usb/xhci.c @@ -185,15 +185,15 @@ goto _free_xhci; }
- memcpy(&xhci->capreg, phys_to_virt(physical_bar), sizeof(xhci->capreg)); + xhci->capreg = phys_to_virt(physical_bar) + sizeof(xhci->capreg); xhci->opreg = phys_to_virt(physical_bar) + CAP_GET(CAPLEN, xhci->capreg); - xhci->hcrreg = phys_to_virt(physical_bar) + xhci->capreg.rtsoff; - xhci->dbreg = phys_to_virt(physical_bar) + xhci->capreg.dboff; + xhci->hcrreg = phys_to_virt(physical_bar) + xhci->capreg->rtsoff; + xhci->dbreg = phys_to_virt(physical_bar) + xhci->capreg->dboff;
xhci_debug("regbase: 0x%"PRIx32"\n", physical_bar); xhci_debug("caplen: 0x%"PRIx32"\n", CAP_GET(CAPLEN, xhci->capreg)); - xhci_debug("rtsoff: 0x%"PRIx32"\n", xhci->capreg.rtsoff); - xhci_debug("dboff: 0x%"PRIx32"\n", xhci->capreg.dboff); + xhci_debug("rtsoff: 0x%"PRIx32"\n", xhci->capreg->rtsoff); + xhci_debug("dboff: 0x%"PRIx32"\n", xhci->capreg->dboff);
xhci_debug("hciversion: %"PRIx8".%"PRIx8"\n", CAP_GET(CAPVER_HI, xhci->capreg), CAP_GET(CAPVER_LO, xhci->capreg)); diff --git a/payloads/libpayload/drivers/usb/xhci_private.h b/payloads/libpayload/drivers/usb/xhci_private.h index 65c3fdd..0264f1f 100644 --- a/payloads/libpayload/drivers/usb/xhci_private.h +++ b/payloads/libpayload/drivers/usb/xhci_private.h @@ -364,7 +364,7 @@ #define CAP_CSZ_LEN 1
#define CAP_MASK(tok) MASK(CAP_##tok##_START, CAP_##tok##_LEN) -#define CAP_GET(tok, cap) (((cap).CAP_##tok##_FIELD & CAP_MASK(tok)) \ +#define CAP_GET(tok, cap) (((cap)->CAP_##tok##_FIELD & CAP_MASK(tok)) \ >> CAP_##tok##_START)
#define CTXSIZE(xhci) (CAP_GET(CSZ, (xhci)->capreg) ? 64 : 32) @@ -378,7 +378,7 @@ u32 hccparams; u32 dboff; u32 rtsoff; - } __packed capreg; + } __packed *capreg;
/* opreg is R/W is most places, so volatile access is necessary. volatile means that the compiler seeks byte writes if possible,
Alex Thiessen has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/40895 )
Change subject: libpayload: xhci: Do not memcpy registers ......................................................................
Patch Set 2:
Even though it's already merged, my two cents:
The underlying issue here seems that `phys_to_virt()` returns a non-volatile pointer to things which are volatile. It should probably be split into `phys_mem_to_virt()` (same as `phys_to_virt()` today) and `phys_dev_to_virt()` (returning a volatile pointer) to catch such bugs in build, not in development or even production. A compiler warning would then be triggered, like "passing argument 2 of ‘memcpy’ discards ‘volatile’ qualifier from pointer target type [-Wdiscarded-qualifiers]"
9elements QA has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/40895 )
Change subject: libpayload: xhci: Do not memcpy registers ......................................................................
Patch Set 2:
Automatic boot test returned (PASS/FAIL/TOTAL): 4/0/4 Emulation targets: "QEMU x86 q35/ich9" using payload TianoCore : SUCCESS : https://lava.9esec.io/r/2914 "QEMU x86 q35/ich9" using payload SeaBIOS : SUCCESS : https://lava.9esec.io/r/2913 "QEMU x86 i440fx/piix4" using payload SeaBIOS : SUCCESS : https://lava.9esec.io/r/2912 "QEMU AArch64" using payload LinuxBoot_u-root_kexec : SUCCESS : https://lava.9esec.io/r/2911
Please note: This test is under development and might not be accurate at all!
Julius Werner has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/40895 )
Change subject: libpayload: xhci: Do not memcpy registers ......................................................................
Patch Set 2:
The underlying issue here seems that `phys_to_virt()` returns a non-volatile pointer to things which are volatile. It should probably be split into `phys_mem_to_virt()` (same as `phys_to_virt()` today) and `phys_dev_to_virt()` (returning a volatile pointer) to catch such bugs in build, not in development or even production. A compiler warning would then be triggered, like "passing argument 2 of ‘memcpy’ discards ‘volatile’ qualifier from pointer target type [-Wdiscarded-qualifiers]"
I'm not sure if we're really using volatile in that capacity enough to make that useful. The USB stack is kind of a special case -- most other places where we do MMIO instead use the read32()/write32() accessors which cast to volatile internally. Really, for consistency we should probably rewrite the USB stuff to use those APIs as well. But it has worked for us so far so there is little pressure to invest a lot of effort into changing it.
Julius Werner has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/40895 )
Change subject: libpayload: xhci: Do not memcpy registers ......................................................................
Patch Set 2:
Sorry, I screwed up... this patch broke XHCI completely (although it no longer crashes on Arm). Fixed in CB:40956