Julius Werner has uploaded this change for review. ( https://review.coreboot.org/c/coreboot/+/80304?usp=email )
Change subject: commonlib: Add generic word-at-a-time optimization to ipchksum()
......................................................................
commonlib: Add generic word-at-a-time optimization to ipchksum()
This patch adds a generic optimization to calculate a machine-word-sized "wide sum" for the ipchksum() algorithm. This is often not as efficient as handcrafted assembly (about half as fast on arm64 and x86_32, about the same speed on x86_64), but likely still much better than nothing on architectures that we don't have handcrafted assembly for.
Change-Id: I8f0fe117e2788d1b6801b73824b97e1e31ecc694
Signed-off-by: Julius Werner <jwerner@chromium.org>
---
M src/commonlib/bsd/ipchksum.c
1 file changed, 13 insertions(+), 2 deletions(-)
git pull ssh://review.coreboot.org:29418/coreboot refs/changes/04/80304/1
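For readers unfamiliar with the "wide sum" idea in the commit message, here is a minimal standalone sketch. It is not the coreboot code: the names fold16(), sum16_reference() and sum16_wide() are hypothetical, the loads go through memcpy() instead of a pointer cast so the example does not rely on alignment, and the final bitwise complement and the odd trailing byte are left out. It only aims to show that accumulating machine words with an end-around carry and then folding to 16 bits should match a plain 16-bit one's-complement sum.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold a machine-word one's-complement accumulator down to 16 bits. */
static uint16_t fold16(unsigned long wide_sum)
{
	uint32_t sum = 0;

	while (wide_sum) {
		sum += wide_sum & 0xFFFF;
		wide_sum >>= 16;
	}
	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);
	return (uint16_t)sum;
}

/* Reference: add native-endian 16-bit words one at a time.
   Assumes an even size; an odd trailing byte is ignored. */
static uint16_t sum16_reference(const void *data, size_t size)
{
	const unsigned char *p = data;
	uint32_t sum = 0;

	for (size_t i = 0; i + 1 < size; i += 2) {
		uint16_t w;
		memcpy(&w, p + i, sizeof(w));
		sum += w;
		sum = (sum & 0xFFFF) + (sum >> 16);
	}
	return (uint16_t)sum;
}

/* Word-at-a-time variant: same result, fewer loop iterations. */
static uint16_t sum16_wide(const void *data, size_t size)
{
	const unsigned char *p = data;
	unsigned long wide_sum = 0;
	size_t aligned_size = size - size % sizeof(unsigned long);
	size_t i = 0;

	for (; i < aligned_size; i += sizeof(unsigned long)) {
		unsigned long word;
		memcpy(&word, p + i, sizeof(word));
		unsigned long new_sum = wide_sum + word;
		/* Unsigned wraparound means a carry was lost; add it back in. */
		if (new_sum < wide_sum)
			new_sum++;
		wide_sum = new_sum;
	}

	/* Fold the wide accumulator, then finish the leftover 16-bit words. */
	uint32_t sum = fold16(wide_sum);
	for (; i + 1 < size; i += 2) {
		uint16_t w;
		memcpy(&w, p + i, sizeof(w));
		sum += w;
		sum = (sum & 0xFFFF) + (sum >> 16);
	}
	return (uint16_t)sum;
}
```

Both helpers read the buffer in native byte order, so they should agree with each other on little-endian and big-endian hosts alike. The wide sum can be folded this way because 2^16 is congruent to 1 modulo 2^16 - 1, so each 16-bit chunk of the machine word contributes to the checksum with weight 1.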
diff --git a/src/commonlib/bsd/ipchksum.c b/src/commonlib/bsd/ipchksum.c
index b7434e5..8e0aa9b 100644
--- a/src/commonlib/bsd/ipchksum.c
+++ b/src/commonlib/bsd/ipchksum.c
@@ -34,7 +34,7 @@
 		:: "cc"
 		);
 	}
-#elif defined(__i386__) || defined(__x86_64__)
+#elif defined(__i386__) || defined(__x86_64__)	/* __aarch64__ */
 	size_t size8 = size / 8;
 	const uint64_t *p8 = data;
 	i = size8 * 8;
@@ -57,7 +57,18 @@
 		[size8] "+c" (size8)	/* put size in ECX so we can JECXZ */
 		:: "cc"
 	);
-#endif	/* __i386__ || __x86_64__ */
+#else	/* __i386__ || __x86_64__ */
+	size_t aligned_size = ALIGN_DOWN(size, sizeof(unsigned long));
+	const unsigned long *p_long = data;
+	for (; i < aligned_size; i += sizeof(unsigned long)) {
+		unsigned long new = wide_sum + *p_long++;
+		/* Overflow check to emulate a manual "add with carry" in C. The compiler seems
+		   to be clever enough to find ways to elide the branch on most archs. */
+		if (new < wide_sum)
+			new++;
+		wide_sum = new;
+	}
+#endif
 
 	while (wide_sum) {
 		sum += wide_sum & 0xFFFF;