Attention is currently required from: Eric Lai, Jérémy Compostella, Kapil Porwal, Pranava Y N.
Subrata Banik has posted comments on this change by Subrata Banik. ( https://review.coreboot.org/c/coreboot/+/84236?usp=email )
Change subject: soc/intel/cmn/block/cpu: Simplify calculation of non-eviction ways
Patch Set 3:
(1 comment)
File src/soc/intel/common/block/cpu/car/cache_as_ram.S:
https://review.coreboot.org/c/coreboot/+/84236/comment/efc32214_a14b8369?usp... : PS1, Line 519: add %edx, %eax
Seems better. Did you run a debugger to verify the register values through this logic?
I don't have a HW debugger to run it, but I used a software analyzer to ensure the register values are correct.
Now we have 3 instructions instead of 4, with an extra label. I originally had a version like this that I discarded, as I preferred not to introduce a new label. I'll let you decide whether you want to proceed with this. I don't see any significant improvement with this new version, but I am not opposed to it.
Snippet 1
```
movl   $0x01, %ecx   # Move the value 1 into the ECX register
cmp    $0x00, %edx   # Compare the value in the EDX register with 0
cmovne %ecx, %edx    # If EDX is not equal to 0, move the value in ECX (which is 1) into EDX
add    %edx, %eax    # Add the value in EDX to the value in EAX
```
Snippet 2
```
testl  %edx, %edx      # Perform a bitwise AND on EDX with itself, setting flags based on the result
jz     skip_increment  # If the zero flag is set (meaning EDX was 0), jump to the skip_increment label
incl   %eax            # Increment the value in EAX by 1
skip_increment:        # Label to jump to if EDX was 0
```
Both snippets essentially achieve the same goal:
- If the value in %edx is zero, don't modify %eax.
- If the value in %edx is non-zero, increment %eax by 1.
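Since no HW debugger was available, the equivalence claimed above can also be cross-checked with a small C model of the two snippets. This is only a sketch: `struct regs`, `snippet1()`, `snippet2()` and `check_snippets_agree()` are hypothetical names made up for illustration, the model tracks only EAX, ECX and EDX, and it ignores the flags register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file: only the registers the two snippets touch. */
struct regs {
	uint32_t eax;
	uint32_t ecx;
	uint32_t edx;
};

/* Model of Snippet 1 (movl / cmp / cmovne / add). */
static struct regs snippet1(struct regs r)
{
	r.ecx = 1;		/* movl $0x01, %ecx */
	if (r.edx != 0)		/* cmp $0x00, %edx */
		r.edx = r.ecx;	/* cmovne %ecx, %edx */
	r.eax += r.edx;		/* add %edx, %eax */
	return r;
}

/* Model of Snippet 2 (testl / jz / incl). */
static struct regs snippet2(struct regs r)
{
	if (r.edx != 0)		/* testl %edx, %edx; jz skip_increment */
		r.eax += 1;	/* incl %eax */
	return r;
}

/* Check that both models leave the same value in EAX for sample inputs. */
static void check_snippets_agree(void)
{
	const uint32_t edx_values[] = { 0, 1, 2, 0x80000000u, 0xffffffffu };
	const uint32_t eax_values[] = { 0, 7, 0xffffffffu };

	for (size_t i = 0; i < sizeof(edx_values) / sizeof(edx_values[0]); i++) {
		for (size_t j = 0; j < sizeof(eax_values) / sizeof(eax_values[0]); j++) {
			struct regs in = {
				.eax = eax_values[j],
				.ecx = 0x12345678u,	/* arbitrary prior content */
				.edx = edx_values[i],
			};
			assert(snippet1(in).eax == snippet2(in).eax);
		}
	}
}
```

In this model the only observable difference is the side effects: Snippet 1 overwrites ECX and normalizes a non-zero EDX to 1, while Snippet 2 leaves both untouched. That only matters if the code following the snippet reads those registers.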
However, the second snippet (testl/jz/incl) is generally considered to be better optimized for a few reasons:
Fewer Instructions: It uses three instructions compared to the four in the first snippet. Fewer instructions generally lead to faster execution.
No Data Movement: The second snippet doesn't need to move any immediate values (like the 0x01) into registers, which can save some execution time.
Leverages Flags: The testl instruction efficiently sets flags based on the value in %edx, and the jz instruction directly uses those flags for conditional branching. This can be more streamlined than the cmp/cmovne approach. Additionally, cmovne is a more complex instruction that has the potential to cause pipeline stalls.