On 08.01.2008 01:40, Marc Jones wrote:
Carl-Daniel Hailfinger wrote:
Marc?
While the patch is against the generic x86 CAR code, it can also be easily modified to work with AMD K8/K9/K10 CAR code. Especially the recent K10 commit made AMD CAR an #ifdef mess which could be sorted out nicely.
Yes, It would be good to clean them all up.
OK, will prepare a patch to do so after we have finalized the generic x86 CAR patch.
This patch is part of my quest to clean up those v2 code parts which will someday end up in v3.
A good cause :)
Thanks.
-#if CacheSize == 0x10000 - /* enable caching for 64K using fixed mtrr */ +/* 0x06 is the WB IO type for a given 4k segment.
- segs is the number of 4k segments we want to use for CAR.
- subpart is the nth 32-bit window into IO type configuration.
- reg is the register where the IO type should be stored.
- */
+.macro extractmask segs, subpart, reg +.if \segs - (\subpart * 4) <= 0
- xorl \reg, \reg
+.elseif \segs - (\subpart * 4) == 1
- movl $0x06000000, \reg
+.elseif \segs - (\subpart * 4) == 2
- movl $0x06060000, \reg
+.elseif \segs - (\subpart * 4) == 3
- movl $0x06060600, \reg
+.elseif \segs - (\subpart * 4) >= 4
- movl $0x06060606, \reg
+.endif +.endm
Why not just pass in the number of 4k pieces, \segs - (\subpart * 4)?
To simplify understanding the formula for readers of the code. But I agree moving calculations makes the code more readable in the function above.
+.macro simplemask_helper segs, subpart +.if \subpart & 0x1
- extractmask \segs, \subpart, %eax
+.else
- extractmask \segs, \subpart, %edx
+.endif +.endm +/* size is the cache size in bytes we want to use for CAR.
- part is the nth 64-bit window into IO type configuration.
- */
+.macro simplemask size, part
- simplemask_helper (\size / 0x1000), (\part * 2 + 1)
- simplemask_helper (\size / 0x1000), (\part * 2)
+.endm
The \part stuff isn't really intuitive. I think an .if size > 32K would be better and then the caller doesn't have to know \part. Building on that the macro could fill in ecx as well.
I'll rewrite the patch a bit to make it more intuitive. I disagree with the ".if size > 32k" part, though. But once you see my new code, it may be elegant enough to not worry about the 32k boundary anymore.
Regards, Carl-Daniel