Carl-Daniel Hailfinger wrote:
Hi Marc,
I'm currently working to unify K8 and Fam10 CAR to use the same code at runtime (as opposed to buildtime #ifdefs). While this may not be a goal for v2, I definitely want to try it for v3.
A few questions/comments about the CAR code:
- Only Fam10 APs are treated specially. APs of older generations seem to be unhandled. Did older generations treat each core as BSP (code seems to suggest that) or were there other special provisions?
I don't know. I haven't used or worked on that code. YH would be the better person to ask. For the fam10 code there are some settings that can only be set from the AP cores.
- "Errata 193: Disable clean copybacks to L3 cache to allow cached ROM."
Erratum 193 seems to be unlisted in public data sheets. If it is the famous L3 problem, we might want to enable the workaround only on affected revisions.
This is an erratum for early silicon, which is why it isn't in the public rev guide. It is a fix for caching instructions while in CAR mode. It can be removed. All Ax support could be removed.
- CAR goes from 0xC8000 to 0xCFFFF. Assuming GlobalVarSize=0 (untrue, but easier to calculate), BSP stack will be from 0xCC000 to 0xCFFFF and AP stacks will be below 0xCBFFF.
- With the current settings (32k CAR total, 1k per AP, 16K for the BSP) the scheme will fall apart if the highest NodeID shifted by the number of CoreID bits is 16 or higher. The BKDG indicates that the number of CoreID bits is 2, so a NodeID of 4 or higher will break.
Yes. This was sufficient for the K8 and was not changed when I added fam10. Eight dual-core K8s was the most you could have. It could probably be expanded into the rest of the shadow hole (up to FFFFF) if needed. The reason to keep it in the hole is for memory eye finding that will happen from 1MB to TOM.
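To illustrate the arithmetic in the point above, here is a minimal sketch in C. It assumes GlobalVarSize=0 and that AP stacks are laid out downward from the bottom of the BSP stack, one 1k slot per (NodeID << CoreID bits | CoreID) index. The function names and the slot mapping are hypothetical and not taken from the actual CAR code.

```c
#include <stdint.h>

/* Values quoted in this thread; GlobalVarSize is assumed to be 0. */
#define CAR_BASE        0xC8000u
#define CAR_SIZE        0x8000u   /* 32k CAR total */
#define BSP_STACK_SIZE  0x4000u   /* 16k for the BSP */
#define AP_STACK_SIZE   0x400u    /* 1k per AP */
#define CORE_ID_BITS    2u        /* per the BKDG, as stated above */

/* Hypothetical slot index: NodeID shifted by the CoreID bits, plus CoreID. */
unsigned ap_slot(unsigned node_id, unsigned core_id)
{
    return (node_id << CORE_ID_BITS) | core_id;
}

/* Number of 1k AP stack slots that fit below the BSP stack. */
unsigned max_ap_slots(void)
{
    return (CAR_SIZE - BSP_STACK_SIZE) / AP_STACK_SIZE;
}

/* Top of the stack for a slot (assumed layout: slot 0 ends at 0xCC000). */
uint32_t ap_stack_top(unsigned slot)
{
    uint32_t bsp_stack_base = CAR_BASE + CAR_SIZE - BSP_STACK_SIZE;
    return bsp_stack_base - slot * AP_STACK_SIZE;
}

/* Nonzero if the whole 1k stack of this slot stays inside the CAR area. */
int ap_stack_fits(unsigned slot)
{
    return slot < max_ap_slots();
}
```

Under these assumptions there are 16 slots, so ap_stack_fits(ap_slot(3, 3)) holds while ap_stack_fits(ap_slot(4, 0)) does not, which matches the NodeID 4 limit described above.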
- There is no good place to store the printk() buffer in CAR. On Geode and i586, the printk buffer runs from the lowest address of the CAR area to the middle. Keeping that design will result in the AP stacks colliding with the printk buffer. Limiting the size of the printk buffer dynamically would work unless there are more than 15 cores in the system, where even a printk buffer of zero size would clobber one AP stack. The other alternative is to keep the printk buffer size fixed and let the AP stacks eat into BSP stack space.
This was the problem I mentioned when you were doing the printk() buffer. You are not guaranteed the use of the cache.
- Is there any reason on any K8 or later processor supported by the current CAR code not to use 64k CAR?
To leave room for APs? There may also have been some concern about small-cache versions being introduced.
- Is 1k enough stack for the APs, given some stack-heavy functions in v3?
I don't know for sure but I would expect it to be ok.
- Can the K8 processors work reliably with 0x1e1e1e1e settings in the fixed MTRR or can the Fam10 processors work with 0x06060606?
No.
Marc