On 15.01.2008 18:54, Marc Jones wrote:
Carl-Daniel Hailfinger wrote:
The BKDG rev. 3.08 for AMD Family 0Fh states that it is possible to use a CAR area with a size of 64K in section 13.16 "Cache Initialization For General Storage During Boot". It also says that during DRAM training CAR size must be reduced. For DDR training, 256 cache lines with L1 cache tag indexes 00h-FFh are reserved and must not be used as CAR. The text then refers to the AMD64 Arch Programmers Manual Vol. 2 for more details on L1 function. However, I couldn't find any explanation why L1 cache tag indexes 00h-FFh correspond to address space C0000h-C3FFFh when fixed size MTRRs are active.
I may be misunderstanding your question but I don't think that tag indexes 00h-ffh have to correspond to C0000h-C3FFFh. I'm also not positive that they must be tag indexes 00h-ffh. I think that they could be on the end as long as the tags are contiguous.
Good to know. Can you make sure such a sentence gets added to the BKDG in its various versions?
This comment refers to DDR training needing the space to hold test patterns for DQS eye finding during memory training. See northbridge\amd\amdk8\raminit_f_dqs.c TrainDQSRdWrPos().
Thanks. It seems I have to reread the code a few times to fully understand its structure. But I have spotted something peculiar in the code of TrainDQSRdWrPos() in src/northbridge/amd/amdk8/raminit_f_dqs.c:
    Errors = 0;
    channel = 0;
    while( (channel<2) && (!Errors)) {
        print_debug_dqs("\tTrainDQSRdWrPos: 1 channel ",channel, 1);
        for(DQSWrDelay = 0; DQSWrDelay < 48; DQSWrDelay++) {
            unsigned err;
            SetDQSDelayAllCSR(ctrl, channel, DQS_WRITEDIR, DQSWrDelay);
            print_debug_dqs("\t\tTrainDQSRdWrPos: 21 DQSWrDelay ", DQSWrDelay, 2);
            err= TrainReadDQS(ctrl, channel, pattern, buf_a, dqs_delay_a, sysinfo);
            print_debug_dqs("\t\tTrainDQSRdWrPos: 22 err ",err, 2);
            if(err == 0) break;
-------------> Now we set "Errors"
            Errors |= err;
        }
        print_debug_dqs("\tTrainDQSRdWrPos: 3 DQSWrDelay ", DQSWrDelay, 1);
        if(DQSWrDelay < 48) {
-------------> Now we overwrite "Errors" in case the for loop above ever had err == 0.
            Errors = TrainWriteDQS(ctrl, channel, pattern, buf_a, dqs_delay_a, sysinfo);
            print_debug_dqs("\tTrainDQSRdWrPos: 4 Errors ", Errors, 1);
        }
        channel++;
        if(!is_Width128){
            //FIXME: 64MuxMode??
            channel++; // skip channel if 64-bit mode
        }
    }
As I understand the logic of the snippet above, we look for a DQSWrDelay which does not give any errors with TrainReadDQS. Then we don't care about errors for other values of DQSWrDelay and use the current value of DQSWrDelay to run TrainWriteDQS. If TrainReadDQS failed for all values of DQSWrDelay, we return the bitwise OR of all error conditions we had for all values of DQSWrDelay. Does that really make sense?
For coreboot, it looks like the test patterns are just pushed onto the stack.
Indeed. So we are completely free to place CAR anywhere we want with any size we want (subject to L2 size restrictions).
For AMD BIOS code, this is not the case and they are put into the cache at a set location. (I think that this is easier for the AGESA asm code to handle that way).
I see.
Thanks for pointing me to the code. I shall add good comments to that code snippet once I have more time.
Regards, Carl-Daniel