On 10/11/18 11:29 AM, Christian Gmeiner wrote:
During the last few weeks I found the root cause of my problem: PCIe spread spectrum.
Our FPGAs need a stable 100 MHz PCIe clock to work. The FSP configuration we used looked like this:
void mainboard_memory_init_params(FSPM_UPD *mupd)
{
	FSP_M_CONFIG *mem_cfg;
	struct spd_block blk = {
		.addr_map = { 0x50 },
	};

	mem_cfg = &mupd->FspmConfig;
	mem_cfg->PegDisableSpreadSpectrumClocking = 1;
	mem_cfg->PchPmPciePllSsc = 0;
	...
}
With this configuration the PCIe reference clock was off by more than 8%, which caused the system to hang during cold and warm boots.
In the next step I removed the assignment of PchPmPciePllSsc, as it is documented as 'No BIOS override'. With this change I got more than 1000 soft and 2000 hard reboots without any problem. Keep in mind we started with only 10 successful reboots.
Please be more specific about the final setting of this UPD. `No BIOS override` is the documentation for the default value of 0xff. But is this set to the default in the binary? Who knows...
The big problem is that PegDisableSpreadSpectrumClocking has no effect at all. I measured the frequency and it is not the expected 100 MHz. I need a stable 100 MHz because this clock source is used internally by the FPGA to drive internal clocks. The end result is that EtherCAT is not able to sync.
This setting is about a different clock, I guess. Can you please clarify what is connected to which clock on your board?
Nico
On Fri, Oct 12, 2018 at 10:15 AM, Nico Huber nico.h@gmx.de wrote:
Please be more specific about the final setting of this UPD. `No BIOS override` is the documentation for the default value of 0xff. But is this set to the default in the binary? Who knows...
void mainboard_memory_init_params(FSPM_UPD *mupd)
{
	FSP_M_CONFIG *mem_cfg;
	struct spd_block blk = {
		.addr_map = { 0x50 },
	};

	mem_cfg = &mupd->FspmConfig;

	/* Disable PCIe Spread Spectrum Clocking */
	printk(BIOS_ERR, "PegDisableSpreadSpectrumClocking: %x\n",
	       mem_cfg->PegDisableSpreadSpectrumClocking);
	printk(BIOS_ERR, "PchPmPciePllSsc: %x\n", mem_cfg->PchPmPciePllSsc);
	mem_cfg->PegDisableSpreadSpectrumClocking = 1;

	get_spd_smbus(&blk);
	dump_spd_info(&blk);
	assert(blk.spd_array[0][0] != 0);

	mainboard_fill_dq_map_data(&mem_cfg->DqByteMapCh0);
	mainboard_fill_dqs_map_data(&mem_cfg->DqsMapCpu2DramCh0);
	mainboard_fill_rcomp_res_data(&mem_cfg->RcompResistor);
	mainboard_fill_rcomp_strength_data(&mem_cfg->RcompTarget);

	mem_cfg->DqPinsInterleaved = TRUE;
	mem_cfg->MemorySpdDataLen = blk.len;
	mem_cfg->MemorySpdPtr00 = (uintptr_t) blk.spd_array[0];
}
And here is the output taken from cbmem -1:
..
FMAP: base = ff000000 size = 1000000 #areas = 4
FMAP: area RW_MRC_CACHE found @ a50000 (65536 bytes)
MRC: no data in 'RW_MRC_CACHE'
PegDisableSpreadSpectrumClocking: 0
PchPmPciePllSsc: ff
SPD @ 0x50
SPD: module type is DDR4
..
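The "print the default, then override" step used above could be wrapped in a small helper so that every UPD change shows up in the log together with the value the FSP binary shipped with. This is only a sketch: `struct mock_mem_cfg` and `upd_override_u8` are made-up names for illustration, the struct is a stand-in for the two fields discussed here and not the real FSP_M_CONFIG layout, and `printf` stands in for coreboot's `printk`.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the two FSP-M UPD fields discussed in this
 * thread. The real FSP_M_CONFIG has many more members and a different layout.
 */
struct mock_mem_cfg {
	uint8_t PegDisableSpreadSpectrumClocking;
	uint8_t PchPmPciePllSsc;
};

/*
 * Log the value currently stored in the UPD field, then apply the override.
 * Returns the previous value so the caller can compare or restore it.
 * In coreboot the printf below would be a printk call instead.
 */
static uint8_t upd_override_u8(const char *name, uint8_t *field, uint8_t value)
{
	uint8_t old = *field;

	printf("%s: default 0x%02x -> 0x%02x\n", name, old, value);
	*field = value;
	return old;
}
```

Called as `upd_override_u8("PegDisableSpreadSpectrumClocking", &cfg.PegDisableSpreadSpectrumClocking, 1)`, it leaves any field that is not passed in, such as PchPmPciePllSsc, at whatever default the binary provides.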
This setting is about a different clock, I guess. Can you please clarify what is connected to which clock on your board?
Our FPGAs use the 100 MHz PCIe clock as input to drive internal clocks etc. One of these clocks is used for EtherCAT. If spread spectrum is active, the clock is only roughly 100 MHz (the worst measured frequency was ~92 MHz), and as a result the internal FPGA clock is not stable or reliable, so EtherCAT sync fails.
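To put the ~92 MHz figure in perspective, here is a minimal sketch that converts a measured frequency into a deviation in parts per million and checks it against a limit. The function names are made up for illustration. The limits mentioned in the note below (300 ppm reference clock tolerance, 5000 ppm for a 0.5% down-spread) are assumptions based on commonly cited PCIe reference clock figures, not values taken from this thread.

```c
#include <assert.h>

/* Deviation of a measured frequency from the nominal one, in ppm. */
static double deviation_ppm(double measured_hz, double nominal_hz)
{
	assert(nominal_hz > 0.0);
	return (measured_hz - nominal_hz) / nominal_hz * 1e6;
}

/*
 * Returns 1 if the absolute deviation is within limit_ppm, 0 otherwise.
 * The limit is supplied by the caller; any concrete number passed in
 * (e.g. 300 or 5000 ppm) is an assumption, not something this code knows.
 */
static int within_ppm(double measured_hz, double nominal_hz, double limit_ppm)
{
	double dev = deviation_ppm(measured_hz, nominal_hz);

	if (dev < 0.0)
		dev = -dev;
	return dev <= limit_ppm;
}
```

With the numbers from this thread, `deviation_ppm(92e6, 100e6)` is about -80000 ppm (8% low), so `within_ppm(92e6, 100e6, 5000.0)` returns 0: the measured clock is far outside even the assumed 0.5% spread window.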
If I use the following pattern:

	mem_cfg->PegDisableSpreadSpectrumClocking = 1;
	mem_cfg->PchPmPciePllSsc = 0;

I get a stable 100 MHz PCIe clock signal and everything works, except that the device hangs after fewer than 10 warm reboots. It looks like PCIe link training fails countless times (seen with a protocol analyzer).