[coreboot] Weird problems with Intel Skylake based board

Christian Gmeiner christian.gmeiner at gmail.com
Thu Oct 11 11:29:49 CEST 2018


On Tue, 25 Sep 2018 at 12:27, Peter Stuge <peter at stuge.se> wrote:
>
> Christian Gmeiner wrote:
> > Most of the time the system works as expected, but from time to time
> > rebooting the system fails completely.
>
> Only ever when rebooting, or does cold boot also fail sometimes?
>

Both fail.

> (Make a test system to cold boot your system in a loop.)
>
>
> > there are two FPGAs connected via PCIe to the system where one is used
> > to reset the system. The reset is done via SYS_RESET#.
>
> Are the FPGAs also reset by that?
>

Yes

> If yes, how long do they need to initialize to where HDL acts
> correctly on PCIe?
>

I would need to measure this, but I would say less than 300 ms.

>
> > Now I run into different kind of issues:
> > - pcie link training fails from time to time
>
> On both links, or only one of them? Can you tell?
>

This is hard to tell. All I see is that the link gets reset quite fast
and quite often.

>
> > - looks like PLTRST# of the sunrisepoint pch holds the system in reset
> >   for minutes.
>
> I don't know if there's enough PCH documentation to know exactly why
> it would do that - but I can imagine that it holds reset as long as
> some conditions are not met, I can also imagine that the FPGAs cause
> some undefined PCIe behavior in the PCH which happens to get it stuck
> in reset for a while.
>
>
> > Are there any hints to debug my issues?
>
> As always, try to isolate the problem.
>
> Can you completely remove one or ideally both FPGAs from the equation?
>
> You mention that one of them is used for reset. At least disable the
> other one, destructively if need be.
>

Over the last few weeks I found the root cause of my problem: PCIe
spread spectrum clocking.

Our FPGAs need a stable 100 MHz PCIe clock to work. The FSP
configuration I used looked like this:

void mainboard_memory_init_params(FSPM_UPD *mupd)
{
    FSP_M_CONFIG *mem_cfg;
    struct spd_block blk = {
        .addr_map = { 0x50 },
    };

    mem_cfg = &mupd->FspmConfig;

    mem_cfg->PegDisableSpreadSpectrumClocking = 1;
    mem_cfg->PchPmPciePllSsc = 0;

    ...
}

With this configuration the PCIe reference clock was off by more than 8%,
which caused the system to hang during cold and warm boots.

In the next step I removed the assignment of PchPmPciePllSsc, as it is
documented as 'No BIOS override'. With this change I got more than 1000
soft and 2000 hard reboots without any problem. Keep in mind that we
started with only 10 successful reboots.
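
For reference, a minimal sketch of what the function looks like after
that change. The two struct definitions are stand-ins I made up so the
snippet compiles on its own; the real FSPM_UPD and FSP_M_CONFIG come
from the FSP headers and have many more fields. The SPD handling and
the rest of the original function are left out.

```c
#include <stdint.h>

/* Stand-in types for illustration only (assumption): the real
 * definitions live in the FSP headers and are much larger. */
typedef struct {
    uint8_t PegDisableSpreadSpectrumClocking;
    uint8_t PchPmPciePllSsc;
} FSP_M_CONFIG;

typedef struct {
    FSP_M_CONFIG FspmConfig;
} FSPM_UPD;

void mainboard_memory_init_params(FSPM_UPD *mupd)
{
    FSP_M_CONFIG *mem_cfg = &mupd->FspmConfig;

    /* Still request that spread spectrum clocking be disabled. */
    mem_cfg->PegDisableSpreadSpectrumClocking = 1;

    /* PchPmPciePllSsc is deliberately not touched any more, so it
     * keeps whatever value the UPD already holds. */
}
```

The only difference to the earlier version is the missing
PchPmPciePllSsc assignment.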

The big problem is that PegDisableSpreadSpectrumClocking has no effect
at all. I measured the frequency and it is not the expected 100 MHz.
I need a stable 100 MHz clock because this clock source is used by the
FPGA to drive its internal clocks. The end result is that EtherCAT is
not able to sync.
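
To put a measured frequency into perspective, here is a small sketch
that converts it into a deviation in ppm from the nominal 100 MHz. It
assumes the +/-300 ppm tolerance commonly quoted for the PCIe reference
clock without spread spectrum; the function and macro names are my own
and not taken from coreboot or the FSP.

```c
#include <stdbool.h>

/* Nominal PCIe reference clock and the tolerance assumed here. */
#define PCIE_REFCLK_NOMINAL_HZ    100000000.0
#define PCIE_REFCLK_TOLERANCE_PPM 300.0

/* Signed deviation of a measured clock from 100 MHz, in ppm. */
double refclk_deviation_ppm(double measured_hz)
{
    return (measured_hz - PCIE_REFCLK_NOMINAL_HZ)
           / PCIE_REFCLK_NOMINAL_HZ * 1e6;
}

/* True if the measured clock is within the assumed tolerance. */
bool refclk_within_tolerance(double measured_hz)
{
    double ppm = refclk_deviation_ppm(measured_hz);

    if (ppm < 0)
        ppm = -ppm;
    return ppm <= PCIE_REFCLK_TOLERANCE_PPM;
}
```

As an illustrative value only, a clock that is 8% low (92 MHz) comes
out at roughly -80000 ppm, far outside such a tolerance.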

I tried to change the ICC config via MEI messaging, but I am not able
to change the clock settings, even though I get a successful return code.

-- 
greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info


