[coreboot] KGPE-D16: Most severe hang failure sorted out + boot time measures

Daniel Kulesz daniel.ina1 at googlemail.com
Tue Mar 21 22:17:47 CET 2017


Hi Timothy,

On Tue, 21 Mar 2017 10:32:46 -0500
Timothy Pearson <tpearson at raptorengineering.com> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 03/12/2017 07:58 PM, Daniel Kulesz via coreboot wrote:
> > Hi folks,
> > 
> > as reported, the KGPE-D16 was mostly unusable for me in my 2x Opteron 6276 + 128 GB RAM configuration as it simply did not boot reliably - even with serial console debugging disabled completely. After experimenting with various config options and comparing my best "known half-working" config from earlier attempts, I finally found out that the hangs were related to the configuration and not to a specific coreboot version.
> > 
> > I attached the configs showing my current "reliable" setup (that survived 10 cold and 10 warm reboots without a single hangup!) and one of the previous "unreliable" setups which often needed several cold boots to successfully boot up once. There are several options which might be responsible for these hangs. Personally, I believe what helps is to completely disable the serial console and not just disable debugging to the serial console.
> > 
> > As asked for previously, I also took some boot time measures from pressing the power button to "grub beep" in my 128 GB RAM configuration. Here they are:
> > 
> > vendor bios, unoptimized with iPXE setup: 59s
> > coreboot, current with the "reliable" config: 73s
> > coreboot, Jan 17 2017, with the "reliable" config: 91s
> > coreboot, current with the "unreliable" config: 131s
> > 
> > I assume that further investigation of the root cause could help to locate the real bug (e.g. in the setup of the serial console). Yet, I hope that having a "working-good" config will be useful for people suffering from the same issue as I did. For me, this setup is still far from being what I expected (memory is clocked too low and idle power consumption is 170W instead of 90W), but at least the machine boots up reliably every time now.
> > 
> > Cheers, Daniel
> > 
> 
> Could you verify something for me?  In internal tests it looks like
> setting CONFIG_SQUELCH_EARLY_SMP resolves the hang with the serial
> console enabled, but I need secondary verification of this due to the
> intermittent nature of the problem.  You seem to be hardest hit by the
> bug so your system should make a good test case.
> 
> Thanks!
> 

Unfortunately, I already had this option enabled in the "config-unreliable" from my initial posting, so the setting does not seem to be effective in stopping the hangs.
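
For anyone who wants to check the same thing against their own .config, here is a minimal sketch in Python. It assumes the usual Kconfig file layout ("CONFIG_FOO=y" for set options, "# CONFIG_FOO is not set" for disabled ones). The helper name kconfig_option_state and the sample text are made up for illustration and are not taken from the attached configs. The serial console option name CONFIG_CONSOLE_SERIAL is also an assumption here, so please check the Kconfig of your own tree.

```python
import re


def kconfig_option_state(config_text, option):
    """Look up one option in Kconfig-style .config text.

    Returns the assigned value (e.g. "y" or a string value without quotes),
    "n" if the file says "# <option> is not set", or None if the option
    does not appear at all.
    """
    assigned = re.compile(r"^%s=(.*)$" % re.escape(option))
    disabled = "# %s is not set" % option
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        match = assigned.match(line)
        if match:
            return match.group(1).strip('"')
        if line == disabled:
            return "n"
    return None


# Illustrative sample only, NOT the contents of config-unreliable.
sample = """\
CONFIG_SQUELCH_EARLY_SMP=y
# CONFIG_CONSOLE_SERIAL is not set
"""

for name in ("CONFIG_SQUELCH_EARLY_SMP", "CONFIG_CONSOLE_SERIAL"):
    print(name, "->", kconfig_option_state(sample, name))
```

To use it on a real file, read the file first, e.g. kconfig_option_state(open("config-unreliable").read(), "CONFIG_SQUELCH_EARLY_SMP"), and compare the result between the "reliable" and "unreliable" configs.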

Cheers, Daniel


