On 06/15/2018 01:36 PM, qtux wrote:
On 15/06/18 13:51, Kyösti Mälkki wrote:
On Fri, Jun 15, 2018 at 2:14 PM, qtux mail@qtux.eu wrote:
Coreboot did work well, but sometimes froze during boot at the resource assignment step (more or less exactly after assigning the PCI 14.3 or PNP 002e.2 device, which happen to be close to each other inside the devicetree). I had to remove the power cord in order to be able to boot again (or to get the next random freeze...). Rarely, after such a recovery, I got flooded by IOMMU warnings in Linux which would only disappear after another reboot.
Ah, that resume reboot-loop issue. The bit that requests an S3 resume lives in a sticky register backed by the Vstb rail. With [2] you should at least not need to do a full power cycle. We should extend this work to other platforms.
I am not sure whether the term resume reboot-loop applies to my issue (side note: I used a serial connection to monitor the boot process):
Rebooting (by holding the power button for some seconds) after encountering a freeze (i.e. stopping at the assign resources step) resulted in no serial output at all. I could repeat this with no effect, the computer seemed to be dead. Only removing the power cord resolved the issue.
This issue could occur not only when rebooting but also when cold booting.
One of these boards had LPC-related lockups. I think the solution was to disable the serial console or to set the console to a low loglevel.
If that is the case I would suggest altering the default board configuration to reflect this. Nevertheless, it seems strange to me that such a lockup disappeared after changing SPI chips.
Answers are inside the text. I forgot to mention that I am currently on commit 793ae846e8.
Let's take the parent of that, commit 4a027e6e -- the one you refer to only appears on a Gerrit review branch.
Kyösti
That is fine, sorry for the misleading commit hash.
Cheers, Matthias
I've actually done a lot of investigation of this issue over time, including work with a protocol analyzer. What happens is that at some point (not sure where or why) the SPI flash reads start occurring at addresses that are completely invalid -- basically, the board starts reading compressed ramstage data during early romstage execution. This could be down to an SPI misread somewhere, but more importantly it appears that:
1.) The SPI controller in the southbridge receives at least some leakage power from +5VSB, and retains state as a result while the PSU is plugged in.
2.) The board reset lines don't fully reset the SPI controller in the southbridge.
3.) There is an as-yet undetermined command sequence that will hard lock the SPI controller.
Much of this had been mitigated by making sure the ROM bridge in the southbridge never switched to LPC Flash mode, and I'm surprised to hear these issues are still happening.
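As a rough illustration of the symptom described above (reads landing in ramstage data while romstage is still executing), a logged list of SPI read addresses could be checked against the flash layout. This is only a sketch under stated assumptions: the region names, the offsets, and the functions `region_of` and `unexpected_reads` are hypothetical and are not taken from coreboot or from any protocol analyzer tool. Real offsets would have to come from the layout of the ROM image under test.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    name: str
    start: int  # inclusive flash offset
    end: int    # exclusive flash offset


# Hypothetical layout for illustration only; replace with the real
# offsets of the image being tested.
REGIONS = [
    Region("ramstage", 0x700000, 0x7C0000),
    Region("romstage", 0x7C0000, 0x7F0000),
    Region("bootblock", 0x7F0000, 0x800000),
]


def region_of(addr: int) -> Optional[str]:
    """Return the name of the region containing addr, or None."""
    for region in REGIONS:
        if region.start <= addr < region.end:
            return region.name
    return None


def unexpected_reads(reads, allowed):
    """Return (index, address, region) for each read outside the allowed regions."""
    flagged = []
    for index, addr in enumerate(reads):
        name = region_of(addr)
        if name not in allowed:
            flagged.append((index, addr, name))
    return flagged
```

With a capture taken during early romstage, one would pass something like `allowed={"bootblock", "romstage"}`; any flagged entry then marks a read that went somewhere it should not have at that point in the boot.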
-- 
Timothy Pearson
Raptor Engineering
+1 (415) 727-8645 (direct line)
+1 (512) 690-0200 (switchboard)
https://www.raptorengineering.com