I'm in the process of building a new computer based on the P8Z77-V board with an i7-3770 CPU. I am running into odd behavior: the CPU_LED blinks when more than two 8GB RAM modules are installed. This occurs with both the vendor BIOS and the Coreboot build. A friend has the same configuration (mainboard, CPU, 4x8GB memory), and it works just fine, which makes me wonder why my setup is failing. We have gathered serial output (below) to diagnose the issue.
Most notable is that for the failing memory configurations the `[DEBUG] Stored timings CRC16 mismatch.` message is printed, after which `[DEBUG] SPD probe channel0, slot1` is triggered, which seems to fail: `ERROR: SPD CRC failed!!!`
The resulting looping sequence in the logs:
<start of loop>
[NOTE ] coreboot-4.17 Fri Jun 3 03:10:05 UTC 2022 bootblock starting (log level: 7)...
[DEBUG] FMAP: Found "FLASH" version 1.1 at 0x710000.
[DEBUG] FMAP: base = 0xff800000 size = 0x800000 #areas = 4
[DEBUG] FMAP: area COREBOOT found @ 710200 (982528 bytes)
[INFO ] CBFS: mcache @0xfeff0e00 built for 13 files, used 0x2dc of 0x4000 bytes
[INFO ] CBFS: Found 'fallback/romstage' @0x80 size 0x14c38 in mcache @0xfeff0e2c
[DEBUG] BS: bootblock times (exec / console): total (unknown) / 43 ms
[NOTE ] coreboot-4.17 Fri Jun 3 03:10:05 UTC 2022 romstage starting (log level: 7)...
[DEBUG] SMBus controller enabled
[DEBUG] Setting up static northbridge registers... done
[DEBUG] Initializing Graphics...
[DEBUG] Back from systemagent_early_init()
[INFO ] Intel ME early init
[INFO ] Intel ME firmware is ready
[DEBUG] ME: Requested 0MB UMA
[DEBUG] Starting native Platform init
[DEBUG] DMI: Running at X4 @ 5000MT/s
[DEBUG] FMAP: area RW_MRC_CACHE found @ 700000 (65536 bytes)
[DEBUG] Stored timings CRC16 mismatch.
[DEBUG] ECC supported: no ECC forced: no
[INFO ] ECC RAM unsupported.
[DEBUG] SPD probe channel0, slot0
[DEBUG] Revision : 11
[DEBUG] Type : b
[DEBUG] Key : 2
[DEBUG] Banks : 8
[DEBUG] Capacity : 4 Gb
[DEBUG] Supported voltages : 1.5V
[DEBUG] SDRAM width : 8
[DEBUG] Bus extension : 0 bits
[DEBUG] Bus width : 64
[DEBUG] FTB timings : yes
[DEBUG] Optional features : DLL-Off_mode RZQ/7 RZQ/6
[DEBUG] Thermal features : PASR ext_temp_range
[DEBUG] Thermal sensor : no
[DEBUG] Standard SDRAM : yes
[DEBUG] Rank1 Address bits : mirrored
[DEBUG] DIMM Reference card: B
[DEBUG] Manufacturer ID : 9e02
[DEBUG] Part number : CML16GX3M2A1600C
[DEBUG] XMP Profile : 1
[DEBUG] Max DIMMs/channel : 1
[DEBUG] XMP Revision : 1.3
[DEBUG] Requested voltage : 1500 mV
[DEBUG] XMP profile supports 1 DIMMs, but 2 DIMMs are installed.
[WARN ] XMP maximum DIMMs will be ignored.
[INFO ] Row addr bits : 16
[INFO ] Column addr bits : 10
[INFO ] Number of ranks : 2
[INFO ] DIMM Capacity : 8192 MB
[INFO ] CAS latencies : 6 9
[INFO ] tCKmin : 1.250 ns
[INFO ] tAAmin : 11.250 ns
[INFO ] tWRmin : 15.000 ns
[INFO ] tRCDmin : 11.250 ns
[INFO ] tRRDmin : 7.500 ns
[INFO ] tRPmin : 11.250 ns
[INFO ] tRASmin : 30.000 ns
[INFO ] tRCmin : 50.625 ns
[INFO ] tRFCmin : 260.000 ns
[INFO ] tWTRmin : 7.500 ns
[INFO ] tRTPmin : 7.500 ns
[INFO ] tFAWmin : 37.500 ns
[INFO ] tCWLmin : 10.000 ns
[INFO ] tCMDmin : 2
[DEBUG] channel[0] rankmap = 0x3
[DEBUG] SPD probe channel0, slot1
[DEBUG] ERROR: SPD CRC failed!!!
[DEBUG] Revision : 11
[DEBUG] Type : b
[DEBUG] Key : 2
[DEBUG] Banks : 8
[DEBUG] Capacity : 4 Gb
[DEBUG] Supported voltages : 1.5V
[DEBUG] SDRAM width : 8
[DEBUG] Bus extension : 0 bits
[DEBUG] Bus width : 64
<end of loop, restarting Coreboot>
Mainboard: ASUS P8Z77-V
CPU: i7-3770
RAM: 4x8GB DDR3 Corsair CML16GX3M2A1600C9
Does anybody have advice on what could be causing the `SPD CRC failed!!!` error? My next idea is to replace the CPU, as the memory seems fine to swap around in most configurations. Other suspects are the motherboard, or something interfering with the readout.
Is there an easy way to hardcode my way out of it for now? The RAM will stay in place once configured, so I could even do a custom Coreboot build.
Best, Nico
On Wed, 27 Jul 2022 at 15:32, Nico Rikken nico@nicorikken.eu wrote:
I'm in the process of building a new computer based on the P8Z77-V board with an i7-3770 CPU. I am running into odd behavior: the CPU_LED blinks when more than two 8GB RAM modules are installed.
I was working on a P8Z77 Pro board with a 3770 not so long ago and ran into a similar issue - 2x 8GB (1600) sticks worked fine, but with more than that it hung. I did not get round to getting console output or digging too deep into the why, but tried physical fixes first. I tried various permutations and found that 16GB (2x8) at 1600 was the limit. I *think* (and don't quote me, it was a while ago) I managed to get 32GB (4x8) working with 1333 sticks. I could be having a memory issue though (pun intended). I was going to try to use MRC instead of native raminit but got sidetracked and never figured it out - this page mentions the MRC binary systemagent-r6.bin: https://doc.coreboot.org/mainboard/asus/p8z77-m_pro.html
Thanks Simon for the quick reply.
Not sure where the similarities end between the P8Z77-V and the P8Z77-M PRO.
The board manual even lists 4x 8GB 1866 C9 support: CORSAIR CMT32GX3M4X1866C9(XMP) 32GB (4x 8GB) DS - - 9-10-9-27. See https://dlcdnets.asus.com/pub/ASUS/mb/LGA1155/P8Z77-V/E7074_P8Z77-V.pdf
With CONFIG_NATIVE_RAMINIT_IGNORE_XMP_MAX_DIMMS=n the DIMMs should be trained at 1333MHz, but that didn't work either.
In my case even the vendor BIOS wasn't capable of booting with the RAM, so I guess that is a red flag already.
I just tried a different CPU, with the exact same behavior as a result. So that most certainly rules out CPU failures. That leaves a combination of motherboard, RAM and firmware.
Are there methods to prevent the reset from occurring, or to hardcode a certain configuration to skip training and the time it takes?
Best, Nico
I tried a different motherboard with the same RAM DIMMs and that worked just fine. So it seems to be related to the motherboard. I still have the malfunctioning motherboard, so perhaps I'll do some debugging in the future.
Best, Nico
Nico Rikken wrote:
I tried a different motherboard with the same RAM DIMMs and that worked just fine. So it seems to be related to the motherboard. I still have the malfunctioning motherboard, so perhaps I'll do some debugging in the future.
Do you have access to an oscilloscope or a (simple) logic analyzer, something like a Saleae Logic or even a basic Cypress CY3689 discovery kit, that can be used with sigrok?
It would be interesting to know what the SPD signals look like and also what bytes coreboot reads from the DIMMs on the problem board.
//Peter
Thanks Peter for the interest.
I don't have a logic analyzer lying around. I can contact my local hackerspace (of which I am not a member). Or if somebody close to Arnhem, NL is willing to help, that would be great as well. But if it doesn't cost too much (< 100 euro) I'm happy to just purchase one.
Any logic analyzer you can recommend for sporadic Coreboot usage? What kind of specs should it meet? I see a lot of those simple 24MHz 8CH models being offered cheaply and I have used one of those in the past, but I'm not sure if that would be sufficient. The Coreboot wiki doesn't list specific recommendations: https://www.coreboot.org/Developer_Manual/Tools
Do you have any pointers on where I should measure on the board to capture those signals?
It might take some months before I'll get to this, if ever, just to manage expectations. But if it helps the Coreboot project by debugging I'm willing to do my part.
Best, Nico
Hi Nico,
Most notable is that for the failing memory configurations the `[DEBUG] Stored timings CRC16 mismatch.` message is printed, after which `[DEBUG] SPD probe channel0, slot1` is triggered, which seems to fail: `ERROR: SPD CRC failed!!!`
Had a quick look at the code that prints this error message (src/device/dram/ddr3.c) and this is very likely due to bad SPD data/checksum on the DIMM on channel 0, slot 1. The SPD EEPROM has a checksum and that checksum doesn't seem to match the calculated one. So this probably isn't a problem with the board or the coreboot port for it.
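For anyone who wants to repeat that check offline on a raw SPD dump, here is a minimal Python sketch. It assumes the JEDEC DDR3 SPD layout as I understand it: a CRC-16 with polynomial 0x1021 and initial value 0, covering bytes 0..125 (or 0..116 when bit 7 of byte 0 is set), stored LSB-first in bytes 126 and 127. The function names are made up for illustration and are not coreboot's.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC-16 with polynomial 0x1021, initial value 0, no bit reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def ddr3_spd_crc(spd: bytes) -> tuple[int, int]:
    """Return (calculated, stored) CRC for a raw DDR3 SPD image."""
    if len(spd) < 128:
        raise ValueError("need at least the first 128 SPD bytes")
    # Assumption: bit 7 of byte 0 selects the CRC coverage,
    # 0 -> bytes 0..125, 1 -> bytes 0..116.
    covered = 117 if spd[0] & 0x80 else 126
    calculated = crc16_xmodem(spd[:covered])
    # Assumption: stored CRC has its LSB in byte 126 and MSB in byte 127.
    stored = spd[126] | (spd[127] << 8)
    return calculated, stored


def ddr3_spd_crc_ok(spd: bytes) -> bool:
    """True when the stored CRC matches the one calculated from the data."""
    calculated, stored = ddr3_spd_crc(spd)
    return calculated == stored
```

Reading a dump with `open("dimm_slot1.bin", "rb").read()` (a hypothetical file name) and passing it to `ddr3_spd_crc_ok()` would show whether the image itself is inconsistent, independent of how the board reads it.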
I assume that there was no previous successful boot with the given RAM configuration, so the "Stored timings CRC16 mismatch" likely isn't the problem, but just means that the MRC cache contents are outdated and that the full SPD EEPROM contents have to be read from the DIMMs and a full DRAM training has to be done in this case.
I'm a bit surprised that only some of the DIMMs work despite them all being of the same type. Maybe try dumping and comparing the SPD EEPROM contents; maybe it's corrupted on one of the DIMMs. When the DIMMs are all identical, taking a good SPD image from one and flashing it to the corrupted one might be worth a try; make sure that the DIMMs are exactly the same before trying this though and always have backups of the previous contents.
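A rough sketch of such a comparison in Python, assuming two raw SPD dumps are already available as bytes. It assumes, from the JEDEC DDR3 layout as I understand it, that bytes 119..127 (manufacturing location, date, serial number and CRC) legitimately differ between otherwise identical modules. The names below are hypothetical.

```python
# Assumption: these offsets hold per-module data (manufacturing location,
# manufacturing date, serial number, CRC) and may differ between good DIMMs.
PER_MODULE_BYTES = set(range(119, 128))


def diff_spd(good: bytes, suspect: bytes) -> list[tuple[int, int, int, bool]]:
    """List (offset, good_byte, suspect_byte, is_per_module) for each differing byte."""
    length = min(len(good), len(suspect))
    return [
        (i, good[i], suspect[i], i in PER_MODULE_BYTES)
        for i in range(length)
        if good[i] != suspect[i]
    ]


def unexpected_diffs(good: bytes, suspect: bytes) -> list[tuple[int, int, int, bool]]:
    """Only the differences outside the per-module bytes."""
    return [d for d in diff_spd(good, suspect) if not d[3]]


def format_diffs(diffs: list[tuple[int, int, int, bool]]) -> str:
    """Render the differences as one line per byte for easy reading."""
    lines = []
    for offset, a, b, per_module in diffs:
        note = " (per-module data)" if per_module else ""
        lines.append(f"byte {offset:3d}: {a:#04x} != {b:#04x}{note}")
    return "\n".join(lines)
```

Anything reported by `unexpected_diffs()` would point at corrupted contents rather than normal per-module variation. Note that, under the layout assumed here, the serial number falls inside the CRC-covered range, so an image copied from another DIMM carries that DIMM's serial number along with a CRC that matches it.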
Regards, Felix