Hi Sven,
On 11.06.21 00:55, Sven Semmler wrote:
this is my first time posting here and it is quite possible that I've overlooked something obvious. In that case please just point me to whatever I should have read and accept my apologies.
don't worry. If this were documented, I would have missed it too :)
On my ThinkPad T430 running Coreboot-4.8.1 as part of a Heads install, I see these error messages when turning on the PC:
mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank: 7: ee20000000 3110a
mce: [Hardware Error]: TSC 0 ADDR fefe78c0 MISC 3880000086
mce: [Hardware Error]: PROCESSOR 0:306a9 TIME 1622589409 SOCKET 0
mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank: 8: ee20000000 3110a
mce: [Hardware Error]: TSC 0 ADDR fefe7880 MISC 3880000086
mce: [Hardware Error]: PROCESSOR 0:306a9 TIME 1622589409 SOCKET 0 APIC 0 microcode 1f
I do not think it is an issue with the actual RAM:
Indeed, this is not about the actual (D)RAM. One can tell from the address alone: 0xfefe.... is part of what we call the I/O hole, a region of the memory address space that is reserved for various purposes.
More specifically, 0xfefe0000..0xfeffffff is a range used for cache-as-RAM (CAR), a mode where the processor cache is used as RAM before the actual DRAM is available.
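To make the address argument concrete, here is a minimal Python sketch that checks the two reported ADDR values against the CAR range given above. The helper names `in_car_range` and `car_offset` are made up for illustration; only the range and the two addresses come from this thread.

```python
# CAR range as quoted in this mail; both bounds are inclusive.
CAR_BASE = 0xFEFE0000
CAR_END = 0xFEFFFFFF


def in_car_range(addr: int) -> bool:
    """Return True if addr lies inside the cache-as-RAM window."""
    return CAR_BASE <= addr <= CAR_END


def car_offset(addr: int) -> int:
    """Return the offset of addr from the start of the CAR window."""
    return addr - CAR_BASE


# The two ADDR values from the reported MCE records.
for addr in (0xFEFE78C0, 0xFEFE7880):
    print(f"{addr:#x}: in CAR range = {in_car_range(addr)}, "
          f"offset = {car_offset(addr):#x}")
```

Both addresses fall inside the window, and they are exactly 0x40 (64) bytes apart, which would correspond to two neighbouring 64-byte cache lines.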
I have seen these MCEs before but never investigated them. They might affect the stability of coreboot, but it seems less likely that they affect the running OS once the system has booted successfully.
Intel's x86 Software Developer's Manual (SDM) should explain how to decode these MCEs.
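As a starting point for such decoding, the following Python sketch splits a 64-bit IA32_MCi_STATUS value into its architectural flag bits and the two 16-bit error code fields, as I understand the layout from the SDM. The function name `decode_mci_status` is made up, and the example value is constructed for illustration, not taken from the log above. Interpreting the MCA error code itself still requires the SDM tables.

```python
# Architectural flag bits in IA32_MCi_STATUS (bits 63..57).
STATUS_FLAGS = {
    63: "VAL",    # register contains valid error information
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # error reporting was enabled for this error
    59: "MISCV",  # the corresponding MISC register is valid
    58: "ADDRV",  # the corresponding ADDR register is valid
    57: "PCC",    # processor context may be corrupt
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit status value into flags and error code fields."""
    flags = [name
             for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
             if (status >> bit) & 1]
    return {
        "flags": flags,
        "mca_error_code": status & 0xFFFF,              # bits 15:0
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }


# Hypothetical status value, built by hand for illustration only.
example = (1 << 63) | (1 << 61) | (1 << 58) | 0x0005
print(decode_mci_status(example))
```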
Some things to test come to mind: Does it report the same addresses on every boot? If so, one could try a write/read test of these addresses early, right after CAR is set up.
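For the first question, a small Python sketch like the one below could collect the ADDR values from saved kernel logs so that two boots can be compared. It assumes one mce record per line, in the format shown in the report; the names `mce_addresses`, `boot_a` and `boot_b` are made up for illustration, and in practice the strings would come from saved `dmesg` output of separate boots.

```python
import re
from typing import Set

# Matches the "ADDR <hex>" field of an mce log line.
ADDR_RE = re.compile(r"\bADDR ([0-9a-fA-F]+)")


def mce_addresses(log_text: str) -> Set[int]:
    """Collect all ADDR values from '[Hardware Error]' lines in a log."""
    addrs = set()
    for line in log_text.splitlines():
        if "[Hardware Error]" not in line:
            continue
        for match in ADDR_RE.finditer(line):
            addrs.add(int(match.group(1), 16))
    return addrs


# Sample input: the two ADDR lines from the report, used for both "boots"
# here only to show the comparison.
boot_a = """\
mce: [Hardware Error]: TSC 0 ADDR fefe78c0 MISC 3880000086
mce: [Hardware Error]: TSC 0 ADDR fefe7880 MISC 3880000086
"""
boot_b = boot_a

same = mce_addresses(boot_a) == mce_addresses(boot_b)
print("addresses:", sorted(hex(a) for a in mce_addresses(boot_a)))
print("same on both boots:", same)
```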
Nico