Hello,
Years ago, I developed a BadRAM patch that enabled Linux to run in spite of broken memory chips. I am now considering bringing it into coreboot.
http://rick.vanrein.org/linux/badram/
BadRAM uses a very terse list of address/mask pairs to describe faulty locations. I am talking to JEDEC, trying to get them to standardise a format for storing this list in the SPD EEPROMs on DIMMs. Reading these EEPROMs is a task best left to the BIOS, and coreboot could become a suitable implementation of BIOS-based (OS-independent) BadRAM.
After browsing through the coreboot code, the best approach seems to be to call lb_remove_memory_range() on ranges that are faulty, after having included all the memory on each DIMM. coreboot would then deliver a memory map with more regions than in a usual setup. For example, if one row is marked bad, and it consists of 4096 columns, there may be as many as 4096 ranges marked bad. This would make the memory map expand -- but in a payload-compatible manner.
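To make the expansion concrete, here is a rough sketch in C of how one address/mask pair could be unfolded into contiguous ranges. It assumes the usual BadRAM reading of a pair (an address a is faulty when (a & mask) == (addr & mask)). The names badram_expand, bad_range_fn, range_log and log_range are made up for illustration; the callback only stands in for whatever call would remove a range from the table, and I have not tied it to the actual lb_remove_memory_range() signature.

```c
#include <stdint.h>

/* Hypothetical callback: receives one contiguous faulty range. */
typedef void (*bad_range_fn)(uint64_t start, uint64_t size, void *ctx);

/* Expand one BadRAM pattern into contiguous ranges.
 * Assumed semantics: a is faulty when (a & mask) == (addr & mask).
 * addr_bits is the number of address bits covered; must be 1..63.
 * Returns the number of ranges reported through fn. */
uint64_t badram_expand(uint64_t addr, uint64_t mask, unsigned addr_bits,
                       bad_range_fn fn, void *ctx)
{
    uint64_t span = (1ULL << addr_bits) - 1;  /* all valid address bits */
    uint64_t free_bits = ~mask & span;        /* don't-care bits */
    uint64_t size = 1;
    uint64_t high, sub = 0, count = 0;

    /* The run of don't-care bits at the bottom makes each range contiguous. */
    while ((size & span) && (free_bits & size))
        size <<= 1;

    /* Every combination of the remaining don't-care bits is a separate range. */
    high = free_bits & ~(size - 1);
    do {
        fn((addr & mask & span) | sub, size, ctx);
        count++;
        sub = (sub - high) & high;  /* next subset of the high bits, 0 when done */
    } while (sub != 0);

    return count;
}

/* Illustrative collector: remembers the first and last range seen. */
struct range_log {
    uint64_t first_start, last_start, size, count;
};

void log_range(uint64_t start, uint64_t size, void *ctx)
{
    struct range_log *log = ctx;

    if (log->count == 0)
        log->first_start = start;
    log->last_start = start;
    log->size = size;
    log->count++;
}
```

With addr=0x1230, mask=0xF0F0 and 16 address bits, this reports 16 ranges of 16 bytes each, the first starting at 0x1030 and the last at 0x1F30. The number of ranges doubles with every don't-care bit above the lowest run, which is where a count like 4096 comes from.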
I am not sure how the memory map is used after booting:
- Does the memory map feed the e820 BIOS call (through openbios, say)?
- Will all payloads use coreboot's memory map, one way or another?
- Is the memory map reclaimed after booting, and if so, when?
- Would a memory map of, say, 4097 entries (so, 81 kB) ever be problematic?
If the memory for an expanded memory map is wasted, it's probably not ideal to do the expansion in coreboot. In that case, would it be a reasonable solution to read the SPD-stored information about BadRAM patterns and move it to either the lb_memory_range structure or a separate struct that lists pairs of address/mask? coreboot is the best place to make such a translation.
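As a sketch of the second option, a separate table of address/mask pairs could look roughly like this in C. The struct badram_pattern and the function badram_is_faulty are hypothetical names, not existing coreboot code, and the matching rule is again the assumed BadRAM semantics: a pattern covers an address when (phys & mask) == (addr & mask).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical compact record: one BadRAM pattern, 16 bytes. */
struct badram_pattern {
    uint64_t addr;  /* reference address */
    uint64_t mask;  /* 1 bits must match addr, 0 bits are don't-care */
};

/* Returns 1 if phys is covered by any of the n patterns, 0 otherwise.
 * Note that a pattern with mask 0 covers every address. */
int badram_is_faulty(const struct badram_pattern *tab, size_t n, uint64_t phys)
{
    for (size_t i = 0; i < n; i++) {
        /* XOR exposes differing bits; the mask keeps only the ones that matter. */
        if (((phys ^ tab[i].addr) & tab[i].mask) == 0)
            return 1;
    }
    return 0;
}
```

In this form a whole bad row stays a single 16-byte entry, whatever the number of columns, and the consumer (a payload or the OS) decides whether and when to expand it.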
The reason for putting the BadRAM address/mask pairs in SPD is that it carries the fault knowledge onto the DIMM itself, in a portable way. If this weren't so ideal, I'd propose putting the whole thing in NVRAM; that however would make the BadRAM patterns dedicated to a machine, and it would disable usage patterns where broken memory would be plugged in as if it were the most normal case in the world. I'd love to make the use of broken memory chips as commonplace as possible, so as to avoid the environmental damage caused by making (memory) chips.
Your responses are kindly welcomed.
Thanks,
Rick van Rein
GroenGemak
http://groengemak.nl/en/