the following patch was just integrated into master: commit 0f56ebbc8996cf760b6a2486f730f47664780568 Author: Sven Schnelle svens@stackframe.org Date: Fri Jun 17 20:47:08 2011 +0200
SMM: flush caches after disabling caching
Fixes spurious SMI crashes i've seen, and ACPI/SMM interaction. For reference, the mail i've sent to ML with the bugreport:
whenever i've docked/undocked the thinkpad from the docking station, i had to do that twice to get the action actually to happen.
First i thought that would be some error in the ACPI code. Here's a short explanation how docking/undocking works:
1) ACPI EC Event 0x37 Handler is executed (EC sends event 0x37 on dock) 2) _Q37 does a Trap(SMI_DOCK_CONNECT). Trap is declared as follows:
a) Store(Arg0, SMIF) // SMIF is in the GNVS Memory Range b) Store(0, 0x808) // Generates I/O Trap to SMM c) // SMM is executed d) Return (SMIF) // Return Result in SMIF
I've verified that a) is really executed with ACPI debugging in the Linux Kernel. It writes the correct value to GNVS Memory. After that, i've logged the SMIF value in SMM, which contains some random (or former) value of SMIF.
So i've added the GNVS area to /proc/mtrr which made things work. I've also tried a wbinvd() in SMM code, with the same result.
After reading the src/cpu/x86/smm/smmhandler.S code, i've recognized that it starts with:
movw $(smm_gdtptr16 - smm_handler_start + SMM_HANDLER_OFFSET), %bx data32 lgdt %cs:(%bx)
movl %cr0, %eax andl $0x7FFAFFD1, %eax /* PG,AM,WP,NE,TS,EM,MP = 0 */ orl $0x60000001, %eax /* CD, NW, PE = 1 */ movl %eax, %cr0
/* Enable protected mode */ data32 ljmp $0x08, $1f
...which disables caching in SMM code, but doesn't flush the cache.
So the problem is:
- the linux axpi write to the SMIF GNVS Area will be written to Cache, because GNVS is WB - the SMM code runs with cache disabled, and fetches SMIF directly from Memory, which is some other value
Possible Solutions:
- enable cache in SMM (yeah, cache poisoning...)
- flush caches in SMM (really expensive)
- mark GNVS as UC in Memory Map (will only work if OS really marks that Area as UC. Checked various vendor BIOSes, none of them are marking NVS as UC. So this seems rather uncommon.)
- flush only the cache line which contains GNVS. Would fix this particular problem, but users/developers could see other Bugs like this. And not everyone likes to debug such problems. So i won't like this solution.
Change-Id: Ie60bf91c5fd1491bc3452d5d9b7fc8eae39fd77a Signed-off-by: Sven Schnelle svens@stackframe.org
See http://review.coreboot.org/39 for details.
-gerrit