On 06/07/2016 13:04, Laszlo Ersek wrote:
> Actually, I think there is a bug in KVM at the moment. I ran the following test:
> - modified OVMF to set the MSR to value 0x5 on just the BSP
> - booted an i440fx and a Q35 (SMM-enabled) OVMF guest
> - checked "rdmsr -a 0x3a" in both
> - ran "pm-suspend" in both guests, woke them
> - repeated the rdmsr command
> The result is that the BSP had the 0x5 MSR value both after cold boot and after S3 resume. So, KVM does not seem to implement clearing of the MSR.
I suspect that the KVM module is getting in the way. Try removing it or go lower-level. I did the following test:
- Applied the following patch to x86/s3.c from kvm-unit-tests:
diff --git a/x86/s3.c b/x86/s3.c
index cef956e..0f2b48f 100644
--- a/x86/s3.c
+++ b/x86/s3.c
@@ -1,6 +1,7 @@
 #include "libcflat.h"
 #include "x86/acpi.h"
 #include "asm/io.h"
+#include "x86/processor.h"
 
 u32* find_resume_vector_addr(void)
 {
@@ -62,6 +63,7 @@ int main(int argc, char **argv)
 	rtc_out(RTC_HOURS_ALARM, RTC_ALARM_DONT_CARE);
 	rtc_out(RTC_REG_B, rtc_in(RTC_REG_B) | REG_B_AIE);
 
+	printf("Current value of MSR is 0x%x\nValue after S3 resume is ", (int)rdmsr(0x3a));
 	*(volatile int*)0 = 0;
 	asm volatile("outw %0, %1" :: "a"((short)0x2400), "d"((short)fadt->pm1a_cnt_blk):"memory");
 	while(1)
@@ -75,6 +77,13 @@ asm (
 	".global resume_end\n"
 	".code16\n"
 	"resume_start:\n"
+	"mov $0x3a, %ecx\n"
+	"rdmsr\n"
+	"mov $0x3f8, %dx\n"
+	"add $0x30, %al\n"
+	"out %al, %dx\n"
+	"mov $0xa, %al\n"
+	"out %al, %dx\n"
 	"mov 0x0, %eax\n"
 	"mov $0xf4, %dx\n"
 	"out %eax, %dx\n"
- Compiled the latest SeaBIOS
- Applied the QEMU patches for LMCE
- ran the following:
../qemu/+build/x86_64-softmmu/qemu-system-x86_64 -display none \
    --enable-kvm -m 512 -serial mon:stdio \
    -cpu host,+vmx -kernel ./x86/s3.flat \
    -global PIIX4_PM.disable_s3=0 -device isa-debug-exit,iobase=0xf4
and my output is:
enabling apic
FACS is at 0x1ffe0000
resume vector addr is 0x1ffe000c
copy resume code from 0x400340
PM1a event registers at 600
Current value of MSR is 0x5
Value after S3 resume is 0
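To spell out how the last output line is produced: the real-mode stub in the patch takes AL after rdmsr, adds 0x30 and writes the byte to port 0x3f8, so it can only show a single decimal digit of the low byte. A minimal C sketch of that conversion follows; the function name stub_serial_char is made up for illustration and is not part of kvm-unit-tests.

```c
#include <stdint.h>

/*
 * Model of the resume stub's output step (illustrative name, not from
 * kvm-unit-tests): AL holds the low byte of the MSR value after rdmsr,
 * "add $0x30, %al" turns a value 0..9 into its ASCII digit, and the
 * result is the byte written to the serial port at 0x3f8.
 */
static char stub_serial_char(uint64_t msr_value)
{
    uint8_t al = (uint8_t)(msr_value & 0xff);  /* low 8 bits of EAX */
    return (char)(uint8_t)(al + 0x30);         /* 8-bit add, wraps like AL */
}
```

With this model, a value of 0x5 would print '5' and a cleared MSR prints '0', which matches the output above. Values whose low byte is above 9 would print some other character, so the stub is only good for this kind of quick check.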
I also tried using BITS from https://biosbits.org/downloads/bits-2073.zip with OVMF:
>>> import bits
>>> cpu = bits.cpus()[0]
>>> bits.rdmsr(cpu, 0x3a)
0L
>>> bits.wrmsr(cpu, 0x3a, 5)
True
>>> bits.rdmsr(cpu, 0x3a)
5L
After a full system reset (I don't know if BITS can do S3! :)) the rdmsr gave zero again. Tracing KVM confirmed that the OVMF I used doesn't touch the MSR.
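For reference, the value 0x5 used in these tests is the lock bit (bit 0) plus the "VMXON outside SMX" enable (bit 2) of IA32_FEATURE_CONTROL; the LMCE enable is bit 20. A small C sketch of the decoding, where the macro and function names are my own and are not copied from any header:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions of IA32_FEATURE_CONTROL (MSR 0x3a); names are illustrative. */
#define FC_LOCKED             (1ULL << 0)
#define FC_VMXON_INSIDE_SMX   (1ULL << 1)
#define FC_VMXON_OUTSIDE_SMX  (1ULL << 2)
#define FC_LMCE               (1ULL << 20)

/* VMXON outside SMX needs both the enable bit and the lock bit set. */
static bool vmxon_outside_smx_usable(uint64_t feature_control)
{
    return (feature_control & FC_LOCKED) &&
           (feature_control & FC_VMXON_OUTSIDE_SMX);
}
```

So 0x5 is what firmware would typically write to enable VMX and lock the MSR, and a value of 0 after reset means firmware has not written it yet.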
> I checked kvm/next (currently at 196f20ca52e8c7281932663c348fa54b82d03914), and vmx_vcpu_reset() does not seem to zero vmx->msr_ia32_feature_control.
This is true, but QEMU does zero it.
> Now, what I absolutely can't tell you is whether this zeroing should happen regardless of "init_event", or just for a specific value of "init_event". Whenever I look at "init_event", I have to track it down to the commit that added it, then locate all the commits that fixed it, then guess whether the SDM language "logical processor reset" implies a specific "init_event" value or not. So, I have no idea about "init_event".
It should be preserved by INIT, but not by reset or S3.
> So I need to know whether those INIT-SIPI-SIPI sequences are supposed to clear the MSR -- in other words, whether I have to write patches that explicitly sustain the MSR across these IPIs.
No, INIT hardly changes any MSR.
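To summarize the expected behaviour as a toy model (this is only a sketch of the semantics described above, not KVM or QEMU code; the enum and function names are invented for illustration):

```c
#include <stdint.h>

/* Events that can hit a logical processor; names are illustrative. */
enum cpu_event {
    EVENT_INIT,       /* INIT IPI, e.g. as part of INIT-SIPI-SIPI */
    EVENT_RESET,      /* full system reset */
    EVENT_S3_RESUME   /* wakeup from S3 */
};

/* Value of IA32_FEATURE_CONTROL after the given event. */
static uint64_t feature_control_after(uint64_t current, enum cpu_event ev)
{
    switch (ev) {
    case EVENT_INIT:
        return current;  /* preserved by INIT */
    case EVENT_RESET:
    case EVENT_S3_RESUME:
        return 0;        /* cleared, firmware has to set it up again */
    }
    return current;      /* not reached for valid enum values */
}
```

In this model, firmware that sets the MSR to 0x5 on an AP does not need to restore it after INIT-SIPI-SIPI, but it does need to write it again on the S3 resume path.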
Paolo