On 01/04/18 15:29, Vitaly Kuznetsov wrote:
Laszlo Ersek <lersek@redhat.com> writes:
In fact, the only writew() that needs patching is the one in vp_notify(); when I replace it with 'asm volatile', everything works.
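For readers following along, the two ways of issuing the 16-bit notify write can be sketched roughly as below. This is only an illustration: notify_plain() and notify_asm() are hypothetical names, the exact 'asm volatile' statement used in the experiment is not shown in this thread, and in a test an ordinary variable would stand in for the MMIO register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain volatile store, in the spirit of a writew()-style helper.
 * The compiler is free to choose the addressing mode of the MOV. */
static inline void notify_plain(volatile uint16_t *addr, uint16_t val)
{
    assert(addr != NULL);
    *addr = val;
}

/* Hypothetical inline-asm variant: the store is spelled out as a single
 * 16-bit MOV, although the compiler still picks the memory operand form. */
static inline void notify_asm(volatile uint16_t *addr, uint16_t val)
{
    assert(addr != NULL);
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("movw %1, %0" : "=m"(*addr) : "r"(val) : "memory");
#else
    *addr = val; /* fallback so the sketch still builds on non-x86 hosts */
#endif
}
```

Both variants perform one 16-bit store; only the instruction encoding that reaches the emulator may differ.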
- Does it make a difference if you disable EPT in the L1 KVM configuration? (EPT is probably primarily controlled by the CPU features exposed by L0 Hyper-V, and secondarily by the "ept" parameter of the "kvm_intel" module in L1.)
Asking about EPT because the virtio rings and descriptors are in RAM, accessing which in L2 should "normally" never trap to L1/L0. However (I *guess*), when those pages are accessed for the very first time in L2, they likely do trap, and then the EPT setting in L1 might make a difference.
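To check which EPT setting L1 is actually running with, the module parameter can be read back from sysfs. A minimal sketch, assuming the usual Linux location /sys/module/kvm_intel/parameters/ept and the 'Y'/'N' text format of boolean module parameters; the function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Interpret the text of a boolean module parameter.
 * Returns 1 for enabled, 0 for disabled, -1 for anything else. */
static int parse_bool_param(const char *text)
{
    assert(text != NULL);
    switch (text[0]) {
    case 'Y': case 'y': case '1':
        return 1;
    case 'N': case 'n': case '0':
        return 0;
    default:
        return -1;
    }
}

/* Returns 1 if kvm_intel reports EPT enabled, 0 if disabled,
 * -1 if the parameter cannot be read (e.g. module not loaded). */
static int kvm_intel_ept_enabled(void)
{
    char buf[8] = "";
    FILE *f = fopen("/sys/module/kvm_intel/parameters/ept", "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof buf, f) == NULL)
        buf[0] = '\0';
    fclose(f);
    return parse_bool_param(buf);
}
```

Note that this only reports the L1 module setting; whether EPT is usable at all still depends on what L0 exposes.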
Disabling EPT helps!
OK...
I also tried tracing L1 KVM and the difference between working and non-working cases seems to be:
- Working:
...
<...>-51387 [014] 64765.695019: kvm_page_fault: address fe007000 error_code 182
<...>-51387 [014] 64765.695024: kvm_emulate_insn: 0:eca87: 66 89 14 30
<...>-51387 [014] 64765.695026: vcpu_match_mmio: gva 0xfe007000 gpa 0xfe007000 Write GPA
<...>-51387 [014] 64765.695026: kvm_mmio: mmio write len 2 gpa 0xfe007000 val 0x0
<...>-51387 [014] 64765.695033: kvm_entry: vcpu 0
<...>-51387 [014] 64765.695042: kvm_exit: reason EPT_VIOLATION rip 0xeae17 info 181 306
<...>-51387 [014] 64765.695043: kvm_page_fault: address f0694 error_code 181
<...>-51387 [014] 64765.695044: kvm_entry: vcpu 0
...
- Broken:
...
<...>-38071 [014] 63385.241117: kvm_page_fault: address fe007000 error_code 182
<...>-38071 [014] 63385.241121: kvm_emulate_insn: 0:ecffb: 66 89 06
<...>-38071 [014] 63385.241123: vcpu_match_mmio: gva 0xfe007000 gpa 0xfe007000 Write GPA
<...>-38071 [014] 63385.241124: kvm_mmio: mmio write len 2 gpa 0xfe007000 val 0x0
<...>-38071 [014] 63385.241143: kvm_entry: vcpu 0
<...>-38071 [014] 63385.241162: kvm_exit: reason EXTERNAL_INTERRUPT rip 0xecffe info 0 800000f6
<...>-38071 [014] 63385.241162: kvm_entry: vcpu 0
...
The 'kvm_emulate_insn' difference is actually the different versions of 'mov' we get with the current code and with my 'asm volatile' version. What makes me wonder is where the 'EXTERNAL_INTERRUPT' (only seen in the broken version) comes from.
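As a side note on reading the traces: assuming the 'error_code' values in kvm_page_fault are the raw VMX exit qualification printed in hex (this depends on the L1 kernel version), 182 and 181 can be decoded with the bit layout the Intel SDM gives for EPT violations. The macro and function names below are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* EPT-violation exit qualification bits (Intel SDM, Vol. 3). */
#define EPT_Q_READ      (1u << 0)  /* access was a data read */
#define EPT_Q_WRITE     (1u << 1)  /* access was a data write */
#define EPT_Q_FETCH     (1u << 2)  /* access was an instruction fetch */
#define EPT_Q_READABLE  (1u << 3)  /* mapping allowed reads */
#define EPT_Q_WRITABLE  (1u << 4)  /* mapping allowed writes */
#define EPT_Q_EXECUTABLE (1u << 5) /* mapping allowed execution */
#define EPT_Q_GLA_VALID (1u << 7)  /* guest linear address is valid */
#define EPT_Q_GLA_FINAL (1u << 8)  /* fault on final translation, not a page walk */

/* Write a short human-readable summary of qualification q into out. */
static void describe_ept_qual(uint32_t q, char *out, size_t n)
{
    assert(out != NULL && n > 0);
    snprintf(out, n, "%s%s%s%s%s%s",
             (q & EPT_Q_READ) ? "read " : "",
             (q & EPT_Q_WRITE) ? "write " : "",
             (q & EPT_Q_FETCH) ? "fetch " : "",
             (q & (EPT_Q_READABLE | EPT_Q_WRITABLE | EPT_Q_EXECUTABLE))
                 ? "mapped " : "not-mapped ",
             (q & EPT_Q_GLA_VALID) ? "gla-valid " : "",
             (q & EPT_Q_GLA_FINAL) ? "final-translation" : "");
}
```

Under that assumption, 0x182 is a write to a not-yet-mapped page (the notify register at 0xfe007000) and 0x181 is a read of a not-yet-mapped page (0xf0694).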
I don't think said interrupt matters. I also don't think the MOV differences matter; after all, in both cases we end up with the identical
  vcpu_match_mmio: gva 0xfe007000 gpa 0xfe007000 Write GPA
  kvm_mmio: mmio write len 2 gpa 0xfe007000 val 0x0
sequence.
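To make the "MOV differences" concrete, here is a tiny decoder for just the two byte sequences from the kvm_emulate_insn lines. It is a sketch under stated assumptions: the code is taken to run in a 32-bit code segment (so the 0x66 prefix selects a 16-bit operand), only the displacement-free forms of opcode 0x89 are handled, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mov_store {
    int opsize;  /* operand size in bytes */
    int length;  /* instruction length in bytes */
    int reg;     /* source register number (0 = AX/EAX, 2 = DX/EDX, ...) */
    int base;    /* base register number (0 = EAX, 6 = ESI, ...) */
    int index;   /* index register number, or -1 if none */
    int scale;   /* index scale factor */
};

/* Decode "[66] 89 /r" with mod == 00 and no displacement.
 * Returns 0 on success, -1 for any form this sketch does not handle. */
static int decode_mov_store(const uint8_t *p, size_t len, struct mov_store *m)
{
    size_t i = 0;

    assert(p != NULL && m != NULL);
    m->opsize = 4;
    if (i < len && p[i] == 0x66) {      /* operand-size override prefix */
        m->opsize = 2;
        i++;
    }
    if (i + 2 > len || p[i] != 0x89)    /* MOV r/m, r */
        return -1;
    i++;

    uint8_t modrm = p[i++];
    uint8_t mod = modrm >> 6, rm = modrm & 7;
    m->reg = (modrm >> 3) & 7;
    if (mod != 0 || rm == 5)            /* displacement forms not handled */
        return -1;

    m->base = rm;
    m->index = -1;
    m->scale = 1;
    if (rm == 4) {                      /* SIB byte follows */
        if (i >= len)
            return -1;
        uint8_t sib = p[i++];
        m->scale = 1 << (sib >> 6);
        m->index = (sib >> 3) & 7;
        m->base = sib & 7;
        if (m->base == 5)               /* disp32 base not handled */
            return -1;
        if (m->index == 4)              /* encoding for "no index" */
            m->index = -1;
    }
    m->length = (int)i;
    return 0;
}
```

With these assumptions, "66 89 14 30" decodes as a 2-byte store of DX to (EAX + ESI), and "66 89 06" as a 2-byte store of AX to (ESI). Both are 16-bit writes, which matches the "mmio write len 2" lines in both traces.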
Here's another random idea:
I'll admit that I have no clue how SeaBIOS uses SMM, but I found an earlier email from Paolo <886757208.6870637.1484133921200.JavaMail.zimbra@redhat.com> where he wrote, "the main reason for it [i.e., SMM], is that it provides a safer way to access a PCI device's memory BARs". (SeaBIOS commit 55215cd425d36 seems to give some background.)
And that kind of access is what vp_notify()/writew() does, and I see "call32_smm" / "handle_smi" log entries in your thread starter, intermixed with "vp notify".
Down-stream we disabled SMM in SeaBIOS because we deemed the additional safety (see above) unnecessary for our limited BIOS service use cases (=mostly grub), while SMM caused obscure problems:
- https://bugzilla.redhat.com/show_bug.cgi?id=1378006
- https://bugzilla.redhat.com/show_bug.cgi?id=1425516
So... can you rebuild SeaBIOS with "CONFIG_USE_SMM=n"?
(If you originally encountered the strange behavior with downstream SeaBIOS, which already has CONFIG_USE_SMM=n, then please ignore...)
Thanks, Laszlo