Hi Kevin,
Yep, we're seeing this in practice (more on that below). However, you are absolutely right about call using a displacement (and I didn't know that!):
e8 c1 f5 ff ff call ed161 <pci_config_writel>
Right now I don't see anything like that in the disassembly, but we're still kind of one compiler-generated absolute jump/call/fetch away from this happening.
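To make the displacement point concrete, here is a small sketch that decodes the call bytes quoted above. The call-site address 0xEDB9B is not taken from the disassembly; it is inferred from the displacement and the target shown, and the 0xFFF00000 offset is the high-ROM location mentioned further down in this thread. Both should be treated as assumptions.

```python
import struct

def call_rel32_target(insn_addr: int, insn: bytes) -> int:
    """Return the target of a 5-byte `call rel32` (opcode 0xE8) placed at insn_addr."""
    if len(insn) != 5 or insn[0] != 0xE8:
        raise ValueError("not a call rel32 instruction")
    # The operand is a signed 32-bit little-endian displacement.
    (disp,) = struct.unpack("<i", insn[1:])
    # The displacement is relative to the address of the *next* instruction.
    next_insn = insn_addr + len(insn)
    return (next_insn + disp) & 0xFFFFFFFF

CALL_BYTES = bytes.fromhex("e8c1f5ffff")  # bytes from the disassembly line above
LOW_SITE = 0xEDB9B                        # assumption: inferred call-site address
HIGH_OFFSET = 0xFFF00000                  # assumption: offset of the high ROM copy

low_target = call_rel32_target(LOW_SITE, CALL_BYTES)
high_target = call_rel32_target(LOW_SITE + HIGH_OFFSET, CALL_BYTES)

print(hex(low_target))   # target when the code runs from low memory
print(hex(high_target))  # target when the same bytes run from the high copy
```

The same five bytes resolve to a target shifted by the same offset as the caller, which is why a relative call made from the high copy stays in the high copy. An absolute call or data fetch would not get this for free.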
Anyway, answering your question, yes we have reproduced a case where we are executing junk after returning from pci_config_writel emulation.
However, the actual case is somewhat more complicated, and I would appreciate your advice on how to fix it.
I think we have a problem right now if, while emulating the first pci config write in __make_bios_writeable_intel, we decide to issue a qmp system_reset. Based on what I could gather from crashed instances, we have something like this:
1. seabios issues pci_config_writel to reprogram 0xF0000-0xFFFFF and 0xC0000-0xD7FFF. The VCPU thread exits into qemu usermode to emulate this.
2. At this time we issue system_reset through qmp. When the VCPU thread decides to return to KVM, it releases the BQL; the main event loop sees the system_reset, stops the cpus and calls the reset handlers. So, we're doing a soft reset.
3. Q35 ICH9 reset emulation does not reset the PAMs to their default values.
4. Upon re-entering the reset vector, seabios checks PAM0 & 0x10, decides that we have enabled ram previously, and does not jump to __make_bios_writeable_intel in high memory, relying on the code already being in low memory. However, this assumption only holds true for 0xD8000-0xEFFFF, because we didn't get a chance to reprogram those PAMs during the previous run. We also didn't memcpy anything previously, so we may now execute junk from the F-segment.
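The four steps above can be sketched as a toy model. This is not seabios or qemu code: the class, the function names, the 0x30 value written to PAM0 and the "junk"/"bios" markers are all hypothetical, chosen only to show how a reset that keeps PAM0 interacts with the PAM0 & 0x10 check.

```python
class Machine:
    """Toy model of the F-segment: PAM0 picks between ROM and shadow RAM."""

    def __init__(self, reset_clears_pam: bool):
        self.reset_clears_pam = reset_clears_pam
        self.pam0 = 0x00         # assumption: default routes F-segment reads to ROM
        self.fseg_ram = "junk"   # shadow RAM holds nothing useful until memcpy

    def fseg_contents(self) -> str:
        # Bit 0x10 set: reads of the F-segment hit shadow RAM, otherwise ROM.
        return self.fseg_ram if self.pam0 & 0x10 else "bios"

    def system_reset(self) -> None:
        # Step 3: the reported behaviour is that PAMs survive the reset.
        if self.reset_clears_pam:
            self.pam0 = 0x00

def boot(m: Machine, reset_after_first_write: bool = False) -> str:
    """Return what the CPU ends up executing from the F-segment."""
    if m.pam0 & 0x10:
        # Step 4: RAM looks enabled, so skip the high-memory path and the copy.
        return m.fseg_contents()
    m.pam0 = 0x30                # step 1: first config write (hypothetical value)
    if reset_after_first_write:
        m.system_reset()         # step 2: reset lands before the memcpy
        return boot(m)
    m.fseg_ram = "bios"          # remaining writes plus memcpy of the bios image
    return m.fseg_contents()

print(boot(Machine(reset_clears_pam=False), reset_after_first_write=True))
print(boot(Machine(reset_clears_pam=True), reset_after_first_write=True))
```

In this model the stale PAM0 is what makes the second boot skip the copy. Whether the right fix is on the reset side or in the seabios check is exactly the open question.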
What do you think about this scenario? I would be happy to fix this, but I would appreciate your advice on how to proceed and on whether all of my assumptions are correct.
Also, maybe the original fix is still useful, since with it we would no longer rely on all jumps/calls being relative?
-Evgeny
On 13.12.2018 19:42, Kevin O'Connor wrote:
On Wed, Dec 12, 2018 at 04:45:08PM +0300, Evgeny Yakovlev wrote:
Currently make_bios_writable_intel will call __make_bios_writeable_intel from high rom memory by manually correcting its offset, to make sure that we safely execute it while overriding memory mapping through PAMs.
However, we may still call code in low memory when __make_bios_writeable_intel itself calls other code without manual pointer adjustments. Right now it calls pci_config_readl and pci_config_writel.
Consider this scenario:
0. Linker puts pci_config_writel in F-segment.
1. The first pci_config_writel is called to reprogram PAM0-3, which remaps regions 0xF0000-0xFFFFF and 0xC0000-0xD7FFF.
2. The second pci_config_writel is called to reprogram PAM4-7, but code in F-segment is no longer valid, including pci_config_writel.
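As a rough aid for the scenario above, this sketch maps PAM register indices to the address ranges they control. It assumes the usual layout where PAM0 covers the 64 KiB F-segment and PAM1 to PAM6 each cover two 16 KiB blocks starting at 0xC0000; the helper names are hypothetical, and the fourth byte of the second 32-bit write is not modelled.

```python
def pam_ranges(index: int) -> list[tuple[int, int]]:
    """Inclusive address ranges controlled by PAM register `index` (0..6)."""
    if not 0 <= index <= 6:
        raise ValueError("PAM index out of range")
    if index == 0:
        return [(0xF0000, 0xFFFFF)]          # the F-segment
    base = 0xC0000 + (index - 1) * 0x8000    # two 16 KiB blocks per register
    return [(base, base + 0x3FFF), (base + 0x4000, base + 0x7FFF)]

def merged_coverage(indices) -> list[tuple[int, int]]:
    """Merge adjacent ranges covered by a group of PAM registers."""
    spans = sorted(r for i in indices for r in pam_ranges(i))
    merged: list[tuple[int, int]] = []
    for start, end in spans:
        if merged and start == merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], end)  # extend the previous span
        else:
            merged.append((start, end))
    return merged

first_write = merged_coverage(range(0, 4))   # PAM0-3
second_write = merged_coverage(range(4, 7))  # PAM4-6

for name, spans in (("first", first_write), ("second", second_write)):
    print(name, [f"{s:#x}-{e:#x}" for s, e in spans])
```

Under these assumptions the F-segment is remapped by the very first write, so anything the second write needs from the F-segment is already affected.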
The x86 instruction set uses relative function calls by default. So, a call to pci_config_writel() calls the copy of that function also located in 0xFFF00000.
Are you seeing an error in practice? It's known that __make_bios_writeable_intel() is an ugly hack - it's there because qemu doesn't support "write back" mode of the pam registers. So the code needs to run at a different location when making that area writable. It is specific to qemu, so we only need it to run okay on qemu.
-Kevin