Kevin,
is it possible to save a few bytes, a pointer, across a reboot? I have tried to do this by allocating a memory chunk in the f-segment and storing the pointer there surrounded by two 'magic' 32-bit values. When trying to find the magic values early in handle_post() after a reboot, they don't seem to be there anymore. Is there another memory segment where SeaBIOS could store the few bytes and find them again?
Regards,
Stefan
On Tue, Jan 09, 2018 at 10:00:44AM -0500, Stefan Berger wrote:
Kevin,
is it possible to save a few bytes, a pointer, across a reboot? I have tried to do this by allocating a memory chunk in the f-segment and storing the pointer there surrounded by two 'magic' 32-bit values. When trying to find the magic values early in handle_post() after a reboot, they don't seem to be there anymore. Is there another memory segment where SeaBIOS could store the few bytes and find them again?
Didn't you have that implemented with the "Support Physical Presence Interface Spec" patches you made back in 2015?
Everything in low memory gets wiped out on a reboot. Any storage would have to be above 1M (or in a hardware register somewhere).
BTW, can we move this discussion onto one of the mailing lists?
-Kevin
On 01/09/2018 10:14 AM, Kevin O'Connor wrote:
On Tue, Jan 09, 2018 at 10:00:44AM -0500, Stefan Berger wrote:
Kevin,
is it possible to save a few bytes, a pointer, across a reboot? I have
tried to do this by allocating a memory chunk in the f-segment and storing the pointer there surrounded by two 'magic' 32-bit values. When trying to find the magic values early in handle_post() after a reboot, they don't seem to be there anymore. Is there another memory segment where SeaBIOS could store the few bytes and find them again?
Didn't you have that implemented with the "Support Physical Presence Interface Spec" patches you made back in 2015?
Yes. Back then the bytes shared between the BIOS and ACPI were located in an MMIO memory area of the TPM TIS, which was basically a hack to save the few bytes across a reboot. This time we are trying to embed these bytes in the ACPI stream, where they would be allocated similarly to the log area for the TPM. Besides that, there would be a QEMU ACPI table (with the name 'QEMU') from which to get the address of that memory area. An ACPI variable would also get that address and use it in the address field of OperationRegion(). This works fine. Once we reboot, the ACPI stream gets re-initialized and everything there is gone. However, if we can save that memory early on during boot and restore it to the expected location after ACPI has been re-done, this also works (I know this because I can test it with a hard-coded address where that shared memory ends up every time on my machine). The problem is just finding the address of the shared memory. A possibility would be to again abuse a device's memory area as before, to now hold only those 4 bytes...
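(For illustration only, a minimal sketch of the save/restore step described above; the address, size and function names are placeholders, not actual SeaBIOS code:)

/* Hypothetical sketch: stash the shared bytes before the ACPI tables are
 * rebuilt, and copy them back once ACPI setup has run again.
 * PPI_SHMEM_ADDR stands in for whatever address we end up finding. */
#define PPI_SHMEM_ADDR 0xffff0000   /* placeholder address */
#define PPI_SHMEM_SIZE 256

static u8 SavedPPI[PPI_SHMEM_SIZE];

static void ppi_save_early(void)     /* early in handle_post() */
{
    memcpy(SavedPPI, (void*)PPI_SHMEM_ADDR, PPI_SHMEM_SIZE);
}

static void ppi_restore_late(void)   /* after the ACPI tables are laid out again */
{
    memcpy((void*)PPI_SHMEM_ADDR, SavedPPI, PPI_SHMEM_SIZE);
}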
Another twist is that Intel's EDK2 also implements this but the data structure layout is different and they use SMM + SMIs etc.
https://github.com/tianocore/edk2/blob/master/SecurityPkg/Tcg/Tcg2Smm/Tpm.as...
QEMU would also be generating the ACPI for this UEFI, I suppose. So now who needs to adapt to whom? Can EDK2 be adapted to do something different, or should it remain as-is and SeaBIOS have to work similarly to EDK2? Unfortunately I don't know much about SMM / SMIs and how they work, or whether this can work from the OS when ACPI raises an SMI. Any opinions?
Everything in low memory gets wiped out on a reboot. Any storage would have to be above 1M (or in a hardware register somewhere).
BTW, can we move this discussion onto one of the mailing lists?
Sure. I had cc'ed the SeaBIOS mailing list this time.
Stefan
-Kevin
On Tue, Jan 09, 2018 at 02:02:52PM -0500, Stefan Berger wrote:
On 01/09/2018 10:14 AM, Kevin O'Connor wrote:
On Tue, Jan 09, 2018 at 10:00:44AM -0500, Stefan Berger wrote:
is it possible to save a few bytes, a pointer, across a reboot? I have
tried to do this by allocating a memory chunk in the f-segment and storing the pointer there surrounded by two 'magic' 32-bit values. When trying to find the magic values early in handle_post() after a reboot, they don't seem to be there anymore. Is there another memory segment where SeaBIOS could store the few bytes and find them again?
Didn't you have that implemented with the "Support Physical Presence Interface Spec" patches you made back in 2015?
Yes. Back then the bytes shared between the BIOS and ACPI were located in an MMIO memory area of the TPM TIS, which was basically a hack to save the few bytes across a reboot. This time we are trying to embed these bytes in the ACPI stream, where they would be allocated similarly to the log area for the TPM. Besides that, there would be a QEMU ACPI table (with the name 'QEMU') from which to get the address of that memory area. An ACPI variable would also get that address and use it in the address field of OperationRegion().
I'm a bit confused by the above. QEMU dynamically generates the ACPI tables today. So, why go through the hoops above - why not just directly generate the ACPI table with whatever info you need? If you need to have some storage across boots, why not create a virtual device and have the ACPI code read/write to that virtual address? As I understand it, this is already done for other devices in QEMU.
-Kevin
Stefan,
On 01/09/18 20:02, Stefan Berger wrote:
Another twist is that Intel's EDK2 also implements this but the data structure layout is different and they use SMM + SMIs etc.
https://github.com/tianocore/edk2/blob/master/SecurityPkg/Tcg/Tcg2Smm/Tpm.as...
As I described in my investigation linked from https://bugzilla.tianocore.org/show_bug.cgi?id=594#c5, we should not include the Tcg2Smm driver in OVMF, for TPM enablement -- at least for the short & mid terms.
What does the Tcg2Smm driver do? In section (2f), I described that the driver installs two tables, "TPM2" and an "SSDT".
- The TPM2 table from this driver is unneeded, since QEMU generates its own TPM2 table, which describes the TPM device's access method -- TIS+Cancel (method 6).
- The SSDT from the driver is again unneeded. It provides (via the _DSM method) an ACPI-level API that the OS can use, for talking to the TPM device. An implementation detail of this ACPI method is that it raises an SMI, for entering the firmware at an elevated privilege level (= in SMM). Then, the actual TPM hardware manipulation, or even the TPM *software emulation*, is performed by the firmware, in SMM.
This approach is totally ill-suited for the QEMU virtualization stack. For starters, none of the firmware code that would actually handle such ACPI->SMM requests exists -- as open source, anyway. Second, I'm sure we don't want to debug TPM software emulation running in SMM guest firmware rather than in an actual QEMU device model.
Once we have a real device model, accessed via IO ports and/or MMIO locations, perhaps in combination with request/response buffers allocated in guest RAM, the SMI/SMM implementation detail falls away completely. Our TPM emulation would attain its "privileged / protected" status simply by existing in the hypervisor (QEMU).
So here's what should be done:
- QEMU should implement the TPM device model, using TIS+Cancel (method 6) or CRB (method 7). These are collectively called "dTPM".
- QEMU should continue generating a TPM2 ACPI table, for describing one of the above access methods to the OS, as appropriate for the actual device model. (A rough sketch of the table layout follows after this list.)
- OVMF should include the following drivers from edk2, without changes:
  - Tcg2Pei/Tcg2Pei.inf
  - Tcg2Dxe/Tcg2Dxe.inf
- OVMF should include the following drivers from edk2,
  - either verbatim (if they work out like that),
  - or with small customizations (if the drivers themselves offer sufficiently flexible knobs),
  - or else as modules duplicated / rewritten under OvmfPkg,
  - or they might even turn out unnecessary:

  - Tcg2Config/Tcg2ConfigPei.inf
  - Tcg2Config/Tcg2ConfigDxe.inf
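(For reference, a rough sketch of the TPM2 table body as I read the TCG ACPI spec -- the struct below is illustrative, not taken from QEMU or edk2; check the spec for the authoritative layout:)

/* Body of the ACPI "TPM2" table, following the standard 36-byte
 * ACPI table header. Field names are illustrative only. */
struct tpm2_table_body {
    uint16_t platform_class;         /* 0 = client platform */
    uint16_t reserved;
    uint64_t control_area_address;   /* used by the CRB start method */
    uint32_t start_method;           /* 6 = TIS+Cancel, 7 = CRB */
    /* optional start-method-specific parameters may follow */
} __attribute__((packed));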
QEMU would also be generating the ACPI for this UEFI, I suppose. So now who needs to adapt to whom? Can EDK2 be adapted to do something different, or should it remain as-is and SeaBIOS have to work similarly to EDK2? Unfortunately I don't know much about SMM / SMIs and how they work, or whether this can work from the OS when ACPI raises an SMI. Any opinions?
To be honest, I don't understand SeaBIOS's role here (beyond executing the linker/loader script from QEMU). To my knowledge, SeaBIOS does not intend to be a TPM client. As far as I understand, only
- UEFI applications,
- and then the OS (UEFI-based, or traditional BIOS-based)
are expected to function as TPM clients.
Under the approach described near the top,
- UEFI clients (such as UEFI boot loaders) are satisfied by the inclusion of the "Tcg2Dxe/Tcg2Dxe.inf" driver in OVMF -- because said driver produces the EFI_TCG2_PROTOCOL;
- and the OS (regardless of UEFI or traditional BIOS) is satisfied by finding the TPM hardware description in the TPM2 table of QEMU, and then by talking to the TPM device model (implemented in QEMU) with its own native driver.
So... I'm missing the point of the thread starter message -- "Saving a few bytes across a reboot". Save them for what purpose?
BTW, from the "TCG PC Client Platform TPM Profile (PTP) Specification", it seems like the FIFO (TIS) interface is hard-coded *in the spec* at FED4_0000h – FED4_4FFFh. So we don't even have to make that dynamic.
Regarding CRB (as an alternative to TIS+Cancel), I'm trying to wrap my brain around the exact resources that the CRB interface requires. Marc-André, can you summarize those?
Thanks, Laszlo
Hi
----- Original Message -----
BTW, from the "TCG PC Client Platform TPM Profile (PTP) Specification", it seems like the FIFO (TIS) interface is hard-coded *in the spec* at FED4_0000h – FED4_4FFFh. So we don't even have to make that dynamic.
Regarding CRB (as an alternative to TIS+Cancel), I'm trying to wrap my brain around the exact resources that the CRB interface requires. Marc-André, can you summarize those?
The device is a relatively simple MMIO-only device on the sysbus: https://github.com/stefanberger/qemu-tpm/commit/2f9d06f93b285d4b39966a808675...
The region is registered at the same address as TIS (it's not entirely clear from the spec that it is supposed to be there, but my laptop's TPM uses the same). And it uses a size of 0x1000, although it's also unclear to me what the size of the command buffer should be (that size can also be defined at run-time now, iirc; I should adapt the code).
My experiments so far running some Windows tests indicate that for TPM2, CRB+UEFI is required (and I managed to get an ovmf build with TPM2 support). A few tests failed; it seems the "Physical Presence Interface" (PPI) is also required. I think that ACPI interface allows running TPM commands during reboot, by having the firmware take care of the security aspects. I think that's what Stefan is working on for SeaBIOS and the safe memory region (sorry, I haven't read the whole discussion, as I am not working on TPM atm)
thanks
On 01/10/18 16:19, Marc-André Lureau wrote:
Hi
----- Original Message -----
BTW, from the "TCG PC Client Platform TPM Profile (PTP) Specification", it seems like the FIFO (TIS) interface is hard-coded *in the spec* at FED4_0000h FED4_4FFFh. So we don't even have to make that dynamic.
Regarding CRB (as an alternative to TIS+Cancel), I'm trying to wrap my brain around the exact resources that the CRB interface requries. Marc-André, can you summarize those?
The device is a relatively simple MMIO-only device on the sysbus: https://github.com/stefanberger/qemu-tpm/commit/2f9d06f93b285d4b39966a808675...
The region is registered at the same address as TIS (it's not entirely clear from the spec that it is supposed to be there, but my laptop's TPM uses the same). And it uses a size of 0x1000, although it's also unclear to me what the size of the command buffer should be (that size can also be defined at run-time now, iirc; I should adapt the code).
Thank you -- so the "immediate" register block is in MMIO space, and (apparently) we can hard-code its physical address too.
My question is if we need to allocate guest RAM in addition to the register block, for the command buffer(s) that will transmit the requests/responses. I see the code you quote above says,
+    /* allocate ram in bios instead? */
+    memory_region_add_subregion(get_system_memory(),
+        TPM_CRB_ADDR_BASE + sizeof(struct crb_regs), &s->cmdmem);
... and AFAICS your commit message poses the exact same question :)
Option 1: If we have enough room in MMIO space above the register block at 0xFED40000, then we could simply dump the CRB there too.
Option 2: If not (or we want to avoid Option 1 for another reason), then the linker/loader script has to make the guest fw allocate RAM, write the allocation address to the TPM2 table with an ADD_POINTER command, and write the address back to QEMU with a WRITE_POINTER command. Is my understanding correct?
I wonder why we'd want to bother with Option 2, since we have to place the register block at a fixed MMIO address anyway.
(My understanding is that the guest has to populate the CRB, and then kick the hypervisor, so at least the register used for kicking must be in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space (for platform devices). Thus, the register block must reside at a QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
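(To make Option 2 concrete, here's a rough sketch of the QEMU side, modeled on the vmgenid device, which uses the same ADD_POINTER / WRITE_POINTER pattern. The fw_cfg file names are made up and the signatures are written from memory, so treat it as an illustration, not a patch:)

/* Sketch only; includes omitted, "etc/tpm/cmd-buf" and "etc/tpm/cmd-addr"
 * are invented file names. */
static void tpm_add_cmd_buf(GArray *cmd_buf_blob, BIOSLinker *linker,
                            unsigned tpm2_cmd_addr_offset)
{
    /* 1. have the guest fw allocate the command buffer blob in RAM */
    bios_linker_loader_alloc(linker, "etc/tpm/cmd-buf", cmd_buf_blob,
                             4096, false /* no need for the f-segment */);

    /* 2. patch the buffer's guest address into the TPM2 table (ADD_POINTER) */
    bios_linker_loader_add_pointer(linker, ACPI_BUILD_TABLE_FILE,
                                   tpm2_cmd_addr_offset, sizeof(uint64_t),
                                   "etc/tpm/cmd-buf", 0);

    /* 3. have the guest fw write the same address back to QEMU
     *    (WRITE_POINTER), so the device model learns where the buffer is */
    bios_linker_loader_write_pointer(linker, "etc/tpm/cmd-addr",
                                     0, sizeof(uint64_t),
                                     "etc/tpm/cmd-buf", 0);
}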
My experiments so far running some Windows tests indicate that for TPM2, CRB+UEFI is required (and I managed to get an ovmf build with TPM2 support).
Awesome!
A few tests failed; it seems the "Physical Presence Interface" (PPI) is also required.
Required for what goal, exactly?
I think that ACPI interface allows running TPM commands during reboot, by having the firmware take care of the security aspects.
Ugh :/ I mentioned those features in my earlier write-up, under points (2f2b) and (2f2c). I'm very unhappy about them. They are a *huge* mess for OVMF.
- They would require including (at least a large part of) the Tcg2Smm/Tcg2Smm.inf driver, with all the complications I described earlier as counter-arguments,
- they'd require including the MemoryOverwriteControl/TcgMor.inf driver,
- and they'd require some real difficult platform code in OVMF (e.g. PEI-phase access to non-volatile UEFI variables, which I've by now failed to upstream twice; PEI-phase access to all RAM; and more).
My personal opinion is that we should determine what goals require what TPM features, and then we should aim at a minimal set. If I understand correctly, PCRs and measurements already work (although the patches are not upstream yet) -- is that correct?
Personally I think the SSDT/_DSM-based features (TCG Hardware Information, TCG Memory Clear Interface, TCG Physical Presence Interface) are very much out of scope for "TPM Enablement".
I think that's what Stefan is working on for SeaBIOS and the safe memory region (sorry, I haven't read the whole discussion, as I am not working on TPM atm)
Yeah, with e.g. the "TCG Memory Clear Interface" feature pulled into the context -- from the "Platform Reset Attack Mitigation Specification" --, I do understand Stefan's question. Said feature is about the OS setting a flag in NVRAM, for the firmware to act upon, at next boot. "Saving a few bytes across a reboot" maps to that.
(And, as far as I understand this spec, it tells traditional BIOS implementors, "do whatever you want for implementing this NVRAM thingy", while to UEFI implementors, it says, "use exactly this and that non-volatile UEFI variable". Given this, I don't know how much commonality would be possible between SeaBIOS and OVMF.)
Similarly, about "TCG Physical Presence Interface" -- defined in the TCG Physical Presence Interface Specification --, I had written, "The OS can queue TPM operations (?) that require Physical Presence, and at next boot, [the firmware] would have to dispatch those pending operations."
That "queueing" maps to the same question (and NVRAM) again, yes.
Again, I'm unclear about any higher level goals / requirements here, but I think these "extras" from the Trusted Computing Group are way beyond TPM enablement.
Thanks Laszlo
On 01/10/2018 11:45 AM, Laszlo Ersek wrote:
On 01/10/18 16:19, Marc-André Lureau wrote:
Hi
----- Original Message -----
BTW, from the "TCG PC Client Platform TPM Profile (PTP) Specification", it seems like the FIFO (TIS) interface is hard-coded *in the spec* at FED4_0000h FED4_4FFFh. So we don't even have to make that dynamic.
Regarding CRB (as an alternative to TIS+Cancel), I'm trying to wrap my brain around the exact resources that the CRB interface requries. Marc-André, can you summarize those?
The device is a relatively simple MMIO-only device on the sysbus: https://github.com/stefanberger/qemu-tpm/commit/2f9d06f93b285d4b39966a808675...
The region is registered at the same address as TIS (it's not entirely clear from the spec that it is supposed to be there, but my laptop's TPM uses the same). And it uses a size of 0x1000, although it's also unclear to me what the size of the command buffer should be (that size can also be defined at run-time now, iirc; I should adapt the code).
Thank you -- so the "immediate" register block is in MMIO space, and (apparently) we can hard-code its physical address too.
My question is if we need to allocate guest RAM in addition to the register block, for the command buffer(s) that will transmit the requests/responses. I see the code you quote above says,
+    /* allocate ram in bios instead? */
+    memory_region_add_subregion(get_system_memory(),
+        TPM_CRB_ADDR_BASE + sizeof(struct crb_regs), &s->cmdmem);
... and AFAICS your commit message poses the exact same question :)
Option 1: If we have enough room in MMIO space above the register block at 0xFED40000, then we could simply dump the CRB there too.
Option 2: If not (or we want to avoid Option 1 for another reason), then the linker/loader script has to make the guest fw allocate RAM, write the allocation address to the TPM2 table with an ADD_POINTER command, and write the address back to QEMU with a WRITE_POINTER command. Is my understanding correct?
I wonder why we'd want to bother with Option 2, since we have to place the register block at a fixed MMIO address anyway.
(My understanding is that the guest has to populate the CRB, and then kick the hypervisor, so at least the register used for kicking must be in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space (for platform devices). Thus, the register block must reside at a QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
My experiments so far running some Windows tests indicate that for TPM2, CRB+UEFI is required (and I managed to get an ovmf build with TPM2 support).
Awesome!
A few tests failed; it seems the "Physical Presence Interface" (PPI) is also required.
Required for what goal, exactly?
I think that ACPI interface allows running TPM commands during reboot, by having the firmware take care of the security aspects.
Ugh :/ I mentioned those features in my earlier write-up, under points (2f2b) and (2f2c). I'm very unhappy about them. They are a *huge* mess for OVMF.
They would require including (at least a large part of) the Tcg2Smm/Tcg2Smm.inf driver, with all the complications I described earlier as counter-arguments,
they'd require including the MemoryOverwriteControl/TcgMor.inf driver,
and they'd require some real difficult platform code in OVMF (e.g. PEI-phase access to non-volatile UEFI variables, which I've by now failed to upstream twice; PEI-phase access to all RAM; and more).
My personal opinion is that we should determine what goals require what TPM features, and then we should aim at a minimal set. If I understand correctly, PCRs and measurements already work (although the patches are not upstream yet) -- is that correct?
Personally I think the SSDT/_DSM-based features (TCG Hardware Information, TCG Memory Clear Interface, TCG Physical Presence Interface) are very much out of scope for "TPM Enablement".
I think that's what Stefan is working on for SeaBIOS and the safe memory region (sorry, I haven't read the whole discussion, as I am not working on TPM atm)
Yeah, with e.g. the "TCG Memory Clear Interface" feature pulled into the context -- from the "Platform Reset Attack Mitigation Specification" --, I do understand Stefan's question. Said feature is about the OS setting a flag in NVRAM, for the firmware to act upon, at next boot. "Saving a few bytes across a reboot" maps to that.
I just posted the patches enabling a virtual memory device that helps save these few bytes across a reboot. I chose the same address as EDK2 does, 0xffff0000, in the hope that this address can be reserved for this purpose. It would be enabled for the TPM TIS and the CRB through a simple function call. I think it should be part of TPM enablement, at least to have this device, since it adds 256 bytes that would need to be saved for VM suspend. And I would like to get to supporting suspend/resume with the TPM TIS and an external device, so it should be there before we do that.
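(Roughly what I mean, as a sketch; the type and function names below are placeholders rather than the actual patch:)

/* Sketch: a plain RAM-backed region at a fixed guest-physical address;
 * its 256 bytes survive guest reboots and migrate like any other RAM.
 * TPMPPIExample and its fields are placeholders. */
#define TPM_PPI_ADDR_BASE 0xffff0000
#define TPM_PPI_ADDR_SIZE 0x100

static void tpm_ppi_init(TPMPPIExample *s, MemoryRegion *sysmem)
{
    memory_region_init_ram(&s->ram, NULL, "tpm-ppi",
                           TPM_PPI_ADDR_SIZE, &error_fatal);
    memory_region_add_subregion(sysmem, TPM_PPI_ADDR_BASE, &s->ram);
}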
(And, as far as I understand this spec, it tells traditional BIOS implementors, "do whatever you want for implementing this NVRAM thingy", while to UEFI implementors, it says, "use exactly this and that non-volatile UEFI variable". Given this, I don't know how much commonality would be possible between SeaBIOS and OVMF.) Similarly, about "TCG Physical Presence Interface" -- defined in the TCG Physical Presence Interface Specification --, I had written, "The OS can queue TPM operations (?) that require Physical Presence, and at next boot, [the firmware] would have to dispatch those pending operations."
That "queueing" maps to the same question (and NVRAM) again, yes.
The spec describes the ACPI interface but not the layout of the shared memory between ACPI and firmware. This is not a problem if the vendor of the firmware supplies both the ACPI code and the firmware code, which they supposedly do. In QEMU's case it's a bit different. I of course looked at EDK2 and adapted my ACPI code (and SeaBIOS code) to at least support the same layout of the shared memory, hoping that this would allow the EDK2 C code to work with it. Not sure what is better, following their layout or inventing my own (and being incompatible on purpose...)
Again, I'm unclear about any higher level goals / requirements here, but I think these "extras" from the Trusted Computing Group are way beyond TPM enablement.
See above why I think we should at least have the virtual memory device...
Thanks Laszlo
On 01/10/2018 10:19 AM, Marc-André Lureau wrote:
Hi
----- Original Message -----
BTW, from the "TCG PC Client Platform TPM Profile (PTP) Specification", it seems like the FIFO (TIS) interface is hard-coded *in the spec* at FED4_0000h – FED4_4FFFh. So we don't even have to make that dynamic.
Regarding CRB (as an alternative to TIS+Cancel), I'm trying to wrap my brain around the exact resources that the CRB interface requires. Marc-André, can you summarize those?
The device is a relatively simple MMIO-only device on the sysbus: https://github.com/stefanberger/qemu-tpm/commit/2f9d06f93b285d4b39966a808675...
The region is registered at the same address as TIS (it's not entirely clear from the spec that it is supposed to be there, but my laptop's TPM uses the same). And it uses a size of 0x1000, although it's also unclear to me what the size of the command buffer should be (that size can also be defined at run-time now, iirc; I should adapt the code).
In the PTP spec, page 99: the I/O buffer is located at offsets 0x80 - 0xfff. This gives us a maximum of 3968 bytes. That's what you seem to be implementing.
https://www.trustedcomputinggroup.org/wp-content/uploads/PCClientPlatform-TP...
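(In other words, as a quick sanity check of the numbers:)

/* the data buffer occupies offsets 0x80..0xFFF of the 0x1000-byte region */
#define CRB_DATA_BUF_OFFSET 0x80
#define CRB_DATA_BUF_SIZE   (0x1000 - CRB_DATA_BUF_OFFSET)   /* 0xF80 = 3968 bytes */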
You are already calling:
tpm_backend_startup_tpm(s->tpmbe, CRB_CTRL_CMD_SIZE);
What you may want to do is something like the TIS:
s->be_buffer_size = MIN(tpm_backend_get_buffer_size(s->be_driver), CRB_CTRL_CMD_SIZE);
[...]
tpm_backend_startup_tpm(s->tpmbe, s->be_buffer_size);
My experiments so far running some Windows tests indicate that for TPM2, CRB+UEFI is required (and I managed to get an ovmf build with TPM2 support). A few tests failed; it seems the "Physical Presence Interface" (PPI) is also required. I think that ACPI interface allows running TPM commands during reboot, by having the firmware take care of the security aspects. I think that's what Stefan is working on for SeaBIOS and the safe memory region (sorry, I haven't read the whole discussion, as I am not working on TPM atm)
I am working on the PPI thing.
thanks
On 01/10/18 19:45, Stefan Berger wrote:
On 01/10/2018 11:45 AM, Laszlo Ersek wrote:
On 01/10/18 16:19, Marc-André Lureau wrote:
Hi
----- Original Message -----
BTW, from the "TCG PC Client Platform TPM Profile (PTP) Specification", it seems like the FIFO (TIS) interface is hard-coded *in the spec* at FED4_0000h FED4_4FFFh. So we don't even have to make that dynamic.
Regarding CRB (as an alternative to TIS+Cancel), I'm trying to wrap my brain around the exact resources that the CRB interface requries. Marc-André, can you summarize those?
The device is a relatively simple MMIO-only device on the sysbus: https://github.com/stefanberger/qemu-tpm/commit/2f9d06f93b285d4b39966a808675...
The region is registered at the same address as TIS (it's not entirely clear from the spec that it is supposed to be there, but my laptop's TPM uses the same). And it uses a size of 0x1000, although it's also unclear to me what the size of the command buffer should be (that size can also be defined at run-time now, iirc; I should adapt the code).
Thank you -- so the "immediate" register block is in MMIO space, and (apparently) we can hard-code its physical address too.
My question is if we need to allocate guest RAM in addition to the register block, for the command buffer(s) that will transmit the requests/responses. I see the code you quote above says,
+    /* allocate ram in bios instead? */
+    memory_region_add_subregion(get_system_memory(),
+        TPM_CRB_ADDR_BASE + sizeof(struct crb_regs), &s->cmdmem);
... and AFAICS your commit message poses the exact same question :)
Option 1: If we have enough room in MMIO space above the register block at 0xFED40000, then we could simply dump the CRB there too.
Option 2: If not (or we want to avoid Option 1 for another reason), then the linker/loader script has to make the guest fw allocate RAM, write the allocation address to the TPM2 table with an ADD_POINTER command, and write the address back to QEMU with a WRITE_POINTER command. Is my understanding correct?
I wonder why we'd want to bother with Option 2, since we have to place the register block at a fixed MMIO address anyway.
(My understanding is that the guest has to populate the CRB, and then kick the hypervisor, so at least the register used for kicking must be in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space (for platform devices). Thus, the register block must reside at a QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
My experiments so far running some Windows tests indicate that for TPM2, CRB+UEFI is required (and I managed to get an ovmf build with TPM2 support).
Awesome!
A few tests failed; it seems the "Physical Presence Interface" (PPI) is also required.
Required for what goal, exactly?
I think that ACPI interface allows running TPM commands during reboot, by having the firmware take care of the security aspects.
Ugh :/ I mentioned those features in my earlier write-up, under points (2f2b) and (2f2c). I'm very unhappy about them. They are a *huge* mess for OVMF.
- They would require including (at least a large part of) the
Tcg2Smm/Tcg2Smm.inf driver, with all the complications I described earlier as counter-arguments,
they'd require including the MemoryOverwriteControl/TcgMor.inf driver,
and they'd require some real difficult platform code in OVMF (e.g.
PEI-phase access to non-volatile UEFI variables, which I've by now failed to upstream twice; PEI-phase access to all RAM; and more).
My personal opinion is that we should determine what goals require what TPM features, and then we should aim at a minimal set. If I understand correctly, PCRs and measurements already work (although the patches are not upstream yet) -- is that correct?
Personally I think the SSDT/_DSM-based features (TCG Hardware Information, TCG Memory Clear Interface, TCG Physical Presence Interface) are very much out of scope for "TPM Enablement".
I think that's what Stefan is working on for SeaBIOS and the safe memory region (sorry, I haven't read the whole discussion, as I am not working on TPM atm)
Yeah, with e.g. the "TCG Memory Clear Interface" feature pulled into the context -- from the "Platform Reset Attack Mitigation Specification" --, I do understand Stefan's question. Said feature is about the OS setting a flag in NVRAM, for the firmware to act upon, at next boot. "Saving a few bytes across a reboot" maps to that.
I just posted the patches enabling a virtual memory device that helps save these few bytes across a reboot. I chose the same address as EDK2 does, 0xffff0000, in the hope that this address can be reserved for this purpose. It would be enabled for the TPM TIS and the CRB through a simple function call. I think it should be part of TPM enablement, at least to have this device, since it adds 256 bytes that would need to be saved for VM suspend. And I would like to get to supporting suspend/resume with the TPM TIS and an external device, so it should be there before we do that.
(And, as far as I understand this spec, it tells traditional BIOS implementors, "do whatever you want for implementing this NVRAM thingy", while to UEFI implementors, it says, "use exactly this and that non-volatile UEFI variable". Given this, I don't know how much commonality would be possible between SeaBIOS and OVMF.) Similarly, about "TCG Physical Presence Interface" -- defined in the TCG Physical Presence Interface Specification --, I had written, "The OS can queue TPM operations (?) that require Physical Presence, and at next boot, [the firmware] would have to dispatch those pending operations."
That "queueing" maps to the same question (and NVRAM) again, yes.
The spec describes the ACPI interface but not the layout of the shared memory between ACPI and firmware. This is not a problem if the vendor of the firmware supplies both the ACPI code and the firmware code, which they supposedly do. In QEMU's case it's a bit different. I of course looked at EDK2 and adapted my ACPI code (and SeaBIOS code) to at least support the same layout of the shared memory, hoping that this would allow the EDK2 C code to work with it. Not sure what is better, following their layout or inventing my own (and being incompatible on purpose...)
Again, I'm unclear about any higher level goals / requirements here, but I think these "extras" from the Trusted Computing Group are way beyond TPM enablement.
See above why I think we should at least have the virtual memory device...
I must say I don't yet know enough fine details about the edk2 stuff to confirm 100% that the above is future-proof (for edk2), but it certainly looks helpful to me (same address / structure, and leaving out SMM). Thanks for that!
Laszlo
On Wed, 10 Jan 2018 17:45:52 +0100 Laszlo Ersek lersek@redhat.com wrote:
On 01/10/18 16:19, Marc-André Lureau wrote:
Hi
----- Original Message -----
BTW, from the "TCG PC Client Platform TPM Profile (PTP) Specification", it seems like the FIFO (TIS) interface is hard-coded *in the spec* at FED4_0000h FED4_4FFFh. So we don't even have to make that dynamic.
Regarding CRB (as an alternative to TIS+Cancel), I'm trying to wrap my brain around the exact resources that the CRB interface requries. Marc-André, can you summarize those?
The device is a relatively simple MMIO-only device on the sysbus: https://github.com/stefanberger/qemu-tpm/commit/2f9d06f93b285d4b39966a808675...
The region is registered at the same address as TIS (it's not entirely clear from the spec that it is supposed to be there, but my laptop's TPM uses the same). And it uses a size of 0x1000, although it's also unclear to me what the size of the command buffer should be (that size can also be defined at run-time now, iirc; I should adapt the code).
Thank you -- so the "immediate" register block is in MMIO space, and (apparently) we can hard-code its physical address too.
A fixed mapping is fine for real hardware, as systems tend to have a more or less fixed configuration and the fw is built specifically for the board in question. That isn't necessarily true for QEMU, though; that's the reason why we have fw_cfg and the like.
My question is if we need to allocate guest RAM in addition to the register block, for the command buffer(s) that will transmit the requests/responses. I see the code you quote above says,
+    /* allocate ram in bios instead? */
+    memory_region_add_subregion(get_system_memory(),
+        TPM_CRB_ADDR_BASE + sizeof(struct crb_regs), &s->cmdmem);
Michael used to reject any patches with explicitly mapped memory_regions (I recall nvdimm was trying to use something like this for the DSM buffer), since it's not migration friendly (in practice it's not possible to move the region when the need arises). With the linker approach it's guest-allocated memory: its location/address is migrated as part of the device state and doesn't require any memory layout changes on the QEMU side.
... and AFAICS your commit message poses the exact same question :)
Option 1: If we have enough room in MMIO space above the register block at 0xFED40000, then we could simply dump the CRB there too.
Option 2: If not (or we want to avoid Option 1 for another reason), then the linker/loader script has to make the guest fw allocate RAM, write the allocation address to the TPM2 table with an ADD_POINTER command, and write the address back to QEMU with a WRITE_POINTER command. Is my understanding correct?
I wonder why we'd want to bother with Option 2, since we have to place the register block at a fixed MMIO address anyway.
(My understanding is that the guest has to populate the CRB, and then kick the hypervisor, so at least the register used for kicking must be in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space (for platform devices). Thus, the register block must reside at a QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
MMIO doesn't have to be fixed, nor exist at all; we could use the linker's write-to-file operation in the FW for switching from guest to QEMU. That's obviously intrusive work for FW and QEMU compared to a hard-coded address in both QEMU and FW, but as a benefit, changes to QEMU and FW don't have to be tightly coupled and the layout could be changed whenever the need arises.
My experiments so far running some Windows tests indicate that for TPM2, CRB+UEFI is required (and I managed to get an ovmf build with TPM2 support).
Awesome!
A few tests failed; it seems the "Physical Presence Interface" (PPI) is also required.
Required for what goal, exactly?
I think that ACPI interface allows running TPM commands during reboot, by having the firmware take care of the security aspects.
Ugh :/ I mentioned those features in my earlier write-up, under points (2f2b) and (2f2c). I'm very unhappy about them. They are a *huge* mess for OVMF.
They would require including (at least a large part of) the Tcg2Smm/Tcg2Smm.inf driver, with all the complications I described earlier as counter-arguments,
they'd require including the MemoryOverwriteControl/TcgMor.inf driver,
and they'd require some real difficult platform code in OVMF (e.g. PEI-phase access to non-volatile UEFI variables, which I've by now failed to upstream twice; PEI-phase access to all RAM; and more).
My personal opinion is that we should determine what goals require what TPM features, and then we should aim at a minimal set. If I understand correctly, PCRs and measurements already work (although the patches are not upstream yet) -- is that correct?
Personally I think the SSDT/_DSM-based features (TCG Hardware Information, TCG Memory Clear Interface, TCG Physical Presence Interface) are very much out of scope for "TPM Enablement".
I think that's what Stefan is working on for SeaBIOS and the safe memory region (sorry, I haven't read the whole discussion, as I am not working on TPM atm)
Yeah, with e.g. the "TCG Memory Clear Interface" feature pulled into the context -- from the "Platform Reset Attack Mitigation Specification" --, I do understand Stefan's question. Said feature is about the OS setting a flag in NVRAM, for the firmware to act upon, at next boot. "Saving a few bytes across a reboot" maps to that.
(And, as far as I understand this spec, it tells traditional BIOS implementors, "do whatever you want for implementing this NVRAM thingy", while to UEFI implementors, it says, "use exactly this and that non-volatile UEFI variable". Given this, I don't know how much commonality would be possible between SeaBIOS and OVMF.)
Similarly, about "TCG Physical Presence Interface" -- defined in the TCG Physical Presence Interface Specification --, I had written, "The OS can queue TPM operations (?) that require Physical Presence, and at next boot, [the firmware] would have to dispatch those pending operations."
That "queueing" maps to the same question (and NVRAM) again, yes.
Again, I'm unclear about any higher level goals / requirements here, but I think these "extras" from the Trusted Computing Group are way beyond TPM enablement.
Thanks Laszlo
On 01/11/18 13:40, Igor Mammedov wrote:
On Wed, 10 Jan 2018 17:45:52 +0100 Laszlo Ersek lersek@redhat.com wrote:
(My understanding is that the guest has to populate the CRB, and then kick the hypervisor, so at least the register used for kicking must be in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space (for platform devices). Thus, the register block must reside at a QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
MMIO doesn't have to be fixed, nor exist at all; we could use the linker's write-to-file operation in the FW for switching from guest to QEMU. That's obviously intrusive work for FW and QEMU compared to a hard-coded address in both QEMU and FW, but as a benefit, changes to QEMU and FW don't have to be tightly coupled and the layout could be changed whenever the need arises.
Marc-André wrote, "The [CRB] region is registered at the same address as TIS (it's not entirely clear from the spec it is supposed to be there, but my laptop tpm use the same)."
And, the spec declares the register block at the fixed range FED4_0000h-FED4_4FFFh.
How about this:
(1) stick with the TPM specs and implement the TIS and/or CRB interfaces,
(2) *except* make the base address of the register block a compat property for the QEMU device,
(3) generate data tables (TPM2) and AML tables (SSDT/_DSM) that expose the device to the guest OS as ACPI or ACPI+CRB (i.e., "fTPM"), *not* TIS and/or CRB
(4) in the generated ACPI payload, adhere to the compat property (i.e., generate the base address values from the compat prop),
(5) expose the base address stand-alone in a new fw_cfg file as well (a rough sketch follows after the list of benefits below).
Benefits as I see it:
- register block can move around from one QEMU release to next,
- migration remains functional (ACPI comes from source host, but it matches the device model on the target host, due to the compat prop),
- firmware remains dumb about TPM activations (OS calls ACPI calls virtual hardware),
- the ACPI-to-hardware interface is dictated by an industry spec, so we don't have to invent and document a paravirtual interface. If it ever becomes necessary for the firmware to directly access the TPM hardware (for example, to replay physical presence commands queued by the OS), fw can rely on the same industry spec, only the base address has to be updated -- which is available stand-alone from the named fw_cfg file.
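(A rough sketch of points (2) and (5), assuming a single MMIO base property; the property name, fw_cfg file name and device type below are invented for the example:)

/* Sketch only: make the register block base a device property, and
 * expose the same value to the firmware through a fw_cfg file. */
static Property tpm_dev_props[] = {
    DEFINE_PROP_UINT64("baseaddr", TPMStateExample, mmio_base,
                       0xfed40000 /* default per the PTP spec */),
    DEFINE_PROP_END_OF_LIST(),
};

static void tpm_expose_base(FWCfgState *fw_cfg, TPMStateExample *s)
{
    static uint64_t base_le;   /* fw_cfg keeps a pointer, so keep this alive */

    base_le = cpu_to_le64(s->mmio_base);
    fw_cfg_add_file(fw_cfg, "etc/tpm/reg-base", &base_le, sizeof(base_le));
}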
Thanks Laszlo
On 01/11/2018 09:02 AM, Laszlo Ersek wrote:
On 01/11/18 13:40, Igor Mammedov wrote:
On Wed, 10 Jan 2018 17:45:52 +0100 Laszlo Ersek lersek@redhat.com wrote:
(My understanding is that the guest has to populate the CRB, and then kick the hypervisor, so at least the register used for kicking must be in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space (for platform devices). Thus, the register block must reside at a QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
MMIO doesn't have to be fixed, nor exist at all; we could use the linker's write-to-file operation in the FW for switching from guest to QEMU. That's obviously intrusive work for FW and QEMU compared to a hard-coded address in both QEMU and FW, but as a benefit, changes to QEMU and FW don't have to be tightly coupled and the layout could be changed whenever the need arises.
Marc-André wrote, "The [CRB] region is registered at the same address as TIS (it's not entirely clear from the spec it is supposed to be there, but my laptop tpm use the same)."
And, the spec declares the register block at the fixed range FED4_0000h-FED4_4FFFh.
How about this:
(1) stick with the TPM specs and implement the TIS and/or CRB interfaces,
(2) *except* make the base address of the register block a compat property for the QEMU device,
(3) generate data tables (TPM2) and AML tables (SSDT/_DSM) that expose the device to the guest OS as ACPI or ACPI+CRB (i.e., "fTPM"), *not* TIS and/or CRB
Why? Linux doesn't use this type of interface. Actually, for the TIS the base address has been hard coded as well.
(4) in the generated ACPI payload, adhere to the compat property (i.e., generate the base address values from the compat prop),
(5) expose the base address stand-alone in a new fw_cfg file as well.
Benefits as I see it:
- register block can move around from one QEMU release to next,
Why would we need that? fed4_0000 is presumably reserved for TPM device interfaces and shouldn't clash with anything in the future. With the PPI memory at ffff_0000 - ffff_00ff I am not so sure. Here we could use the proposed QEMU ACPI table and a hard-coded address, ffff_0000, at the beginning. Would that not solve it? Why not?
migration remains functional (ACPI comes from source host, but it matches the device model on the target host, due to the compat prop),
firmware remains dumb about TPM activations (OS calls ACPI calls virtual hardware),
Linux doesn't use the ACPI interface from what I can tell.
What are 'TPM activations'? We have a TIS interface for example that SeaBIOS uses to initialize the TPM1.2 / TPM2.
- the ACPI-to-hardware interface is dictated by an industry spec, so we
Do you have a pointer to this spec?
don't have to invent and document a paravirtual interface. If it ever becomes necessary for the firmware to directly access the TPM hardware (for example, to replay physical presence commands queued by the OS), fw can rely on the same industry spec, only the base address has to be updated -- which is available stand-alone from the named fw_cfg file.
Thanks Laszlo
On Thu, 11 Jan 2018 09:29:14 -0500 Stefan Berger stefanb@linux.vnet.ibm.com wrote:
On 01/11/2018 09:02 AM, Laszlo Ersek wrote:
On 01/11/18 13:40, Igor Mammedov wrote:
On Wed, 10 Jan 2018 17:45:52 +0100 Laszlo Ersek lersek@redhat.com wrote:
(My understanding is that the guest has to populate the CRB, and then kick the hypervisor, so at least the register used for kicking must be in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space (for platform devices). Thus, the register block must reside at a QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
MMIO doesn't have to be fixed, nor exist at all; we could use the linker's write-to-file operation in the FW for switching from guest to QEMU. That's obviously intrusive work for FW and QEMU compared to a hard-coded address in both QEMU and FW, but as a benefit, changes to QEMU and FW don't have to be tightly coupled and the layout could be changed whenever the need arises.
Marc-André wrote, "The [CRB] region is registered at the same address as TIS (it's not entirely clear from the spec it is supposed to be there, but my laptop tpm use the same)."
And, the spec declares the register block at the fixed range FED4_0000h-FED4_4FFFh.
How about this:
(1) stick with the TPM specs and implement the TIS and/or CRB interfaces,
(2) *except* make the base address of the register block a compat property for the QEMU device,
(3) generate data tables (TPM2) and AML tables (SSDT/_DSM) that expose the device to the guest OS as ACPI or ACPI+CRB (i.e., "fTPM"), *not* TIS and/or CRB
Why? Linux doesn't use this type of interface. Actually, for the TIS the base address has been hard coded as well.
(4) in the generated ACPI payload, adhere to the compat property (i.e., generate the base address values from the compat prop),
(5) expose the base address stand-alone in a new fw_cfg file as well.
Benefits as I see it:
- register block can move around from one QEMU release to next,
Why would we need that? fed4_0000 is presumably reserved for TPM device interfaces and shouldn't clash with anything in the future.
I wouldn't bet on it not clashing, as it's a separate spec and another, non-TPM spec could use this address as well. Laszlo's suggestion to use a fw_cfg file for the MMIO base should take care of the case where the base needs to be changed.
With the PPI memory at ffff_0000 - ffff_00ff I am not so sure. Here we could use the proposed QEMU ACPI table
and a hard-coded address, ffff_0000 at the beginning.
if we had a hard-coded address for starters, it would mean that the layout can't be changed, since old firmware /with the fixed address/ would break when the address is changed on the QEMU side; hence QEMU would have to maintain the initially fixed address practically forever.
Would that not solve it? Why not?
migration remains functional (ACPI comes from source host, but it matches the device model on the target host, due to the compat prop),
firmware remains dumb about TPM activations (OS calls ACPI calls virtual hardware),
Linux doesn't use the ACPI interface from what I can tell.
What are 'TPM activations'? We have a TIS interface for example that SeaBIOS uses to initialize the TPM1.2 / TPM2.
- the ACPI-to-hardware interface is dictated by an industry spec, so we
Do you have a pointer to this spec?
don't have to invent and document a paravirtual interface. If it ever becomes necessary for the firmware to directly access the TPM hardware (for example, to replay physical presence commands queued by the OS), fw can rely on the same industry spec, only the base address has to be updated -- which is available stand-alone from the named fw_cfg file.
Thanks Laszlo
On 01/11/2018 10:52 AM, Igor Mammedov wrote:
On Thu, 11 Jan 2018 09:29:14 -0500 Stefan Berger stefanb@linux.vnet.ibm.com wrote:
On 01/11/2018 09:02 AM, Laszlo Ersek wrote:
On 01/11/18 13:40, Igor Mammedov wrote:
On Wed, 10 Jan 2018 17:45:52 +0100 Laszlo Ersek lersek@redhat.com wrote:
(My understanding is that the guest has to populate the CRB, and then kick the hypervisor, so at least the register used for kicking must be in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space (for platform devices). Thus, the register block must reside at a QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
MMIO doesn't have to be fixed, nor exist at all; we could use the linker's write-to-file operation in the FW for switching from guest to QEMU. That's obviously intrusive work for FW and QEMU compared to a hard-coded address in both QEMU and FW, but as a benefit, changes to QEMU and FW don't have to be tightly coupled and the layout could be changed whenever the need arises.
Marc-André wrote, "The [CRB] region is registered at the same address as TIS (it's not entirely clear from the spec it is supposed to be there, but my laptop tpm use the same)."
And, the spec declares the register block at the fixed range FED4_0000h-FED4_4FFFh.
How about this:
(1) stick with the TPM specs and implement the TIS and/or CRB interfaces,
(2) *except* make the base address of the register block a compat property for the QEMU device,
(3) generate data tables (TPM2) and AML tables (SSDT/_DSM) that expose the device to the guest OS as ACPI or ACPI+CRB (i.e., "fTPM"), *not* TIS and/or CRB
Why? Linux doesn't use this type of interface. Actually, for the TIS the base address has been hard coded as well.
(4) in the generated ACPI payload, adhere to the compat property (i.e., generate the base address values from the compat prop),
(5) expose the base address stand-alone in a new fw_cfg file as well.
Benefits as I see it:
- register block can move around from one QEMU release to next,
Why would we need that? fed4_0000 is presumably reserved for TPM device interfaces and shouldn't clash with anything in the future.
I wouldn't bet on it not clashing, as it's a separate spec and another, non-TPM spec could use this address as well.
I don't think this would happen. I am not sure who, or whether anyone, keeps track of device addresses, but the TIS has been at that address for ages now and would interfere with other current and future devices.
Laszlo's suggestion to use a fw_cfg file for the MMIO base should take care of the case where the base needs to be changed.
Linux hard-coded the address. It should take it from an ACPI table as well, if there were a table that provided the address, but since it's a standard address there's probably none. If we were to move it somewhere else, we'd become incompatible with all Linux versions up to now, something we probably don't want.
With the PPI memory at ffff_0000 - ffff_00ff I am not so sure. Here we could use the proposed QEMU ACPI table and a hard-coded address, ffff_0000, at the beginning.
if we had a hard-coded address for starters, it would mean that the layout can't be changed, since old firmware /with the fixed address/ would break when the address is changed on the QEMU side; hence QEMU would have to maintain the initially fixed address practically forever.
Since the PPI shared memory is not located in the ACPI stream (and thus is not re-allocated and overwritten upon reboot), we can introduce a QEMU ACPI table that holds a pointer to that memory, where that pointer would be initialized from a constant. The ACPI-generating code in QEMU would also initialize the OperationRegion() from that constant. That way we could move this memory to wherever we want in future QEMU releases.
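(A sketch of that, using the AML build API as I remember it; the address constant, the table contents and the variables dev / table_data are assumed to exist and are illustrative only:)

/* Sketch only: the OperationRegion and the pointer table are both
 * generated from the same constant, so only QEMU has to know it and it
 * can be changed in one place in a later release. */
#define PPI_ADDR_BASE 0xffff0000ULL   /* example constant */
#define PPI_ADDR_SIZE 0x100

/* AML side: OperationRegion over the PPI shared memory */
aml_append(dev, aml_operation_region("PPIM", AML_SYSTEM_MEMORY,
                                     aml_int(PPI_ADDR_BASE), PPI_ADDR_SIZE));

/* data-table side: append the same address to the small "QEMU" table */
build_append_int_noprefix(table_data, PPI_ADDR_BASE, 8);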
(I'm not trying to further argue for the idea below, just to clarify it:)
On 01/11/18 15:29, Stefan Berger wrote:
On 01/11/2018 09:02 AM, Laszlo Ersek wrote:
On 01/11/18 13:40, Igor Mammedov wrote:
On Wed, 10 Jan 2018 17:45:52 +0100 Laszlo Ersek lersek@redhat.com wrote:
(My understanding is that the guest has to populate the CRB, and then kick the hypervisor, so at least the register used for kicking must be in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space (for platform devices). Thus, the register block must reside at a QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
MMIO doesn't have to be fixed, nor exist at all; we could use the linker's write-to-file operation in the FW for switching from guest to QEMU. That's obviously intrusive work for FW and QEMU compared to a hard-coded address in both QEMU and FW, but as a benefit, changes to QEMU and FW don't have to be tightly coupled and the layout could be changed whenever the need arises.
Marc-André wrote, "The [CRB] region is registered at the same address as TIS (it's not entirely clear from the spec it is supposed to be there, but my laptop tpm use the same)."
And, the spec declares the register block at the fixed range FED4_0000h-FED4_4FFFh.
How about this:
(1) stick with the TPM specs and implement the TIS and/or CRB interfaces,
(2) *except* make the base address of the register block a compat property for the QEMU device,
(3) generate data tables (TPM2) and AML tables (SSDT/_DSM) that expose the device to the guest OS as ACPI or ACPI+CRB (i.e., "fTPM"), *not* TIS and/or CRB
Why? Linux doesn't use this type of interface. Actually, for the TIS the base address has been hard coded as well.
The idea would be to hide the actual address from the OS. Let the OS go through the ACPI methods only, and keep the ACPI constants in sync with the device model.
(4) in the generated ACPI payload, adhere to the compat property (i.e., generate the base address values from the compat prop),
(5) expose the base address stand-alone in a new fw_cfg file as well.
Benefits as I see it:
- register block can move around from one QEMU release to next,
Why would we need that?
It's not a requirement that I'm presenting -- I took the requirement as a given and attempted to satisfy it.
fed4_0000 is presumably reserved for TPM device interfaces and shouldn't clash with anything in the future. With the PPI memory at ffff_0000 - ffff_00ff I am not so sure. Here we could use the proposed QEMU ACPI table and a hard-coded address, ffff_0000, at the beginning. Would that not solve it? Why not?
- migration remains functional (ACPI comes from source host, but it
matches the device model on the target host, due to the compat prop),
- firmware remains dumb about TPM activations (OS calls ACPI calls
virtual hardware),
Linux doesn't use the ACPI interface from what I can tell.
What are 'TPM activations'?
I coined this expression for "interacting with the TPM device". I used this expression because the TPM ACPI spec uses the expression "activation methods" for describing the various ways to interact with the device (TIS, CRB, ACPI, ACPI+CRB are four methods that we've been discussing).
So above I meant that the firmware does not participate in OS->TPM requests.
We have a TIS interface for example that SeaBIOS uses to initialize the TPM1.2 / TPM2.
- the ACPI-to-hardware interface is dictated by an industry spec, so we
Do you have a pointer to this spec?
I simply meant that a TIS client would have to be written in AML (generated by QEMU). To the OS the device would be available via ACPI or ACPI+CRB activation, but to the ACPI implementation itself, it would look like a TIS or CRB device, with a moveable base address. This way the OS would be separated from the base address (because the OS would have to go through ACPI), and the firmware could reuse existent TIS drivers with hopefully minimal customization (base address taken from fw_cfg).
So, by the above industry spec, I simply meant the TIS interface.
Anyway, based on your description, there's a disconnect between the Linux guest and the base address movability requirement:
- we have four activation methods: TIS+Cancel, CRB, ACPI, ACPI+CRB
- of this, Linux only supports the first two (TIS+Cancel, CRB), IIUC
- in addition, Linux hard-codes the MMIO base address for both TIS+Cancel and CRB (at the spec-given address)
- Windows cannot consume TIS+Cancel directly (according to research done by Marc-André, if I understand correctly), but it supports CRB, ACPI, ACPI+CRB
- so the intersection is "CRB with hard-coded MMIO base address".
Thanks Laszlo
don't have to invent and document a paravirtual interface. If it ever becomes necessary for the firmware to directly access the TPM hardware (for example, to replay physical presence commands queued by the OS), fw can rely on the same industry spec, only the base address has to be updated -- which is available stand-alone from the named fw_cfg file.
Thanks Laszlo
On 01/11/2018 11:44 AM, Laszlo Ersek wrote:
(I'm not trying to further argue for the idea below, just to clarify it:)
On 01/11/18 15:29, Stefan Berger wrote:
On 01/11/2018 09:02 AM, Laszlo Ersek wrote:
On 01/11/18 13:40, Igor Mammedov wrote:
On Wed, 10 Jan 2018 17:45:52 +0100 Laszlo Ersek lersek@redhat.com wrote:
(My understanding is that the guest has to populate the CRB, and then kick the hypervisor, so at least the register used for kicking must be in MMIO (or IO) space. And firmware cannot allocate MMIO or IO space (for platform devices). Thus, the register block must reside at a QEMU-determined GPA. Once we do that, why bother about RAM allocation?)
MMIO doesn't have to be fixed, nor exist at all; we could use the linker's write-to-file operation in the FW for switching from guest to QEMU. That's obviously intrusive work for FW and QEMU compared to a hard-coded address in both QEMU and FW, but as a benefit, changes to QEMU and FW don't have to be tightly coupled and the layout could be changed whenever the need arises.
Marc-André wrote, "The [CRB] region is registered at the same address as TIS (it's not entirely clear from the spec it is supposed to be there, but my laptop tpm use the same)."
And, the spec declares the register block at the fixed range FED4_0000h-FED4_4FFFh.
How about this:
(1) stick with the TPM specs and implement the TIS and/or CRB interfaces,
(2) *except* make the base address of the register block a compat property for the QEMU device,
(3) generate data tables (TPM2) and AML tables (SSDT/_DSM) that expose the device to the guest OS as ACPI or ACPI+CRB (i.e., "fTPM"), *not* TIS and/or CRB
Why? Linux doesn't use this type of interface. Actually, for the TIS the base address has been hard coded as well.
The idea would be to hide the actual address from the OS. Let the OS go through the ACPI methods only, and keep the ACPI constants in sync with the device model.
(4) in the generated ACPI payload, adhere to the compat property (i.e., generate the base address values from the compat prop),
(5) expose the base address stand-alone in a new fw_cfg file as well.
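For illustration, on the QEMU side (2) and (5) could look roughly like the sketch below; the struct, property name and fw_cfg path are made up for the example, only DEFINE_PROP_UINT64() and fw_cfg_add_file() are existing QEMU APIs:

/* Sketch only, inside QEMU (needs hw/qdev-properties.h and
 * hw/nvram/fw_cfg.h); struct, property name and fw_cfg path are invented. */
static Property tpm_tis_properties[] = {
    /* (2) movable register block base, kept stable per machine type
     *     via compat properties */
    DEFINE_PROP_UINT64("baseaddr", TPMTISState, base_addr, 0xFED40000),
    DEFINE_PROP_END_OF_LIST(),
};

/* (5) expose the same address stand-alone through fw_cfg, so the
 *     firmware can pick it up without parsing ACPI */
static void tpm_expose_base_addr(FWCfgState *fw_cfg, uint64_t base_addr)
{
    uint64_t *data = g_malloc(sizeof(*data));

    *data = cpu_to_le64(base_addr);
    fw_cfg_add_file(fw_cfg, "etc/tpm/base-address", data, sizeof(*data));
}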
Benefits as I see it:
- register block can move around from one QEMU release to next,
Why would we need that?
It's not a requirement that I'm presenting -- I took the requirement as a given and attempted to satisfy it.
fed4_0000 is presumably reserved for TPM device interfaces and shouldn't clash with anything in the future. With the PPI memory at ffff_0000 - ffff_00ff I am not so sure. Here we could use the proposed QEMU ACPI table and a hard-coded address, ffff_0000, at the beginning. Would that not solve it? Why not?
- migration remains functional (ACPI comes from source host, but it matches the device model on the target host, due to the compat prop),
- firmware remains dumb about TPM activations (OS calls ACPI calls virtual hardware),
Linux doesn't use the ACPI interface from what I can tell.
What are 'TPM activations'?
I coined this expression for "interacting with the TPM device". I used this expression because the TPM ACPI spec uses the expression "activation methods" for describing the various ways to interact with the device (TIS, CRB, ACPI, ACPI+CRB are four methods that we've been discussing).
So above I meant that the firmware does not participate in OS->TPM requests.
We have a TIS interface for example that SeaBIOS uses to initialize the TPM1.2 / TPM2.
- the ACPI-to-hardware interface is dictated by an industry spec, so we
Do you have a pointer to this spec?
I simply meant that a TIS client would have to be written in AML (generated by QEMU). To the OS the device would be available via ACPI or
But for that we would need an official spec. I haven't seen a spec that describes it. Maybe EDK2 has such a driver, but this may be one written without a public spec.
ACPI+CRB activation, but to the ACPI implementation itself, it would look like a TIS or CRB device, with a moveable base address. This way the OS would be separated from the base address (because the OS would have to go through ACPI), and the firmware could reuse existent TIS drivers with hopefully minimal customization (base address taken from fw_cfg).
The OS could be separated by telling where the base address is via an ACPI table. TPM 2 does that but the other specs for TIS say it's at fed4 0000. The CRB may be more flexible, but
TIS: https://trustedcomputinggroup.org/wp-content/uploads/TCG_PCClientTPMInterfac...
See page 34, also table 6, etc
TIS + CRB: https://trustedcomputinggroup.org/wp-content/uploads/TCG_PC_Client_Platform_...
Page 34, also table 7, etc
Also of interest is Table 10. Table 13 shows that the Interface Identifier Register indicates whether TIS and/or CRB are implemented, and either one can be selected by writing to bits 18:17.
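To make that concrete, selecting one interface or the other from firmware would boil down to something like the sketch below; the register offset and bit layout are quoted from memory and should be double-checked against the tables above:

/* Sketch only: poke the Interface Identifier register to pick TIS vs. CRB;
 * offset and field layout are from memory, verify against the PTP spec. */
#include <stdint.h>

#define TPM_MMIO_BASE      0xFED40000u
#define TPM_INTERFACE_ID   0x30u           /* locality 0 register offset */

static void tpm_select_interface(uint32_t selector /* goes into bits 18:17 */)
{
    volatile uint32_t *reg =
        (volatile uint32_t *)(uintptr_t)(TPM_MMIO_BASE + TPM_INTERFACE_ID);
    uint32_t val = *reg;

    val &= ~(3u << 17);                    /* clear InterfaceSelector */
    val |= (selector & 3u) << 17;          /* select TIS or CRB */
    *reg = val;
}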
So, by the above industry spec, I simply meant the TIS interface.
Anyway, based on your description, there's a disconnect between the Linux guest and the base address movability requirement:
- we have four activation methods: TIS+Cancel, CRB, ACPI, ACPI+CRB
- of this, Linux only supports the first two (TIS+Cancel, CRB), IIUC
- in addition, Linux hard-codes the MMIO base address for both TIS+Cancel and CRB (at the spec-given address)
Actually it only hard codes it when one forces the device driver in. Otherwise it retrieves it (probably) via ACPI by calling platform_get_resource(), which probably passes back what ACPI defines via Memory32Fixed().
We set the address via a constant when building the ACPI.
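For reference, the retrieval described above is the usual Linux platform-driver pattern, roughly like the sketch below (not the actual tpm_tis/tpm_crb code):

/* Sketch of the generic Linux pattern; not the real TPM driver code. */
#include <linux/platform_device.h>
#include <linux/io.h>
#include <linux/err.h>

static int tpm_example_probe(struct platform_device *pdev)
{
    struct resource *res;
    void __iomem *base;

    /* resource 0 is what ACPI described, e.g. via Memory32Fixed in _CRS */
    res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
    if (!res)
        return -ENODEV;

    base = devm_ioremap_resource(&pdev->dev, res);
    if (IS_ERR(base))
        return PTR_ERR(base);

    /* ... register access through 'base' from here on ... */
    return 0;
}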
- Windows cannot consume TIS+Cancel directly (according to research done by Marc-André, if I understand correctly), but it supports CRB, ACPI, ACPI+CRB
I believe that's true for recent versions of it. Many years ago, when I was working on Xen, TIS was detected by Windows.
I am not sure where the ACPI for this is spec'ed.
- so the intersection is "CRB with hard-coded MMIO base address".
I can only point to the standard for the address. If QEMU has an API where we can first try to allocate fed4 0000 and, if that fails, ask for another address, then we can use that. But does driver initialization work in such a way that we can first let all other devices register their MMIO requirements, and then have the TPM device ask whether fed4 0000 is available and fall back to using a random address if it is not?
Stefan
Thanks Laszlo
don't have to invent and document a paravirtual interface. If it ever becomes necessary for the firmware to directly access the TPM hardware (for example, to replay physical presence commands queued by the OS), fw can rely on the same industry spec, only the base address has to be updated -- which is available stand-alone from the named fw_cfg file.
Thanks Laszlo
On 01/11/18 18:16, Stefan Berger wrote:
I can only point to the standard for the address. If QEMU has an API where we can first try to allocate fed4 0000 and, if that fails, ask for another address, then we can use that. But does driver initialization work in such a way that we can first let all other devices register their MMIO requirements, and then have the TPM device ask whether fed4 0000 is available and fall back to using a random address if it is not?
As far as I understand, QEMU would keep the base address generally fixed, but it could be moved if (a) another platform device comes along that needs a large contiguous area and it cannot be accommodated without moving other devices around, or (b) the user wanted to move the address on the command line for whatever reason.
So, I don't think the QEMU API that you describe exists, or that there's a use case for it. AFAICT board code is expected to place platform devices up-front so that the latter peacefully co-exist.
Thanks, Laszlo
On 01/11/2018 12:38 PM, Laszlo Ersek wrote:
On 01/11/18 18:16, Stefan Berger wrote:
I can only point to the standard for the address. If QEMU has an API where we can first try to allocate fed4 0000 and, if that fails, ask for another address, then we can use that. But does driver initialization work in such a way that we can first let all other devices register their MMIO requirements, and then have the TPM device ask whether fed4 0000 is available and fall back to using a random address if it is not?
As far as I understand, QEMU would keep the base address generally fixed, but it could be moved if (a) another platform device comes along that needs a large contiguous area and it cannot be accommodated without moving other devices around, or (b) the user wanted to move the address on the command line for whatever reason.
(a) can be handled by changing a #define; (b) would be a bit more work.
I am not sure whether there's a registry for base addresses for devices in the industry. Maybe it's just common knowledge that fed4 0000 per TCG spec is a standard base address for the TPM hardware interface.
A comment on the PPI device would be great. I would like to coordinate that with folks here before I make changes. The idea of the QEMU ACPI table that holds the base address of this device looks appealing to me unless there's that registry of base addresses and 0xffff 0000 .. 0xffff 00ff defines a well known PPI memory device.
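For what it's worth, (a) would look roughly like the sketch below; the names and the address are placeholders only:

/* Sketch: one place holding the PPI window; names/values are placeholders. */
#define TPM_PPI_ADDR_BASE   0xFFFF0000u   /* provisional, per the discussion */
#define TPM_PPI_ADDR_SIZE   0x100u

/* The same constants would then be used both when mapping the memory region
 * in the device model and when emitting the corresponding OperationRegion /
 * Memory32Fixed entries in the generated ACPI. */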
So, I don't think the QEMU API that you describe exists, or that there's a use case for it. AFAICT board code is expected to place platform devices up-front so that the latter peacefully co-exist.
I want to add that TPM TIS works fine with Win10 and attached TPM 1.2 but not TPM 2, which seems to want a CRB.
Stefan
Thanks, Laszlo
On 01/10/2018 08:22 AM, Laszlo Ersek wrote:
Stefan,
On 01/09/18 20:02, Stefan Berger wrote:
Another twist is that Intel's EDK2 also implements this but the data structure layout is different and they use SMM + SMIs etc.
https://github.com/tianocore/edk2/blob/master/SecurityPkg/Tcg/Tcg2Smm/Tpm.as...
As I described in my investigation linked from https://bugzilla.tianocore.org/show_bug.cgi?id=594#c5, we should not include the Tcg2Smm driver in OVMF, for TPM enablement -- at least for the short & mid terms.
What does the Tcg2Smm driver do? In section (2f), I described that the driver installs two tables, "TPM2" and an "SSDT".
The TPM2 table from this driver is unneeded, since QEMU generates its own TPM2 table, which describes the TPM device's access method -- TIS+Cancel (method 6).
The SSDT from the driver is again unneeded. It provides (via the _DSM method) an ACPI-level API that the OS can use, for talking to the TPM device. An implementation detail of this ACPI method is that it raises an SMI, for entering the firmware at an elevated privilege level (= in SMM). Then, the actual TPM hardware manipulation, or even the TPM *software emulation*, is performed by the firmware, in SMM.
This approach is totally ill-suited for the QEMU virtualization stack. For starters, none of the firmware code exists -- as open source anyway -- that would actually handle such ACPI->SMM requests. Second, I'm sure we don't want to debug TPM software emulation running in SMM guest firmware, rather than an actual QEMU device model.
Once we have a real device model, accessed via IO ports and/or MMIO locations, perhaps in combination with request/response buffers allocated in guest RAM, the SMI/SMM implementation detail falls away completely. Our TPM emulation would attain its "privileged / protected" status simply by existing in the hypervisor (QEMU).
Regarding the SMI/SMM: I think it will be needed for the TPM Physical Presence interface where ACPI gets a code from the user that it sends to the firmware and the firmware acts upon next reboot. SMM stores this code in a UEFI variable (EDK2) to protect it from modules executed by UEFI. I was trying to use a memory area (PPI memory device) for storing this code but it would not give the same protection for UEFI compared to the variable. I suppose the reason is that UEFI can execute (untrusted) code that could manipulate this memory area and cause unwanted changes to the TPM upon reboot by for example writing a code for clearing the TPM. How 'safe' would the BIOS be or any path from the BIOS until the OS kernel takes over? Can untrusted code be executed by something like a BIOS module (vgabios.bin and the like) and mess with that memory area? A grub module?
One other complication is the memory area that EDK2 requires for exchanging of data ('that code' for example) between ACPI and SMM. It's hard coded to 0xFFFF 0000. However, with SeaBIOS I cannot use this memory and there's this comment here: 'src/fw/shadow.c:// On the emulators, the bios at 0xf0000 is also at 0xffff0000'.
So the point is SMM is needed for UEFI. QEMU would need to provide the ACPI code for it, which is basically a translation of the ACPI from EDK2 so that this could work. To support SeaBIOS as well, we would have to be able to distinguish a BIOS from the UEFI on the QEMU level so that we could produce different ACPI (no SMI and different OperationRegion than 0xFFFF 0000 for SeaBIOS), *if* on a system with a BIOS the memory area can be considered to be safe (like that EDK2 variable). Otherwise I am afraid it's better to not support it in SeaBIOS and provide all necessary early TPM 2 operations via user interaction with the menu only.
Comments ?
Stefan
On 02/07/18 14:51, Stefan Berger wrote:
On 01/10/2018 08:22 AM, Laszlo Ersek wrote:
Stefan,
On 01/09/18 20:02, Stefan Berger wrote:
Another twist is that Intel's EDK2 also implements this but the data structure layout is different and they use SMM + SMIs etc.
https://github.com/tianocore/edk2/blob/master/SecurityPkg/Tcg/Tcg2Smm/Tpm.as...
As I described in my investigation linked from https://bugzilla.tianocore.org/show_bug.cgi?id=594#c5, we should not include the Tcg2Smm driver in OVMF, for TPM enablement -- at least for the short & mid terms.
What does the Tcg2Smm driver do? In section (2f), I described that the driver installs two tables, "TPM2" and an "SSDT".
- The TPM2 table from this driver is unneeded, since QEMU generates its
own TPM2 table, which describes the TPM device's access method -- TIS+Cancel (method 6).
- The SSDT from the driver is again unneeded. It provides (via the _DSM
method) an ACPI-level API that the OS can use, for talking to the TPM device. An implementation detail of this ACPI method is that it raises an SMI, for entering the firmware at an elevated privilege level (= in SMM). Then, the actual TPM hardware manipulation, or even the TPM *software emulation*, is performed by the firmware, in SMM.
This approach is totally ill-suited for the QEMU virtualization stack. For starters, none of the firmware code exists -- as open source anyway -- that would actually handle such ACPI->SMM requests. Second, I'm sure we don't want to debug TPM software emulation running in SMM guest firmware, rather than an actual QEMU device model.
Once we have a real device model, accessed via IO ports and/or MMIO locations, perhaps in combination with request/response buffers allocated in guest RAM, the SMI/SMM implementation detail falls away completely. Our TPM emulation would attain its "privileged / protected" status simply by existing in the hypervisor (QEMU).
Regarding the SMI/SMM: I think it will be needed for the TPM Physical Presence interface where ACPI gets a code from the user that it sends to the firmware and the firmware acts upon next reboot. SMM stores this code in a UEFI variable (EDK2) to protect it from modules executed by UEFI. I was trying to use a memory area (PPI memory device) for storing this code but it would not give the same protection for UEFI compared to the variable. I suppose the reason is that UEFI can execute (untrusted) code that could manipulate this memory area and cause unwanted changes to the TPM upon reboot by for example writing a code for clearing the TPM. How 'safe' would the BIOS be or any path from the BIOS until the OS kernel takes over? Can untrusted code be executed by something like a BIOS module (vgabios.bin and the like) and mess with that memory area? A grub module?
Yes, this is a correct assessment in my view. SMM provides protection to platform firmware modules not only against the OS, but also against 3rd party firmware components, such as boot loaders, various UEFI applications, UEFI device drivers (loaded from disk or PCI card option ROM BARs), and such.
SMRAM and the writeable pflash chip are two pieces of emulated hardware whose (write) access is restricted to code executing in SMM.
This does not necessarily imply that QEMU should generate SMI-triggering AML methods for the guest. Instead, the sensitive writeable register block of the TPM chip that you added could be restricted to code executing in SMM, similarly to pflash:
-global driver=cfi.pflash01,property=secure,value=on
... On the other hand, I do realize this would take custom SMM code, which is even worse than reusing the SMM variable service code.
Sigh, I *utterly* hate this. I maintain from my earlier email that generating ACPI for this in QEMU is ill-suited. Minimally, the TPM2 table will conflict between edk2 and QEMU. edk2 will be *both* missing code that's going to be necessary for the QEMU/OVMF use case, *and* it will contain code that either conflicts with or is not dynamic enough for the QEMU/OVMF use case.
One other complication is the memory area that EDK2 requires for exchanging of data ('that code' for example) between ACPI and SMM. It's hard coded to 0xFFFF 0000. However, with SeaBIOS I cannot use this memory and there's this comment here: 'src/fw/shadow.c:// On the emulators, the bios at 0xf0000 is also at 0xffff0000'.
So the point is SMM is needed for UEFI. QEMU would need to provide the ACPI code for it, which is basically a translation of the ACPI from EDK2 so that this could work.
OK.
To support SeaBIOS as well, we would have to be able to distinguish a BIOS from the UEFI on the QEMU level so that we could produce different ACPI
Yes and no,
(no SMI and different OperationRegion than 0xFFFF 0000 for SeaBIOS),
"yes" with regard to the SMM difference, "no" with regard to the operation region. We have an ACPI linker/loader command that makes the firmware basically just allocate memory, and we have two other ACPI linker/loader commands that (a) patch the allocation address into other ACPI artifacts, (b) return the allocation address to QEMU (for device emulation purposes), if necessary.
*if* on a system with a BIOS the memory area can be considered to be safe (like that EDK2 variable). Otherwise I am afraid it's better to not support it in SeaBIOS and provide all necessary early TPM 2 operations via user interaction with the menu only.
FWIW, to my knowledge, the RH boot loader team is only interested in TPM for UEFI.
Does Windows utilize TPM PPI when booted on a traditional BIOS computer?
Comments ?
I think your analysis is correct, about SMM. Previously I missed that specifically the Physical Presence operations needed protection.
My operating knowledge about the TPM had been that
Components measure stuff into PCRs, and if any untrusted agent messes with those measurements, for example by directly writing to the PCRs, then the TPM will simply not unseal its secrets, hence such tampering is self-defeating for those agents.
While this might be correct (I hope it is correct!), the *PPI* part of TPM appears entirely different. In fact I don't have the slightest idea *why* PPI is lumped together with the TPM.
Can you explain in more detail what the PPI operations are, and why they need protection, from what agents exactly? What is the purported lifecycle of such PPI operations?
Thanks, Laszlo
On 02/07/2018 09:18 AM, Laszlo Ersek wrote:
On 02/07/18 14:51, Stefan Berger wrote:
On 01/10/2018 08:22 AM, Laszlo Ersek wrote:
Stefan,
On 01/09/18 20:02, Stefan Berger wrote:
Another twist is that Intel's EDK2 also implements this but the data structure layout is different and they use SMM + SMIs etc.
https://github.com/tianocore/edk2/blob/master/SecurityPkg/Tcg/Tcg2Smm/Tpm.as...
As I described in my investigation linked from https://bugzilla.tianocore.org/show_bug.cgi?id=594#c5, we should not include the Tcg2Smm driver in OVMF, for TPM enablement -- at least for the short & mid terms.
What does the Tcg2Smm driver do? In section (2f), I described that the driver installs two tables, "TPM2" and an "SSDT".
The TPM2 table from this driver is unneeded, since QEMU generates its own TPM2 table, which describes the TPM device's access method -- TIS+Cancel (method 6).
The SSDT from the driver is again unneeded. It provides (via the _DSM method) an ACPI-level API that the OS can use, for talking to the TPM device. An implementation detail of this ACPI method is that it raises an SMI, for entering the firmware at an elevated privilege level (= in SMM). Then, the actual TPM hardware manipulation, or even the TPM *software emulation*, is performed by the firmware, in SMM.
This approach is totally ill-suited for the QEMU virtualization stack. For starters, none of the firmware code exists -- as open source anyway -- that would actually handle such ACPI->SMM requests. Second, I'm sure we don't want to debug TPM software emulation running in SMM guest firmware, rather than an actual QEMU device model.
Once we have a real device model, accessed via IO ports and/or MMIO locations, perhaps in combination with request/response buffers allocated in guest RAM, the SMI/SMM implementation detail falls away completely. Our TPM emulation would attain its "privileged / protected" status simply by existing in the hypervisor (QEMU).
Regarding the SMI/SMM: I think it will be needed for the TPM Physical Presence interface where ACPI gets a code from the user that it sends to the firmware and the firmware acts upon next reboot. SMM stores this code in a UEFI variable (EDK2) to protect it from modules executed by UEFI. I was trying to use a memory area (PPI memory device) for storing this code but it would not give the same protection for UEFI compared to the variable. I suppose the reason is that UEFI can execute (untrusted) code that could manipulate this memory area and cause unwanted changes to the TPM upon reboot by for example writing a code for clearing the TPM. How 'safe' would the BIOS be or any path from the BIOS until the OS kernel takes over? Can untrusted code be executed by something like a BIOS module (vgabios.bin and the like) and mess with that memory area? A grub module?
Yes, this is a correct assessment in my view. SMM provides protection to platform firmware modules not only against the OS, but also against 3rd party firmware components, such as boot loaders, various UEFI applications, UEFI device drivers (loaded from disk or PCI card option ROM BARs), and such.
SMRAM and the writeable pflash chip are two pieces of emulated hardware whose (write) access is restricted to code executing in SMM.
This does not necessarily imply that QEMU should generate SMI-triggering AML methods for the guest. Instead, the sensitive writeable register block of the TPM chip that you added could be restricted to code executing in SMM, similarly to pflash:
-global driver=cfi.pflash01,property=secure,value=on
... On the other hand, I do realize this would take custom SMM code, which is even worse than reusing the SMM variable service code.
Sigh, I *utterly* hate this. I maintain from my earlier email that generating ACPI for this in QEMU is ill-suited. Minimally, the TPM2 table will conflict between edk2 and QEMU. edk2 will be *both* missing code that's going to be necessary for the QEMU/OVMF use case, *and* it will contain code that either conflicts with or is not dynamic enough for the QEMU/OVMF use case.
One other complication is the memory area that EDK2 requires for exchanging of data ('that code' for example) between ACPI and SMM. It's hard coded to 0xFFFF 0000. However, with SeaBIOS I cannot use this memory and there's this comment here: 'src/fw/shadow.c:// On the emulators, the bios at 0xf0000 is also at 0xffff0000'.
So the point is SMM is needed for UEFI. QEMU would need to provide the ACPI code for it, which is basically a translation of the ACPI from EDK2 so that this could work.
OK.
To support SeaBIOS as well, we would have to be able to distinguish a BIOS from the UEFI on the QEMU level so that we could produce different ACPI
Yes and no,
(no SMI and different OperationRegion than 0xFFFF 0000 for SeaBIOS),
"yes" with regard to the SMM difference, "no" with regard to the operation region. We have an ACPI linker/loader command that makes the firmware basically just allocate memory, and we have two other ACPI linker/loader commands that (a) patch the allocation address into other ACPI artifacts, (b) return the allocation address to QEMU (for device emulation purposes), if necessary.
I thought about allowing the firmware to configure the memory region to use for the PPI interface. UEFI would say 0xFFFF 0000, SeaBIOS would choose some other area (0xFEF4 5000). Does the ACPI patcher handle this case or does the address patching have to be set up while building the tables in QEMU? If latter, then we would have to know in QEMU whether it's going to be BIOS or UEFI as firmware. I have tried a lot of things in the recent past, but I forgot whether this type of patching is possible.
*if* on a system with a BIOS the memory area can be considered to be safe (like that EDK2 variable). Otherwise I am afraid it's better to not support it in SeaBIOS and provide all necessary early TPM 2 operations via user interaction with the menu only.
FWIW, to my knowledge, the RH boot loader team is only interested in TPM for UEFI.
Does Windows utilize TPM PPI when booted on a traditional BIOS computer?
Not sure. Linux for sure doesn't care whether there's a BIOS or UEFI running underneath.
Comments ?
I think your analysis is correct, about SMM. Previously I missed that specifically the Physical Presence operations needed protection.
My operating knowledge about the TPM had been that
Components measure stuff into PCRs, and if any untrusted agent messes with those measurements, for example by directly writing to the PCRs, then the TPM will simply not unseal its secrets, hence such tampering is self-defeating for those agents.
While this might be correct (I hope it is correct!), the *PPI* part of TPM appears entirely different. In fact I don't have the slightest idea *why* PPI is lumped together with the TPM.
The physical presence interface allows *automation of TPM operations and changing the TPM's state* (such as clearing all keys) that are typically only possible via interaction with the TPM menu in the firmware. Think of it as some TPM operations that can only run successfully while the system runs the firmware. Once the firmware has given control to the next stage (bootloader, kernel), these operations are not possible anymore, since the firmware has executed some TPM commands that put the TPM into a state where it won't allow those operations anymore.
Can you explain in more detail what the PPI operations are, and why they need protection, from what agents exactly? What is the purported lifecycle of such PPI operations?
With the clearing of the TPM one would lose all keys associated with the TPM. So you don't want some software module to be able to set such a 'code', reset the machine, and have the user lose all keys on the way. The control has to be strongly with the admin. Also, to prevent fumbling with the variables, UEFI seems to make the variable read-only.
I am wondering whether a malicious UEFI module could be written that patches the ACPI tables and does what it wants when it comes to these early TPM operations, rather than what the admin wants. Its ACPI code would just enter SMM and instruct to clear the TPM upon reboot whenever invoked. If that's possible then we may have 'only' moved the problem from the secured UEFI variable to patching ACPI code, which is more work of course. (We need signed ACPI tables...)
Stefan
Thanks, Laszlo
On Wed, 7 Feb 2018 08:51:58 -0500 Stefan Berger stefanb@linux.vnet.ibm.com wrote:
On 01/10/2018 08:22 AM, Laszlo Ersek wrote:
Stefan,
On 01/09/18 20:02, Stefan Berger wrote:
[...]
So the point is SMM is needed for UEFI. QEMU would need to provide the ACPI code for it, which is basically a translation of the ACPI from EDK2 so that this could work. To support SeaBIOS as well, we would have to be able to distinguish a BIOS from the UEFI on the QEMU level so that we could produce different ACPI (no SMI and different OperationRegion than 0xFFFF 0000 for SeaBIOS), *if* on a system with a BIOS the memory area can be considered to be safe (like that EDK2 variable).
Does KVM actually restrict access to SMM memory (implements SMRR MSRs)?
And even with SMRR, memory might be exposed to another CPU on CPU hotplug in the current hotplug implementation, if malicious code wins the SIPI race in bringing up the hotplugged CPU from the (unprotected) reset state.
Otherwise I am afraid it's better to not support it in SeaBIOS and provide all necessary early TPM 2 operations via user interaction with the menu only.
Comments ?
Stefan
On 02/07/18 15:57, Igor Mammedov wrote:
On Wed, 7 Feb 2018 08:51:58 -0500 Stefan Berger stefanb@linux.vnet.ibm.com wrote:
On 01/10/2018 08:22 AM, Laszlo Ersek wrote:
Stefan,
On 01/09/18 20:02, Stefan Berger wrote:
[...]
So the point is SMM is needed for UEFI. QEMU would need to provide the ACPI code for it, which is basically a translation of the ACPI from EDK2 so that this could work. To support SeaBIOS as well, we would have to be able to distinguish a BIOS from the UEFI on the QEMU level so that we could produce different ACPI (no SMI and different OperationRegion than 0xFFFF 0000 for SeaBIOS), *if* on a system with a BIOS the memory area can be considered to be safe (like that EDK2 variable).
Does KVM actually restrict access to SMM memory (implements SMRR MSRs)?
KVM does not implement SMRRs, but QEMU+KVM implement SMRAM. OVMF exposes the Q35 TSEG region as SMRAM to the edk2 machinery. TSEG is controlled through various chipset registers.
Paolo's presentation and slides from 2015:
https://www.youtube.com/watch?v=IxLvxP1O8T8
And even with SMRR, memory might be exposed to another CPU on CPU hotplug in the current hotplug implementation, if malicious code wins the SIPI race in bringing up the hotplugged CPU from the (unprotected) reset state.
Yes, VCPU hotplug isn't even expected to work with SMM at the moment. "Don't do that just yet."
https://bugzilla.redhat.com/show_bug.cgi?id=1454803
Thanks Laszlo
On 02/07/18 15:57, Stefan Berger wrote:
On 02/07/2018 09:18 AM, Laszlo Ersek wrote:
On 02/07/18 14:51, Stefan Berger wrote:
To support SeaBIOS as well, we would have to be able to distinguish a BIOS from the UEFI on the QEMU level so that we could produce different ACPI
Yes and no,
(no SMI and different OperationRegion than 0xFFFF 0000 for SeaBIOS),
"yes" with regard to the SMM difference, "no" with regard to the operation region. We have an ACPI linker/loader command that makes the firmware basically just allocate memory, and we have two other ACPI linker/loader commands that (a) patch the allocation address into other ACPI artifacts, (b) return the allocation address to QEMU (for device emulation purposes), if necessary.
I thought about allowing the firmware to configure the memory region to use for the PPI interface. UEFI would say 0xFFFF 0000, SeaBIOS would choose some other area (0xFEF4 5000). Does the ACPI patcher handle this case or does the address patching have to be set up while building the tables in QEMU? If latter, then we would have to know in QEMU whether it's going to be BIOS or UEFI as firmware. I have tried a lot of things in the recent past, but I forgot whether this type of patching is possible.
The ACPI linker/loader commands are typically added to the "linker script" in the very functions that build the ACPI payload.
And, distinguishing the firmwares is not necessary just for this; the point of the firmware-side allocation is that QEMU does not dictate the address. Each firmware is expected to use its own memory allocation service, which in turn will ensure that the runtime OS stays away from the allocated area. So the allocation address is ultimately determined by the firmware.
The other two commands make the firmware patch the actual allocation address (whatever it may be) into other ACPI artifacts, and make the firmware pass the allocation address (whatever it may be) back to QEMU.
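Concretely, on the QEMU side the three commands would be emitted roughly as in the sketch below (modeled on the vmgenid usage; the blob name, the writeable fw_cfg file and the offsets are made up, while the bios_linker_loader_*() helpers and ACPI_BUILD_TABLE_FILE are existing QEMU APIs):

/* Sketch only, inside QEMU (needs hw/acpi/bios-linker-loader.h and
 * hw/acpi/aml-build.h); "etc/tpm/ppi", "etc/tpm/ppi-addr" and the offsets
 * are invented for illustration. */
static void tpm_ppi_add_linker_cmds(BIOSLinker *linker, GArray *ppi_blob,
                                    unsigned ssdt_addr_offset)
{
    /* ALLOCATE: the firmware picks an address and loads the blob there */
    bios_linker_loader_alloc(linker, "etc/tpm/ppi", ppi_blob,
                             4096 /* alignment */, false /* not f-seg */);

    /* ADD_POINTER: the firmware patches the chosen address into the ACPI
     * tables blob, at the offset where the SSDT expects it */
    bios_linker_loader_add_pointer(linker,
        ACPI_BUILD_TABLE_FILE, ssdt_addr_offset, sizeof(uint64_t),
        "etc/tpm/ppi", 0);

    /* WRITE_POINTER: the firmware writes the same address back to QEMU via
     * a writeable fw_cfg file (registered separately), so the device
     * emulation learns it as well */
    bios_linker_loader_write_pointer(linker,
        "etc/tpm/ppi-addr", 0, sizeof(uint64_t),
        "etc/tpm/ppi", 0);
}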
My operating knowledge about the TPM had been that
Components measure stuff into PCRs, and if any untrusted agent messes with those measurements, for example by directly writing to the PCRs, then the TPM will simply not unseal its secrets, hence such tampering is self-defeating for those agents.
While this might be correct (I hope it is correct!), the *PPI* part of TPM appears entirely different. In fact I don't have the slightest idea *why* PPI is lumped together with the TPM.
The physical presence interface allows *automation of TPM operations and changing the TPM's state* (such as clearing all keys) that are typically only possible via interaction with the TPM menu in the firmware. Think of it as some TPM operations that can only run successfully while the system runs the firmware. Once the firmware has given control to the next stage (bootloader, kernel) these operations are not possible anymore since the firmware has execute some TPM commands that put the TPM into a state so it wouldn't allow those operations anymore.
OK, but if the OS is allowed to modify this set of "queued operations", then what protection is expected of SMM? Whether you can modify the TPM directly, or queue random commands for it at liberty, what's the difference?
Can you explain in more detail what the PPI operations are, and why they need protection, from what agents exactly? What is the purported lifecycle of such PPI operations?
With the clearing of the TPM one would lose all keys associated with the TPM. So you don't want some software module to be able to set such a 'code', reset the machine, and have the user lose all keys on the way. The control has to be strongly with the admin.
Where is this barrier erected, between OS and firmware, or between privileged and non-privileged OS user?
SMM is only relevant if the barrier is expected between OS and firmware; i.e. you want to constrain the OS kernel to a subset of valid operations. If the barrier is between privileged and non-privileged OS user, then the implementation belongs in the OS kernel, since mere users don't have direct hardware access anyway.
Also, to prevent fumbling with the variables, UEFI seems to make the variable read-only.
That seems to imply the barrier is between OS kernel and firmware.
I am wondering whether a malicious UEFI module could be written that patches the ACPI tables and does what it wants when it comes to these early TPM operations, rather than what the admin wants.
This is a good point, and it applies to more than just ACPI. The answer is that it doesn't matter what *any* OS level code does -- as long as the barrier is expected between OS and firmware --, because the SMM code in the firmware must perform *complete* validation / verification of the request.
Another example is the UEFI runtime variable services. In the SMM_REQUIRE build of OVMF, those services are split to two privilege levels, "runtime DXE driver" and "SMM driver". The runtime DXE driver layer provides the OS with the UEFI interface, but internally it only formats a request buffer (serializes the variable operation), and raises an SMI. Once in SMM, the "SMM driver" layer de-serializes and verifies the request, and performs it if it's valid. If the OS messes with the "runtime DXE driver" half of the service (because it can -- that layer lives in simple system memory), the worst the OS can do is submit a crafted request buffer to the SMM half. The SMM half in turn *always* has to evaluate the request buffer as if it came from a malicious agent (an attacker). In other words, the "runtime DXE driver" half is just a convenience for the OS, for preparing a well-formed request buffer. (Which can still be rejected, of course, if the request doesn't pass higher-level authentication and such).
The same applies to your example. The queued PPI operations must be entirely validated in SMM; the ACPI code for formatting / submitting them is just a convenience. If the trust is based in the ACPI code, then the security model is busted. This is why I ask above, 'if the OS is allowed to modify this set of "queued operations", then what protection is expected of SMM?'
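As a toy illustration of where the validation has to live (every name and code below is made up, including the SMI mechanism):

/* Toy sketch; all names, codes and helpers are invented for illustration. */
#include <stdint.h>

typedef struct {
    uint32_t operation;            /* requested PPI operation code */
    uint32_t result;               /* filled in by the SMM half */
} PpiRequest;

enum { PPI_MAX_VALID_OP = 96, PPI_OK = 0, PPI_REJECTED = 1 };

void platform_raise_smi(void);                  /* hypothetical */
int  ppi_policy_allows(uint32_t op);            /* hypothetical */
void ppi_queue_in_locked_storage(uint32_t op);  /* hypothetical */

/* Non-SMM "convenience" half: only formats the request and raises the SMI. */
void ppi_queue_request(PpiRequest *comm, uint32_t op)
{
    comm->operation = op;
    platform_raise_smi();
}

/* SMM half: treats the buffer as attacker-controlled and validates it all. */
void ppi_smm_handler(PpiRequest *comm)
{
    /* 1. well-formedness */
    if (comm->operation > PPI_MAX_VALID_OP) {
        comm->result = PPI_REJECTED;
        return;
    }
    /* 2. higher-level policy / authentication */
    if (!ppi_policy_allows(comm->operation)) {
        comm->result = PPI_REJECTED;
        return;
    }
    /* 3. only now touch the protected storage (SMRAM / locked variable) */
    ppi_queue_in_locked_storage(comm->operation);
    comm->result = PPI_OK;
}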
Again, we have to see where the barrier is, between OS and firmware, or between OS-level users:
- In both cases, 3rd party UEFI apps / driver are considered equally privileged to the OS kernel;
- in the OS<->firmware barrier case, SMM is required, and UEFI apps and the OS kernel are similarly restricted to submitting requests to SMM, and all the business verification belongs in SMM,
- in the "barrier between OS-level users" case, SMM is not needed; UEFI apps and the OS kernel are equally allowed to access hardware directly, and non-privileged users are restricted by the OS kernel only.
Its ACPI code would just enter SMM and instruct to clear the TPM upon reboot whenever invoked. If that's possible then we may have 'only' moved the problem from the secured UEFI variable to patching ACPI code, which is more work of course. (We need signed ACPI tables...)
Right; if you want to prevent UEFI apps (equivalently, the OS kernel) from queueing such a "zap TPM" operation, then SMM is required, *and* the code running in SMM needs *some* mechanism to authenticate the request (beyond checking for well-formedness).
For example, regarding UEFI variables that are related to Secure Boot, variable update requests (from the OS or 3rd party UEFI apps/drivers) are verified (in SMM) by checking digital signatures on those requests.
... Judged purely from the *name* of the feature, "Physical Presence Interface", I think the idea is that a physically present user is allowed to issue / queue a "zap TPM" request. The question then becomes how you define "physically present".
OVMF currently equates all UEFI-level code with a user being physically present; see the UserPhysicalPresent() implementation in "OvmfPkg/Library/PlatformSecureLib/PlatformSecureLib.c". (The PlatformSecureLib class "Provides a platform-specific method to enable Secure Boot Custom Mode setup", which is not precisely your use case, but similarly privileged.) I wouldn't know how to define physical presence otherwise, in virtual firmware. If you can figure out a way to deduce physical presence in the guest *OS*, then we can say the barrier is between OS-level users, not between OS and firmware, and then SMM is not needed.
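From memory, the OVMF implementation is essentially just a stub that reports presence unconditionally, roughly:

/* Approximate, from memory -- check OvmfPkg for the exact code;
 * BOOLEAN/EFIAPI are the usual EDK2 types. */
BOOLEAN
EFIAPI
UserPhysicalPresent (
  VOID
  )
{
  return TRUE;
}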
What do we want to *use* PPI for? What agents should *not* be allowed to queue a "zap TPM" operation?
(Sorry if this email is too long and confusing! I'm confused.)
Thanks Laszlo
On 02/07/2018 10:50 AM, Laszlo Ersek wrote:
On 02/07/18 15:57, Stefan Berger wrote:
On 02/07/2018 09:18 AM, Laszlo Ersek wrote:
On 02/07/18 14:51, Stefan Berger wrote:
To support SeaBIOS as well, we would have to be able to distinguish a BIOS from the UEFI on the QEMU level so that we could produce different ACPI
Yes and no,
(no SMI and different OperationRegion than 0xFFFF 0000 for SeaBIOS),
"yes" with regard to the SMM difference, "no" with regard to the operation region. We have an ACPI linker/loader command that makes the firmware basically just allocate memory, and we have two other ACPI linker/loader commands that (a) patch the allocation address into other ACPI artifacts, (b) return the allocation address to QEMU (for device emulation purposes), if necessary.
I thought about allowing the firmware to configure the memory region to use for the PPI interface. UEFI would say 0xFFFF 0000, SeaBIOS would choose some other area (0xFEF4 5000). Does the ACPI patcher handle this case or does the address patching have to be set up while building the tables in QEMU? If latter, then we would have to know in QEMU whether it's going to be BIOS or UEFI as firmware. I have tried a lot of things in the recent past, but I forgot whether this type of patching is possible.
The ACPI linker/loader commands are typically added to the "linker script" in the very functions that build the ACPI payload.
And, distinguishing the firmwares is not necessary just for this; the point of the firmware-side allocation is that QEMU does not dictate the address. Each firmware is expected to use its own memory allocation service, which in turn will ensure that the runtime OS stays away from the allocated area. So the allocation address is ultimately determined by the firmware.
The other two commands make the firmware patch the actual allocation address (whatever it may be) into other ACPI artifacts, and make the firmware pass the allocation address (whatever it may be) back to QEMU.
My operating knowledge about the TPM had been that
Components measure stuff into PCRs, and if any untrusted agent messes with those measurements, for example by directly writing to the PCRs, then the TPM will simply not unseal its secrets, hence such tampering is self-defeating for those agents.
While this might be correct (I hope it is correct!), the *PPI* part of TPM appears entirely different. In fact I don't have the slightest idea *why* PPI is lumped together with the TPM.
The physical presence interface allows *automation of TPM operations and changing the TPM's state* (such as clearing all keys) that are typically only possible via interaction with the TPM menu in the firmware. Think of it as some TPM operations that can only run successfully while the system runs the firmware. Once the firmware has given control to the next stage (bootloader, kernel), these operations are not possible anymore, since the firmware has executed some TPM commands that put the TPM into a state where it won't allow those operations anymore.
OK, but if the OS is allowed to modify this set of "queued operations", then what protection is expected of SMM? Whether you can modify the TPM directly, or queue random commands for it at liberty, what's the difference?
On the OS level, queueing the operation is presumably reserved to the admin.
I am not that familiar with UEFI and who is allowed to run code there and what code it can execute. But UEFI seems to lock the variable that holds the PPI code that tells it what to do after the next reboot. So presumably a UEFI module cannot modify that variable but can only read it (and hopefully not manipulate NVRAM directly). If PPI was implemented through a memory location that the code gets written to, a module could likely do that easily (unless memory protections are set up by UEFI, which I don't know), cause a reset, and have UEFI act on that code.
Can you explain in more detail what the PPI operations are, and why they need protection, from what agents exactly? What is the purported lifecycle of such PPI operations?
With the clearing of the TPM one would lose all keys associated with the TPM. So you don't want some software module to be able to set such a 'code', reset the machine, and have the user lose all keys on the way. The control has to be strongly with the admin.
Where is this barrier erected, between OS and firmware, or between privileged and non-privileged OS user?
Between OS and firmware.
SMM is only relevant if the barrier is expected between OS and firmware; i.e. you want to constrain the OS kernel to a subset of valid operations. If the barrier is between privileged and non-privileged OS user, then the implementation belongs in the OS kernel, since mere users don't have direct hardware access anyway.
Also, to prevent fumbling with the variables, UEFI seems to make the variable read-only.
That seems to imply the barrier is between OS kernel and firmware.
It is.
I am wondering whether a malicious UEFI module could be written that patches the ACPI tables and does what it wants when it comes to these early TPM operations, rather than what the admin wants.
This is a good point, and it applies to more than just ACPI. The answer is that it doesn't matter what *any* OS level code does -- as long as the barrier is expected between OS and firmware --, because the SMM code in the firmware must perform *complete* validation / verification of the request.
Another example is the UEFI runtime variable services. In the SMM_REQUIRE build of OVMF, those services are split to two privilege levels, "runtime DXE driver" and "SMM driver". The runtime DXE driver layer provides the OS with the UEFI interface, but internally it only formats a request buffer (serializes the variable operation), and raises an SMI. Once in SMM, the "SMM driver" layer de-serializes and verifies the request, and performs it if it's valid. If the OS messes with the "runtime DXE driver" half of the service (because it can -- that layer lives in simple system memory), the worst the OS can do is submit a crafted request buffer to the SMM half. The SMM half in turn *always* has to evaluate the request buffer as if it came from a malicious agent (an attacker). In other words, the "runtime DXE driver" half is just a convenience for the OS, for preparing a well-formed request buffer. (Which can still be rejected, of course, if the request doesn't pass higher-level authentication and such).
The same applies to your example. The queued PPI operations must be entirely validated in SMM; the ACPI code for formatting / submitting them is just a convenience. If the trust is based in the ACPI code, then the security model is busted. This is why I ask above, 'if the OS is allowed to modify this set of "queued operations", then what protection is expected of SMM?'
The standard implies that ACPI is used for passing the parameters from OS to the firmware:
https://trustedcomputinggroup.org/tcg-physical-presence-interface-specificat...
Again, we have to see where the barrier is, between OS and firmware, or between OS-level users:
- In both cases, 3rd party UEFI apps / driver are considered equally privileged to the OS kernel;
- in the OS<->firmware barrier case, SMM is required, and UEFI apps and the OS kernel are similarly restricted to submitting requests to SMM, and all the business verification belongs in SMM,
So SMM can verify whether the parameters it gets are valid. Whether the user actually wanted to set operation 0 while the ACPI code submitted 5 (Clear TPM) would be a matter of verifying the ACPI code that's in between. Is an attack via ACPI manipulation through some UEFI module possible?
- in the "barrier between OS-level users" case, SMM is not needed; UEFI apps and the OS kernel are equally allowed to access hardware directly, and non-privileged users are restricted by the OS kernel only.
Its ACPI code would just enter SMM and instruct to clear the TPM upon reboot whenever invoked. If that's possible then we may have 'only' moved the problem from the secured UEFI variable to patching ACPI code, which is more work of course. (We need signed ACPI tables...)
Right; if you want to prevent UEFI apps (equivalently, the OS kernel) from queueing such a "zap TPM" operation, then SMM is required, *and* the code running in SMM needs *some* mechanism to authenticate the request (beyond checking for well-formedness).
For example, regarding UEFI variables that are related to Secure Boot, variable update requests (from the OS or 3rd party UEFI apps/drivers) are verified (in SMM) by checking digital signatures on those requests.
... Judged purely from the *name* of the feature, "Physical Presence Interface", I think the idea is that a physically present user is allowed to issue / queue a "zap TPM" request. The question then becomes how you define "physically present".
There are operations in the PPI that set and clear flags that either enable prompting or disable prompting for the execution of a certain TPM operation by the firmware. For full automation and eliminating user interaction entirely one would issue the code(s) to disable the prompts (that's what PHYSICAL_PRESENCE_FLAGS_VARIABLE in EDK2 is for), zap the TPM, then possibly re-enable the prompts. A couple of reboots are necessary.
FYI: 'Physical presence' meant that a user had to be present at the physical keyboard at a certain stage after machine start and then, when entering the firmware, could run certain TPM operations that are enabled at that point but are also only possible while in the firmware. When doing this via a remote screen, the firmware may not support running these commands since the physical keyboard or some other button wasn't touched. Now, with the PPI automation, holding a key or button may not be necessary anymore.
OVMF currently equates all UEFI-level code with a user being physically present; see the UserPhysicalPresent() implementation in "OvmfPkg/Library/PlatformSecureLib/PlatformSecureLib.c". (The PlatformSecureLib class "Provides a platform-specific method to enable Secure Boot Custom Mode setup", which is not precisely your use case, but similarly privileged.) I wouldn't know how to define physical presence otherwise, in virtual firmware. If you can figure out a way to deduce physical presence in the guest *OS*, then we can say the barrier is between OS-level users, not between OS and firmware, and then SMM is not needed.
What do we want to *use* PPI for? What agents should *not* be allowed to queue a "zap TPM" operation?
On the OS level it must remain a privileged operation of an admin to issue these PPI codes. That it is a privileged operation is implemented by the OS and I don't think we need to do anything. What we would want to prevent is abuse by a module that the firmware executes for example. I think this is the driving force for a UEFI variable and the fact that it's being locked (and later on unlocked so SMM mode can write to it ?)
As for the use case, I would say it's automation on the OS level. From that perspective its support could probably be deferred, which may eliminate at least the SMM part. However, UEFI uses the PPI mechanisms itself to issue certain commands when interacting with its menu. I am not sure whether SMM code is involved here... but to be able to use UEFI and TPM 2, the PPI part needs to be there at least for the UEFI support, otherwise the menu items one gets won't do anything. [The question is: does UEFI execute ACPI or write directly to the UEFI variable? My guess is the latter.]
(Sorry if this email is too long and confusing! I'm confused.)
Me too. I am not clear on specifics in UEFI, such as the memory protections set up while a module is running in UEFI. Is NVRAM protected from overwrite? Who can run a module in UEFI? Does it need to be signed?
Thanks Laszlo
Stefan
On 02/07/18 17:44, Stefan Berger wrote:
On 02/07/2018 10:50 AM, Laszlo Ersek wrote:
OK, but if the OS is allowed to modify this set of "queued operations", then what protection is expected of SMM? Whether you can modify the TPM directly, or queue random commands for it at liberty, what's the difference?
On the OS level, queueing the operation is presumably reserved to the admin.
I am not that familiar with UEFI and who is allowed to run code there and what code it can execute. But UEFI seems to lock the variable that holds the PPI code that tells it what to do after the next reboot. So presumably a UEFI module cannot modify that variable but can only read it (and hopefully not manipulate NVRAM directly). If PPI was implemented through a memory location that the code gets written to, a module could likely do that easily (unless memory protections are set up by UEFI, which I don't know), cause a reset, and have UEFI act on that code.
This makes sense... but then it doesn't make sense :)
Assume that the variable is indeed "locked" (so that random UEFI drivers / apps cannot rewrite it using the UEFI variable service). Then,
- if the lock is enforced in SMM, then the variable will be locked from the OS as well, not just from 3rd party UEFI apps, so no PPI operations can ever be queued,
- if the lock is "simulated" in ACPI or in non-SMM firmare code (= in the "runtime DXE driver" layer), then the lock can be circumvented by both 3rd party UEFI apps and the OS.
Again, we have to see where the barrier is, between OS and firmware, or between OS-level users:
- In both cases, 3rd party UEFI apps / driver are considered equally
privileged to the OS kernel;
- in the OS<->firmware barrier case, SMM is required, and UEFI apps and
the OS kernel are similarly restricted to submitting requests to SMM, and all the business verification belongs in SMM,
So SMM can verify whether the parameters it gets are valid. Whether the user actually wanted to set operation 0 while the ACPI code submitted 5 (Clear TPM) would be a matter of verifying the ACPI code that's in between. Is an attack via ACPI manipulation through some UEFI module possible?
Yes, it is possible.
There are dedicated UEFI (and PI -- "platform init") services for installing new ACPI tables, and even for locating and parsing -- albeit in a *very* cumbersome way -- existing ACPI tables (AML too). Once the right ACPI objects are found, they can be overwritten.
On the OS level it must remain a privileged operation of an admin to issue these PPI codes. That it is a privileged operation is implemented by the OS and I don't think we need to do anything. What we would want to prevent is abuse by a module that the firmware executes for example. I think this is the driving force for a UEFI variable and the fact that it's being locked (and later on unlocked so SMM mode can write to it ?)
This unlocking intrigues me. Assuming it happens in SMM, I have no idea how the implementation tells apart the requestors (3rd party UEFI app vs. OS).
As for the use case, I would say it's automation on the OS level. From that perspective its support could probably be deferred, which may eliminate at least the SMM part. However, UEFI uses the PPI mechanisms itself to issue certain commands when interacting with its menu. I am not sure whether SMM code is involved here... but to be able to use UEFI and TPM 2, the PPI part needs to be there at least for the UEFI support, otherwise the menu items one gets won't do anything. [The question is: does UEFI execute ACPI or write directly to the UEFI variable? My guess is the latter.]
I'm sorry, I'm out of my depth here. Can we re-have this discussion on edk2-devel? (A bit later though, please, because currently I'm unable to send email to edk2-devel. The 01.org list server recently dislikes something about my emails and keeps rejecting them.)
(Sorry if this email is too long and confusing! I'm confused.)
Me too. I am not clear on specifics in UEFI, such as the memory protections set up while a module is running in UEFI. Is NVRAM protected from overwrite?
Only SMRAM and pflash (aka NVRAM aka UEFI variables, on QEMU anyway) are protected from direct hardware write.
Whether the write to SMRAM/pflash hardware comes from the OS or a 3rd party UEFI app is irrelevant, both are prevented; only code running in SMM is permitted write access.
Furthermore, it is irrelevant whether the OS or a 3rd party UEFI app is the one that submits a request into SMM. If the request buffer passes validation, then SMRAM and/or pflash (as appropriate) are updated. This is to say that only the *data* in the request determine success vs. failure; the "origin" of the request is unprovable and means nothing.
Who can run a module in UEFI?
If you have write access to the EFI system partition, or can plug a PCI card in the system, you can run UEFI code (dependent on Secure Boot and signing the UEFI binary).
Does it need to be signed ?
It depends on the SB configuration, but either way, a correct signature does not grant access to SMRAM or pflash. Just because SB allows a binary to be executed, the binary still has to submit a valid request buffer to SMM, for modifying SMRAM or pflash.
Thanks Laszlo
Hi
On Wed, Feb 7, 2018 at 6:21 PM, Laszlo Ersek lersek@redhat.com wrote:
On 02/07/18 17:44, Stefan Berger wrote:
On 02/07/2018 10:50 AM, Laszlo Ersek wrote:
OK, but if the OS is allowed to modify this set of "queued operations", then what protection is expected of SMM? Whether you can modify the TPM directly, or queue random commands for it at liberty, what's the difference?
On the OS level, queueing the operation is presumably reserved to the admin.
I am not that familiar with UEFI and who is allowed to run code there and what code it can execute. But UEFI seems to lock the variable that holds the PPI code that tells it what to do after the next reboot. So presumably a UEFI module cannot modify that variable but can only read it (and hopefully not manipulate NVRAM directly). If PPI was implemented through a memory location that the code gets written to, a module could likely do that easily (unless memory protections are set up by UEFI, which I don't know), cause a reset, and have UEFI act on that code.
This makes sense... but then it doesn't make sense :)
Assume that the variable is indeed "locked" (so that random UEFI drivers / apps cannot rewrite it using the UEFI variable service). Then,
- if the lock is enforced in SMM, then the variable will be locked from the OS as well, not just from 3rd party UEFI apps, so no PPI operations can ever be queued,
- if the lock is "simulated" in ACPI or in non-SMM firmware code (= in the "runtime DXE driver" layer), then the lock can be circumvented by both 3rd party UEFI apps and the OS.
Regarding security of PPI pending operations, the spec clearly says "The location for tracking the pending PPI operation, including the tracking of necessary PLATFORM RESET operations, does not need to be a secure or trusted location." (9.9 p.32)
I assume this is because the user has to confirm the pending operation in the pre-OS console, so if some attacker wanted to clear the TPM, the user would have to confirm it (the same goes for the other operations that manipulate flags). That may not be the best security design, but at least the user could be in control.
How do the UEFI runtime variable services verify the authenticity of the requests? Or do they only check a request's validity?
thanks