ipxe contains the following snippet:
    /* Copy ROM to image source PMM block */
    pushw   %es
    xorw    %ax, %ax
    movw    %ax, %es
    movl    %esi, %edi
    xorl    %esi, %esi
    movzbl  romheader_size, %ecx
    shll    $9, %ecx
    addr32 rep movsb    /* PMM presence implies flat real mode */
Which copies an image to %edi, with %edi >= 0x10000. This is in accordance with the PMM spec:
"3.2.4 Accessing Extended Memory
This section specifies how clients should access extended memory blocks allocated by the PMM. When control is passed to an option ROM from a BIOS that supports PMM, the processor will be in big real mode, and Gate A20 will be disabled (segment wrap turned off). This allows access to extended memory blocks using real mode addressing.
In big real mode, access to memory above 1MB can be accomplished by using a 32-bit extended index register (EDI, etc.) and setting the segment register to 0000h. The following code example assumes that the pmmAllocate function was just called to allocate a block of extended memory, and DX:AX returned the 32-bit buffer address.
    ; Assume here that DX:AX contains the 32-bit address of our allocated buffer.
    ; Clear the DS segment register.
    push 0000h
    pop ds
    ; Put the DX:AX 32-bit buffer address into EDI.
    mov di, dx              ; Get the upper word.
    shl edi, 16             ; Shift it to upper EDI.
    mov di, ax              ; Get the lower word.
    ; Example: clear the first four bytes of the extended memory buffer.
    mov [edi], 00000000h    ; DS:EDI is used as the memory pointer.
In a similar way, the other segment registers and 32-bit index registers can be used for extended memory accessing."
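The address arithmetic in the two snippets above can be sketched in C. This is only an illustrative model, not code taken from iPXE or the PMM specification, and the helper names (pmm_buffer_address, linear_address, rom_size_bytes) are hypothetical. It shows how DX:AX combine into the 32-bit buffer address, how a zero segment makes the 32-bit offset equal to the linear address, and how the "shll $9" turns the ROM header's length byte (counted in 512-byte blocks) into a byte count.

```c
#include <stdint.h>

/* Hypothetical helper: combine DX (upper word) and AX (lower word)
 * into the 32-bit buffer address, as the PMM example does in EDI. */
static uint32_t pmm_buffer_address(uint16_t dx, uint16_t ax)
{
    return ((uint32_t)dx << 16) | ax;
}

/* Hypothetical helper: real-mode address formation. The segment base
 * is (segment << 4); the offset may be a full 32-bit value when an
 * address-size override is used. With segment 0 the offset is the
 * linear address. */
static uint32_t linear_address(uint16_t segment, uint32_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Hypothetical helper: the option ROM length byte counts 512-byte
 * blocks, so shifting left by 9 gives the size in bytes. */
static uint32_t rom_size_bytes(uint8_t romheader_size)
{
    return (uint32_t)romheader_size << 9;
}
```

For instance, a PMM block returned as DX=0010h, AX=0000h sits at linear address 00100000h, which is only reachable through a zero segment if the offset is allowed to exceed FFFFh.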
So far so good. But the Intel SDM says (20.1.1):
"The IA-32 processors beginning with the Intel386 processor can generate 32-bit offsets using an address override prefix; however, in real-address mode, the value of a 32-bit offset may not exceed FFFFH without causing an exception. For full compatibility with Intel 286 real-address mode, pseudo-protection faults (interrupt 12 or 13) occur if a 32-bit offset is generated outside the range 0 through FFFFH."
Which is exactly what happens here. My understanding of big real mode is that to achieve a segment limit != 0xffff, you must go into 32-bit protected mode, load a segment with a larger limit, and return into real mode without touching the segment. The next load of the segment will reset the limit to 0xffff.
Due to bugs in both qemu tcg and kvm, limit checks are not enforced in real mode, but once these bugs are fixed, the code above will break.
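To make the limit check concrete, here is a minimal C sketch of the test an enforcing CPU or emulator would apply to a string copy. It is a simplified assumption, not the qemu or kvm implementation: the function name copy_within_limit is hypothetical, the limit is treated as the highest valid offset (inclusive), and expand-down segments and 32-bit offset wraparound are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: does a copy of `count` bytes, written to offsets
 * starting at `offset`, stay inside a segment whose highest valid
 * offset is `limit`? */
static bool copy_within_limit(uint32_t offset, uint32_t count, uint32_t limit)
{
    if (count == 0)
        return true;            /* nothing is accessed */

    /* Use 64-bit arithmetic so offset + count cannot wrap around. */
    uint64_t last = (uint64_t)offset + count - 1;
    return last <= limit;
}
```

With a limit of 0xffff, any copy whose destination offset is at or above 0x10000 fails this check; with a 4G limit (0xffffffff) the same copy passes.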
The PMM spec also has this to say (1.3):
"Big Real Mode
Big Real Mode is a modified version of the processor’s real mode with the segment limits changed from 1MB to 4GB. Big real mode allows the BIOS or an Option ROM to read and write extended memory without the overhead of protected mode. The BIOS puts the processor in big real mode during POST to allow simplified access to extended memory. The processor will be in big real mode while the PMM Services are callable."
This is more in line with the Intel spec, and means that the modification to %es must be avoided (and that seabios needs changes to either work in big real mode, or to put the processor back into big real mode after returning from a PMM service).
The whole thing is very unfortunate, as kvm is very slow while in big real mode, on certain processors.
On Sunday 19 Aug 2012 16:07:05 Avi Kivity wrote:
Which is exactly what happens here. My understanding of big real mode is that to achieve a segment limit != 0xffff, you must go into 32-bit protected mode, load a segment with a larger limit, and return into real mode without touching the segment. The next load of the segment will reset the limit to 0xffff.
Not quite. You can't "return into real mode without touching the segment", since part of the process of returning to real mode is to reload the segment registers with real-mode values, and this happens _after_ setting CR0.PE=0.
Whenever CR0.PE=0, loading a segment register with value N will load the literal value (N<<4) into the base address for that segment, without changing the limit. This is the trick that allows flat real mode (aka big real mode) to work; the limit remains at 4G even after loading the segment register with a real-mode value.
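The behaviour described above can be modelled with a small C sketch. This is a simplified illustration under stated assumptions, not real CPU or emulator code: the struct and function names are hypothetical, only %es is modelled, and the protected-mode load takes base and limit directly instead of reading a descriptor from a GDT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of the hidden (cached) part of a segment register. */
struct seg_cache {
    uint32_t base;
    uint32_t limit;
};

struct cpu_model {
    bool pe;                /* models CR0.PE */
    struct seg_cache es;
};

/* Protected-mode load: base and limit both come from the descriptor. */
static void load_es_protected(struct cpu_model *cpu,
                              uint32_t base, uint32_t limit)
{
    cpu->es.base = base;
    cpu->es.limit = limit;
}

/* Real-mode load (CR0.PE=0): only the base is rewritten, to value << 4.
 * The cached limit is deliberately left untouched. */
static void load_es_real(struct cpu_model *cpu, uint16_t value)
{
    cpu->es.base = (uint32_t)value << 4;
}

/* Sequence for entering flat real mode, starting from the reset
 * default limit of 0xffff. */
static struct cpu_model enter_flat_real_mode(void)
{
    struct cpu_model cpu = { .pe = false, .es = { 0, 0xFFFF } };

    cpu.pe = true;                              /* set CR0.PE */
    load_es_protected(&cpu, 0, 0xFFFFFFFFu);    /* 4G-limit descriptor */
    cpu.pe = false;                             /* clear CR0.PE */
    load_es_real(&cpu, 0x0000);                 /* reload with real-mode value */
    return cpu;
}
```

After enter_flat_real_mode(), a further load_es_real() with any value changes the base but the limit in this model stays at 4G, which is the property the iPXE code relies on.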
(and that seabios needs changes to either work in big real mode, or to put the processor back into big real mode after returning from a PMM service.
If seabios switches into protected mode when performing a PMM service, then it _must_ leave the segment limits at 4G when returning to real mode. To do otherwise will violate the PMM spec, and will break conforming clients such as iPXE.
Michael
On 08/19/2012 06:34 PM, Michael Brown wrote:
On Sunday 19 Aug 2012 16:07:05 Avi Kivity wrote:
Which is exactly what happens here. My understanding of big real mode is that to achieve a segment limit != 0xffff, you must go into 32-bit protected mode, load a segment with a larger limit, and return into real mode without touching the segment. The next load of the segment will reset the limit to 0xffff.
Not quite. You can't "return into real mode without touching the segment", since part of the process of returning to real mode is to reload the segment registers with real-mode values, and this happens _after_ setting CR0.PE=0.
Whenever CR0.PE=0, loading a segment register with value N will load the literal value (N<<4) into the base address for that segment, without changing the limit. This is the trick that allows flat real mode (aka big real mode) to work; the limit remains at 4G even after loading the segment register with a real-mode value.
So I see, from looking at the Xen source. I'll also double-check with bochs. Looks like I'll need to fix kvm not to reset the segment limit when reloading a segment in real mode.
(and that seabios needs changes to either work in big real mode, or to put the processor back into big real mode after returning from a PMM service.
If seabios switches into protected mode when performing a PMM service, then it _must_ leave the segment limits at 4G when returning to real mode. To do otherwise will violate the PMM spec, and will break conforming clients such as iPXE.
This probably works, since iPXE works on kvm on AMD and on Intel processors with "unrestricted guest" support.
On Sun, Aug 19, 2012 at 04:34:50PM +0100, Michael Brown wrote:
On Sunday 19 Aug 2012 16:07:05 Avi Kivity wrote:
(and that seabios needs changes to either work in big real mode, or to put the processor back into big real mode after returning from a PMM service.
If seabios switches into protected mode when performing a PMM service, then it _must_ leave the segment limits at 4G when returning to real mode. To do otherwise will violate the PMM spec, and will break conforming clients such as iPXE.
SeaBIOS does switch to 32bit mode during PMM calls and does switch to 16bit "big real" mode (segment limits set to 4G) on return.
-Kevin
On Sun, Aug 19, 2012 at 06:07:05PM +0300, Avi Kivity wrote:
ipxe contains the following snippet:
    /* Copy ROM to image source PMM block */
    pushw   %es
    xorw    %ax, %ax
    movw    %ax, %es
    movl    %esi, %edi
    xorl    %esi, %esi
    movzbl  romheader_size, %ecx
    shll    $9, %ecx
    addr32 rep movsb    /* PMM presence implies flat real mode */
Which copies an image to %edi, with %edi >= 0x10000. This is in accordance with the PMM spec:
[...]
So far so good. But the Intel SDM says (20.1.1):
"The IA-32 processors beginning with the Intel386 processor can generate 32-bit offsets using an address override prefix; however, in real-address mode, the value of a 32-bit offset may not exceed FFFFH without causing an exception. For full compatibility with Intel 286 real-address mode, pseudo-protection faults (interrupt 12 or 13) occur if a 32-bit offset is generated outside the range 0 through FFFFH."
I interpreted the above to mean "however, in [normal real-mode where the segment registers are set to 0xffff] real-address mode, the value of a 32-bit offset may not exceed FFFFH without causing an exception"
Which is exactly what happens here. My understanding of big real mode is that to achieve a segment limit != 0xffff, you must go into 32-bit protected mode, load a segment with a larger limit, and return into real mode without touching the segment. The next load of the segment will reset the limit to 0xffff.
No, the segment limit is only changed when the protected mode bit is set and the segment register is loaded. When the protected mode bit is not set, only the segment base changes.
[...]
The PMM spec also has this to say (1.3):
"Big Real Mode
Big Real Mode is a modified version of the processor’s real mode with the segment limits changed from 1MB to 4GB. Big real mode allows the BIOS or an Option ROM to read and write extended memory without the overhead of protected mode. The BIOS puts the processor in big real mode during POST to allow simplified access to extended memory. The processor will be in big real mode while the PMM Services are callable."
This is more in line with the Intel spec, and means that the modification to %es must be avoided (and that seabios needs changes to either work in big real mode, or to put the processor back into big real mode after returning from a PMM service).
The SeaBIOS code is regularly used on a variety of real processors (which do enforce segment limits). This includes several different AMD processors and Intel processors. It has also been tested in the past with other manufacturers (eg, Via). We've never seen an issue with the "big real mode" support.
The whole thing is very unfortunate, as kvm is very slow while in big real mode, on certain processors.
Unfortunately, "big real mode" is a requirement for option roms.
-Kevin
On 08/19/2012 06:44 PM, Kevin O'Connor wrote:
On Sun, Aug 19, 2012 at 06:07:05PM +0300, Avi Kivity wrote:
ipxe contains the following snippet:
    /* Copy ROM to image source PMM block */
    pushw   %es
    xorw    %ax, %ax
    movw    %ax, %es
    movl    %esi, %edi
    xorl    %esi, %esi
    movzbl  romheader_size, %ecx
    shll    $9, %ecx
    addr32 rep movsb    /* PMM presence implies flat real mode */
Which copies an image to %edi, with %edi >= 0x10000. This is in accordance with the PMM spec:
[...]
So far so good. But the Intel SDM says (20.1.1):
"The IA-32 processors beginning with the Intel386 processor can generate 32-bit offsets using an address override prefix; however, in real-address mode, the value of a 32-bit offset may not exceed FFFFH without causing an exception. For full compatibility with Intel 286 real-address mode, pseudo-protection faults (interrupt 12 or 13) occur if a 32-bit offset is generated outside the range 0 through FFFFH."
I interpreted the above to mean "however, in [normal real-mode where the segment registers are set to 0xffff] real-address mode, the value of a 32-bit offset may not exceed FFFFH without causing an exception"
I understood it the same way.
Which is exactly what happens here. My understanding of big real mode is that to achieve a segment limit != 0xffff, you must go into 32-bit protected mode, load a segment with a larger limit, and return into real mode without touching the segment. The next load of the segment will reset the limit to 0xffff.
No, the segment limit is only changed when the protected mode bit is set and the segment register is loaded. When the protected mode bit is not set, only the segment base changes.
That's what I missed. I always understood a segment reload in real mode to reset the limit field, though I had no basis for it. I'll fix kvm not to do this.