First of all, sorry for the very late reply; however, I thought I really ought to offer my perspective on this:
On 11/25/2010 06:24 AM, Kevin O'Connor wrote:
> It's difficult to have a uniform view of the stack when transition modes, so pass the return address in a register. As a result, the transition functions only access memory via the %cs selector now.
I think this assertion is rather unfortunate, because in my own experience with thunking, having access to the real-mode stack is actually very useful.
This is simply accomplished by computing a 32-bit register containing the value (ss << 4) + sp, for example:
        movzwl %sp, %eax
        movl %ss, %ecx
        shll $4, %ecx
        addl %ecx, %eax
This is particularly handy if there is a push/pop of the 16-bit register set in the entry/exit sequence. Furthermore, pushing the target address onto the stack rather than stuffing it into a register allows a 32-bit routine to have full access to the 16-bit register image, whereas burning a register means that that register is going to have to be handled differently.
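For reference, the (ss << 4) + sp computation above can be sketched in C. This is only an illustration of the address arithmetic, not code from Syslinux or SeaBIOS; the names rm_linear and rm_linear_demo are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Linear address of a real-mode seg:off pair: (seg << 4) + off.
 * The segment is widened to 32 bits first so the shift cannot overflow.
 */
uint32_t rm_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Hypothetical self-check: different seg:off pairs can alias one address. */
void rm_linear_demo(void)
{
    assert(rm_linear(0x0000, 0x7c00) == 0x07c00);
    assert(rm_linear(0x07c0, 0x0000) == 0x07c00);
    /* The largest possible result lies just above 1 MiB. */
    assert(rm_linear(0xffff, 0xffff) == 0x10ffef);
}
```

With SS:SP as the inputs, the result is the flat address a 32-bit routine would use to reach the top of the real-mode stack.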
In Syslinux I have this formalized so that the sequence:
        pushl $func32
        callw _pm_call
... turns into the C function call:
    void func32(com32sys_t *regs);
... where com32sys_t is a structure which contains the 16-bit register image:
    typedef struct {
        uint16_t gs;            /* Offset 0 */
        uint16_t fs;            /* Offset 2 */
        uint16_t es;            /* Offset 4 */
        uint16_t ds;            /* Offset 6 */

        reg32_t edi;            /* Offset 8 */
        reg32_t esi;            /* Offset 12 */
        reg32_t ebp;            /* Offset 16 */
        reg32_t _unused_esp;    /* Offset 20 */
        reg32_t ebx;            /* Offset 24 */
        reg32_t edx;            /* Offset 28 */
        reg32_t ecx;            /* Offset 32 */
        reg32_t eax;            /* Offset 36 */

        reg32_t eflags;         /* Offset 40 */
    } com32sys_t;
This is simply the image created on the stack by the sequence (in NASM syntax):
_pm_call:
        pushfd
        pushad
        push ds
        push es
        push fs
        push gs
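As a sanity check, the correspondence between the push sequence and the structure offsets can be verified on a hosted compiler. The sketch below assumes that reg32_t is a 32-bit union with word and byte views; its definition is not shown in this message, so treat that as an assumption. The function name com32sys_layout_ok is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: reg32_t is a 32-bit union; its definition is not shown above. */
typedef union {
    uint32_t l;      /* full 32-bit register */
    uint16_t w[2];   /* 16-bit halves */
    uint8_t  b[4];   /* individual bytes */
} reg32_t;

/* Same member order as the structure in the message. */
typedef struct {
    uint16_t gs, fs, es, ds;                /* pushed last, lowest addresses */
    reg32_t edi, esi, ebp, _unused_esp;     /* pushad image, low to high */
    reg32_t ebx, edx, ecx, eax;
    reg32_t eflags;                         /* pushfd, pushed first */
} com32sys_t;

/* Returns 1 if the compiler's layout matches the commented offsets. */
int com32sys_layout_ok(void)
{
    return offsetof(com32sys_t, gs) == 0
        && offsetof(com32sys_t, fs) == 2
        && offsetof(com32sys_t, es) == 4
        && offsetof(com32sys_t, ds) == 6
        && offsetof(com32sys_t, edi) == 8
        && offsetof(com32sys_t, _unused_esp) == 20
        && offsetof(com32sys_t, eax) == 36
        && offsetof(com32sys_t, eflags) == 40
        && sizeof(com32sys_t) == 44;
}
```

The last item pushed (gs) sits at the lowest address and therefore at offset 0, which is why the structure lists the registers in the reverse of the push order.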
This has been shown to be amazingly versatile, especially since the 16-bit register image can be not just observed but written directly.
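To illustrate what writing the register image looks like from the 32-bit side, here is a minimal sketch of a handler. The handler func32_example and the EFLAGS_CF constant are hypothetical names for this example, and the reg32_t union is again an assumed definition; only the com32sys_t layout comes from the structure above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: reg32_t is a 32-bit union; its definition is not shown above. */
typedef union {
    uint32_t l;
    uint16_t w[2];
    uint8_t  b[4];
} reg32_t;

typedef struct {
    uint16_t gs, fs, es, ds;
    reg32_t edi, esi, ebp, _unused_esp, ebx, edx, ecx, eax;
    reg32_t eflags;
} com32sys_t;

#define EFLAGS_CF 0x00000001u   /* carry flag, bit 0 of EFLAGS */

/*
 * Hypothetical handler: reads the caller's AX from the image, stores
 * twice that value in EAX, and clears CF to signal success. The
 * caller would see the new EAX and flags once the image is popped.
 */
void func32_example(com32sys_t *regs)
{
    uint32_t ax = regs->eax.l & 0xffffu;   /* caller's 16-bit AX */

    regs->eax.l = ax * 2;
    regs->eflags.l &= ~EFLAGS_CF;
}
```

Because the exit path pops the same image back into the registers, no separate return-value convention is needed in a scheme like this.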
One can implement this either with or without a stack switch (without one, the protected-mode ESP is simply computed from SS:SP). However, since real-mode stacks tend to be very small -- often only a few hundred bytes -- running the 32-bit code on the real-mode stack is probably a bad idea.
In Syslinux this is actually implemented in the form of a lower-level function which does indeed take an address in a register, so the two approaches are not mutually exclusive. The actual full implementation of the _pm_call routine looks like this (note: this code assumes CS = 0):
;
; _pm_call: call PM routine in low memory from RM
;
;       on stack        = PM routine to call (a 32-bit address)
;
;       ECX, ESI, EDI passed to the called function;
;       EAX = EBP in the called function points to the stack frame
;       which includes all registers (which can be changed if desired.)
;
;       All registers and the flags saved/restored
;
;       This routine is invoked by the pm_call macro.
;
_pm_call:
        pushfd
        pushad
        push ds
        push es
        push fs
        push gs
        mov bp,sp
        mov ax,cs
        mov ebx,.pm
        mov ds,ax
        jmp enter_pm

        bits 32
        section .textnr
.pm:
        ; EAX points to the top of the RM stack, which is EFLAGS
        test RM_FLAGSH,02h              ; RM EFLAGS.IF
        jz .no_sti
        sti
.no_sti:
        call [ebp+4*2+9*4+2]            ; Entrypoint on RM stack
        mov bx,.rm
        jmp enter_rm

        bits 16
        section .text16
.rm:
        pop gs
        pop fs
        pop es
        pop ds
        popad
        popfd
        ret 4                           ; Drop entrypoint
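The displacement in `call [ebp+4*2+9*4+2]` can be checked by modelling the whole real-mode stack frame in C. This is only a sketch: the type pm_call_frame_t and its member names are invented for the example, and it relies on `#pragma pack`, which GCC, Clang and MSVC accept but which is not standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the real-mode stack as seen from EBP after the
 * pushes in _pm_call, lowest address first. Packed, because the 16-bit
 * return address leaves the entry point unaligned.
 */
#pragma pack(push, 1)
typedef struct {
    uint16_t gs, fs, es, ds;   /* 4 * 2 bytes: segment registers */
    uint32_t gpr[8];           /* 8 * 4 bytes: pushad image */
    uint32_t eflags;           /* 4 bytes: pushfd */
    uint16_t ret_ip;           /* 2 bytes: return address from callw */
    uint32_t entry;            /* 4 bytes: pushl $func32 */
} pm_call_frame_t;
#pragma pack(pop)

/* Offset of the entry point from the bottom of the frame. */
size_t pm_call_entry_offset(void)
{
    return offsetof(pm_call_frame_t, entry);
}
```

The entry point lands at offset 4*2 + 9*4 + 2 = 46, matching the displacement used in the call, and the final `ret 4` pops the 2-byte return address and then discards the 4-byte entry point.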
The entire file including the enter_pm/enter_rm functions can be seen at:
-hpa