I've run into an interesting problem doing some "bait and switch" with my emulator.
I have machine code for an x86 program that expects to run in protected mode. It's not an ELF file, so I can't load it directly with elfboot(). What I'm doing for the time being is having elfboot() load Etherboot, with the emulator set to break at the Etherboot entry address. When the breakpoint is hit, I load the machine code into memory and change EIP to its entry point. (Ultimately I'll probably create an ELF wrapper for this code, but I don't have one yet.)
My machine code gets a protection exception when it tries to set one of the segment registers (to the value it already has, BTW). I traced this to the fact that Etherboot was loaded on top of the GDT used by LinuxBIOS.
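To illustrate why a segment register load depends on the GDT contents, here is a minimal sketch in C of how a selector is split into its fields and where the matching descriptor is fetched from. The function names, the selector 0x10, and the base address are assumptions chosen for illustration; the thread does not say which values the failing code used.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an x86 segment selector (the 16-bit value written to CS, DS, ...). */
struct selector_fields {
    uint16_t index; /* bits 15..3: descriptor number within the table */
    uint8_t  ti;    /* bit 2: 0 = GDT, 1 = LDT */
    uint8_t  rpl;   /* bits 1..0: requested privilege level */
};

static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = (uint16_t)(sel >> 3);
    f.ti    = (uint8_t)((sel >> 2) & 1u);
    f.rpl   = (uint8_t)(sel & 3u);
    return f;
}

/* Linear address of the 8-byte descriptor read when `sel` is loaded.
 * This sketch handles GDT selectors only (TI must be 0). */
static uint32_t descriptor_address(uint32_t gdt_base, uint16_t sel)
{
    assert(((sel >> 2) & 1u) == 0);
    return gdt_base + (uint32_t)(sel & ~7u); /* index * 8 */
}
```

With a hypothetical GDT base of 0x1000, loading selector 0x10 reads the descriptor at 0x1010. If a payload has been copied over that address, the load checks whatever bytes now sit there, even when the selector value itself is unchanged.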
We can argue about what kind of assumptions payloads should make about their runtime environment, but it seems to me that being in protected mode without a GDT is a bomb waiting to go off. Some payloads are bound to do things in a sequence that causes an explosion.
Can we move the GDT within the memory LB reports as unusable, say, before the tables?
Steve www.digidescorp.com
On Wed, 3 Aug 2005, Steve Magnani wrote:
> My machine code gets a protection exception when it tries to set one of the segment registers (to the value it already has, BTW). I traced this to the fact that Etherboot was loaded on top of the GDT used by LinuxBIOS.
cute. Now, the great part is, this is the first time we've ever been bitten by this one, yet it is clearly quite a fine, healthy, nasty bug.
OK, we'll have to fix this one :-)
ron
"Steve Magnani" <steve@digidescorp.com> writes:
> I've run into an interesting problem doing some "bait and switch" with my emulator.
> I have machine code for an x86 program that expects to run in protected mode. It's not an ELF file, so I can't load it directly with elfboot(). What I'm doing for the time being is having elfboot load Etherboot, with the emulator set to break at the Etherboot entry address. When the breakpoint is hit I load the machine code into memory and change EIP to its entry point. (Ultimately I'll probably create an ELF wrapper for this code, but I don't have one yet.)
> My machine code gets a protection exception when it tries to set one of the segment registers (to the value it already has, BTW). I traced this to the fact that Etherboot was loaded on top of the GDT used by LinuxBIOS.
Etherboot will load its own GDT a few instructions later.
> We can argue about what kind of assumptions payloads should make about their runtime environment, but it seems to me that being in protected mode without a GDT is a bomb waiting to go off. Some payloads are bound to do things in a sequence that causes an explosion.
Not really. I don't think there are any processor operations that automatically reload segment registers. In fact, segment registers are so useless that in 64-bit mode they are largely ignored. If you want to mess with the segment registers, you should really be loading your own GDT.
The segment values persist in the segment shadow registers, and that is where they matter. It is unfortunate, though, that the registers that matter are write-only.
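The behaviour described here can be sketched as a toy model in C: each segment register has a visible selector and a hidden descriptor copy, and only an explicit load re-reads the GDT. Everything in this model (struct toy_cpu, load_segment, clobber_gdt, the three-entry table) is an assumption made for illustration. It checks only the present bit, whereas a real CPU performs more checks and treats the null selector specially.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GDT_ENTRIES 3

/* Toy model: a GDT in "memory" plus one segment register and its hidden part. */
struct toy_cpu {
    uint64_t gdt[GDT_ENTRIES];  /* descriptors as they currently sit in memory */
    uint16_t visible_selector;  /* the part software can read back */
    uint64_t hidden_descriptor; /* cached copy used for address translation */
};

/* The present bit (P) is bit 47 of a descriptor. */
static bool descriptor_present(uint64_t d)
{
    return (d >> 47) & 1u;
}

/* Model of loading a segment register: the descriptor is re-read from the GDT.
 * Returns false where this simplified model would raise an exception. */
static bool load_segment(struct toy_cpu *cpu, uint16_t sel)
{
    assert(cpu != NULL);
    unsigned index = sel >> 3;          /* TI and RPL are ignored in this sketch */
    if (index >= GDT_ENTRIES)
        return false;                   /* selector beyond the table limit */
    uint64_t d = cpu->gdt[index];
    if (!descriptor_present(d))
        return false;                   /* garbage or zeroed descriptor */
    cpu->visible_selector = sel;
    cpu->hidden_descriptor = d;         /* cached until the next load */
    return true;
}

/* Model of a payload image being copied over the memory that holds the GDT. */
static void clobber_gdt(struct toy_cpu *cpu)
{
    assert(cpu != NULL);
    memset(cpu->gdt, 0, sizeof cpu->gdt);
}
```

In this model, clobbering the table leaves hidden_descriptor untouched, so code keeps running. Only a later load_segment() call fails, even when it is given the selector that is already loaded.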
> Can we move the GDT within the memory LB reports as unusable, say, before the tables?
I looked at the code and we aren't nuking the GDT deliberately. But I am not really comfortable with applications caring about it. At one point I was thinking it might be sane to return to LinuxBIOS, but there don't appear to be any uses for that.
The sane thing for an application to expect in 32-bit protected mode is flat 32-bit segments with a base of 0. That is enough to get things done. If you need something more, I would like to hear why.
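For reference, a flat 32-bit segment with a base of 0 corresponds to a descriptor with a 20-bit limit of 0xFFFFF and 4 KiB granularity. The following is a minimal sketch in C of packing such descriptors; make_descriptor and the macro names are illustrative assumptions, not LinuxBIOS or Etherboot code.

```c
#include <assert.h>
#include <stdint.h>

/* Pack base, 20-bit limit, access byte and 4-bit flags into an
 * 8-byte x86 segment descriptor. */
static uint64_t make_descriptor(uint32_t base, uint32_t limit,
                                uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu);
    assert(flags <= 0xFu);
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);            /* limit bits 15..0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;     /* base bits 23..0   */
    d |= (uint64_t)access << 40;                 /* P, DPL, S, type   */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48; /* limit bits 19..16 */
    d |= (uint64_t)(flags & 0xFu) << 52;         /* G, D/B, L, AVL    */
    d |= (uint64_t)(base >> 24) << 56;           /* base bits 31..24  */
    return d;
}

#define ACCESS_CODE32 0x9Au /* present, ring 0, code, execute/read */
#define ACCESS_DATA32 0x92u /* present, ring 0, data, read/write   */
#define FLAGS_FLAT32  0xCu  /* G=1 (4 KiB units), D/B=1 (32-bit)   */

/* Flat 4 GiB code segment starting at address 0. */
static uint64_t flat_code_descriptor(void)
{
    return make_descriptor(0, 0xFFFFFu, ACCESS_CODE32, FLAGS_FLAT32);
}

/* Flat 4 GiB data segment starting at address 0. */
static uint64_t flat_data_descriptor(void)
{
    return make_descriptor(0, 0xFFFFFu, ACCESS_DATA32, FLAGS_FLAT32);
}
```

These two helpers produce 0x00CF9A000000FFFF and 0x00CF92000000FFFF, the values commonly seen in flat-model GDTs.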
We really can't sanely fix this when you are doing something that seems to have no real justification.
Eric
Eric,
I admit that my current situation is out of the mainstream. But it's uncovered an issue that could affect others doing more mainstream things.
You write:
> The sane thing for an application to expect in 32-bit protected mode is flat 32-bit segments with a base of 0. That is enough to get things done. If you need something more I would like to hear why.
The code I'm having trouble with is defensive in nature. It is attempting to put the processor into a known state, in case some of the segment registers or flags were not initialized in the switch to protected mode. The code fails because LinuxBIOS is leaving the processor in an inconsistent state (ring 0 protected mode with an invalid GDT backing it). I think this demonstrates both the need for defensive programming by payload writers, and the potential pitfalls of LinuxBIOS doing things that are unexpected.
It is always good to ask whether the cost of NOT making a change outweighs the cost of making it. As Ron points out, I'm the first to trip over this (lucky me). If LB continues as it is, will I be the last? No. That's the "any problem you've ever run into is somewhere on the Internet" syndrome.
I can certainly resolve my problem by writing a wrapper that sets up a GDT before branching to the "problem" code. But that only helps me. I'd prefer to have a solution that helps everyone.
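As a rough idea of what such a wrapper would need, here is a minimal sketch in C of the data it would prepare: a three-entry table and the 6-byte operand that LGDT expects. The table contents, the selectors 0x08 and 0x10, and all names are assumptions for illustration. The privileged instructions appear only as a comment, since they can run only at ring 0 on the target.

```c
#include <assert.h>
#include <stdint.h>

/* Operand of the 32-bit LGDT instruction: 16-bit limit, then 32-bit base. */
#pragma pack(push, 1)
struct gdt_pointer {
    uint16_t limit; /* size of the table in bytes, minus one */
    uint32_t base;  /* linear address of the first descriptor */
};
#pragma pack(pop)

/* Hypothetical wrapper table: null descriptor, flat code, flat data. */
static const uint64_t wrapper_gdt[] = {
    0x0000000000000000ULL, /* selector 0x00: mandatory null entry */
    0x00CF9A000000FFFFULL, /* selector 0x08: flat 32-bit code     */
    0x00CF92000000FFFFULL, /* selector 0x10: flat 32-bit data     */
};

static struct gdt_pointer make_gdt_pointer(uint32_t table_address,
                                           unsigned entries)
{
    struct gdt_pointer p;
    assert(entries >= 1 && entries <= 8192);
    p.limit = (uint16_t)(entries * 8u - 1u);
    p.base  = table_address;
    return p;
}

/* Build the LGDT operand for the wrapper's own table, given the linear
 * address at which the wrapper places that table. */
static struct gdt_pointer wrapper_gdt_pointer(uint32_t load_address)
{
    unsigned entries = (unsigned)(sizeof wrapper_gdt / sizeof wrapper_gdt[0]);
    return make_gdt_pointer(load_address, entries);
}

/* On the target, the wrapper would then do roughly the following
 * (ring 0 only, not executed in this sketch):
 *   lgdt  gdt_pointer
 *   ljmp  $0x08, $1f        reload CS from the new table
 *   mov   $0x10, %ax        then copy %ax into %ds, %es, %ss, %fs, %gs
 *   jmp   entry point of the wrapped code
 */
```

The load address passed to wrapper_gdt_pointer() has to be memory that the wrapped code will not overwrite, which is the same constraint under discussion for LinuxBIOS itself.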
The cost of changing LB to maintain a consistent GDT is that one developer has to make the changes to accomplish this, and others probably review the change. That's less work now than it will be in the future, because the issues are fresh.
The cost of NOT changing LB is that N future payload developers discover the problem and have to implement workarounds. The larger part of that cost is probably tracking down what's causing the problem in the first place.
I would argue that while N might be increasing very slowly, eventually the cost to the LB community of NOT fixing the GDT will outweigh the cost of fixing it.
An alternative (but to me, a less desirable one) would be to find some way to make it obvious to payload developers what kind of environment they can count on (i.e. document the GDT issue, making it a 'feature' rather than a 'bug'). You could argue that this e-mail thread does that, but I don't think it will help anyone who hasn't already run into the issue and spent time finding its cause.
Steve