On Wed, Sep 29, 2010 at 09:15:11PM -0700, H. Peter Anvin wrote:
On 09/29/2010 08:51 PM, Kevin O'Connor wrote:
On Wed, Sep 29, 2010 at 08:19:33AM -0700, H. Peter Anvin wrote:
That brings up an interesting question. Do you know if instruction prefixes result in slower cpu execution? I know they bloat the code, but it's not been clear to me if there is a speed impact (besides a small cost to insn fetching). Real mode execution is documented to be slow, but it's unclear if using regular 32bit operations (via prefixes) would be even slower, the same, or a little faster.
On older processors they do slow down the CPU, on modern CPUs they generally don't, except for icache footprint, of course.
In general real mode execution is no slower than the corresponding protected mode.
Sorry - I meant: 16bit code segments are documented to be slow, but it's unclear if using regular 32bit instructions in them (via prefixes) would be even slower, the same, or faster.
BTW, I calculate that prefixes represent 12% of the SeaBIOS 16bit code size (4720 of 38920 bytes). I've found that the code built with gcc (using prefixes) was smaller than the code built with bcc, largely due to the optimizations and improved code structure that gcc enabled.
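As a rough illustration of where those bytes come from, here is a minimal sketch in C. The names (prefix_share, add16_in_16bit_seg, add32_in_16bit_seg) are made up for this example and are not taken from the SeaBIOS tree; the only inputs are the byte counts quoted above and the standard x86 operand-size override prefix (0x66).

```c
#include <stddef.h>

/* Override prefixes needed to use 32bit operands/addresses in a 16bit
 * code segment. */
enum { PREFIX_OPSIZE = 0x66, PREFIX_ADDRSIZE = 0x67 };

/* The same ADD opcode and ModRM byte, decoded in a 16bit code segment:
 *   01 d8     -> add %bx,%ax    (16bit operands, no prefix)
 *   66 01 d8  -> add %ebx,%eax  (32bit operands, one extra byte) */
static const unsigned char add16_in_16bit_seg[] = { 0x01, 0xd8 };
static const unsigned char add32_in_16bit_seg[] = { PREFIX_OPSIZE, 0x01, 0xd8 };

/* Share of the code size taken up by prefix bytes, in percent. */
static double prefix_share(unsigned prefix_bytes, unsigned total_bytes)
{
    return 100.0 * prefix_bytes / total_bytes;
}
```

Calling prefix_share(4720, 38920) gives a bit over 12.1, which is where the 12% figure comes from.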
[...]
I worked for a while on a 16-bit gcc backend... I kind of stopped because of perceived lack of interest, and it wasn't fully usable yet, but perhaps SeaBIOS would be enough of a reason. It seemed to produce code about 15% smaller than gcc with prefixes.
[...]
Oh, yes, there is of course also OpenWatcom, which I personally had high hopes for ... and it is quite a good 16-bit compiler, it's just that I've found the OW community to not always have goals compatible with my needs, which of course are for a decent cross-compiler.
I don't think either would be worthwhile just to shrink the 16bit code size.
Granted, the SeaBIOS memory macros (e.g., GET_GLOBAL, GET_FARVAR, GET_FLATPTR) are ugly. However, since SeaBIOS handles real-mode, bigreal-mode, 16bit protected mode, 32bit "flat" mode, 32bit mode with segments, and 32bit segmented PIC mode, I doubt any normal 16bit compiler could handle all of these without macros anyway.
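To show the general idea of hiding the access mode behind a macro, here is a minimal sketch in C. It is not the actual SeaBIOS implementation: every name below (SKETCH_MODE16, sketch_mem, sketch_seg_to_flat, SKETCH_GET_BYTE) is hypothetical, and memory is simulated with an array so the code can run on an ordinary host. The only hardware fact it relies on is the real-mode address calculation, linear = segment * 16 + offset.

```c
#include <stdint.h>

/* Hypothetical build-time switch: define SKETCH_MODE16 to select the
 * segmented access path, leave it undefined for the flat path. */
#ifdef SKETCH_MODE16
#define SKETCH_IS_16BIT 1
#else
#define SKETCH_IS_16BIT 0
#endif

/* Simulated memory covering every address reachable via segment:offset
 * (up to 0xFFFF0 + 0xFFFF). Illustration only. */
static uint8_t sketch_mem[0x110000];

/* Real-mode address translation: linear = segment * 16 + offset. */
static inline uint32_t sketch_seg_to_flat(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Segmented path: the caller supplies segment and offset separately. */
static inline uint8_t sketch_read_far(uint16_t seg, uint16_t off)
{
    return sketch_mem[sketch_seg_to_flat(seg, off)];
}

/* Flat path: the caller supplies a single linear address. */
static inline uint8_t sketch_read_flat(uint32_t addr)
{
    return sketch_mem[addr];
}

/* One spelling for callers; the build mode picks the access path. */
#define SKETCH_GET_BYTE(seg, off)                                  \
    (SKETCH_IS_16BIT ? sketch_read_far((seg), (off))               \
                     : sketch_read_flat(sketch_seg_to_flat((seg), (off))))
```

Callers write SKETCH_GET_BYTE(0xF000, 0x1234) in either build and get the byte at linear address 0xF1234. In real firmware the two paths would compile to quite different instructions, which is the part a plain 16bit compiler cannot choose on its own.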
-Kevin