It seems that romcc with -mcpu=p3 -O produces incorrect code for cpu/x86/16bit/reset16.inc. -mcpu=p2 -O works fine.
I will try to provide more hard data later, but it looks like the following code is compiled into a (slightly) incorrect jump, which of course breaks things terribly:

    .section ".reset"
    .code16
    .globl reset_vector
    reset_vector:
        .byte 0xe9
        .int _start - ( . + 2 )
        . = 0x8;
    .code32

The jump seems to be off by -4 bytes.
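
For anyone who wants to check the bytes in a dump, here is a rough sketch of the displacement arithmetic behind the hand-coded jump. In 16-bit mode, 0xe9 is a near jump with a 16-bit displacement measured from the end of the 3-byte instruction, which is what `_start - ( . + 2 )` works out to when `.` sits just after the opcode byte. The addresses below (0xFFF0 for reset_vector, 0xF000 for _start) and the helper names are made up for illustration and are not taken from an actual build. The sketch only shows what a -4 error looks like; it says nothing about where the error comes from.

```python
def expected_rel16(start_addr: int, reset_vector_addr: int) -> int:
    """Displacement the assembler expression `_start - ( . + 2 )` should yield."""
    # At the .int directive, '.' is one byte past reset_vector (after 0xe9),
    # so '. + 2' is the address right after the 3-byte jmp rel16.
    dot = reset_vector_addr + 1
    return (start_addr - (dot + 2)) & 0xFFFF


def decode_target(reset_vector_addr: int, code: bytes) -> int:
    """Decode where a 16-bit `jmp rel16` at reset_vector actually lands."""
    if code[0] != 0xE9:
        raise ValueError("not a jmp rel16 opcode")
    rel16 = int.from_bytes(code[1:3], "little")
    # The target is relative to the next instruction and wraps within 64 KiB.
    return (reset_vector_addr + 3 + rel16) & 0xFFFF


if __name__ == "__main__":
    RESET_VECTOR = 0xFFF0  # hypothetical offset within the segment
    START = 0xF000         # hypothetical offset of _start

    good = expected_rel16(START, RESET_VECTOR)
    bad = (good - 4) & 0xFFFF  # a displacement that is off by -4

    for name, rel in (("expected", good), ("off by -4", bad)):
        code = bytes([0xE9]) + rel.to_bytes(2, "little")
        target = decode_target(RESET_VECTOR, code)
        print(f"{name}: rel16={rel:#06x} target={target:#06x}")
```

Feeding the three bytes found at reset_vector in the p2 and p3 images through decode_target should show directly whether the target differs from _start by -4.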