On 2013-Jan-7 14:08, Mark Cave-Ayland wrote:

Ah I see. So for example if on SPARC that was in the middle of a CIF interpret call or similar, then you'd be dropped back to the OpenBoot prompt at this point too? I didn't pick up on that from Artyom's original email.

Yup. Either drop to the ok prompt or fully reset the machine, depending on other settings.

Unfortunately the sad truth is that BootX is written to rely on this (ahem) feature to work if booting from anything that isn't the network:

 0 value screenIH
 0 value cursorAddr
 0 value cursorX
 0 value cursorY
 0 value cursorW
 0 value cursorH
 0 value cursorFrames
 0 value cursorPixelSize
 0 value cursorStage
 0 value cursorTime
 0 value cursorDelay

 ...

 : slw_spin_init
   dup FFFF and to cursorH 10 >> drop
   dup FFFF and to cursorW 10 >> to cursorPixelSize
   dup FFFF and to cursorY 10 >> d# 1000 swap / to cursorDelay
   dup FFFF and to cursorX 10 >> to cursorFrames
   to cursorAddr
   to screenIH
   ['] slw_spin to spin ;

And slw_spin_init is invoked from DrawSplashScreen() like this:

if (gBootFileType != kNetworkDeviceType) {
    SpinInit(0, 0, NULL, 0, 0, 0, 0, 0, 0, 0);
}

The key error here (as I see it) is:

   dup FFFF and to cursorY 10 >> d# 1000 swap / to cursorDelay

Where is that code? It looks like the upper 16 bits of the third argument to "slw_spin_init" are being used as a divisor, and that value should be checked for zero.

I think this code (commenting to relate arguments to lines of code) looks like:

 : slw_spin_init ( arg6 arg5 arg4 arg3 arg2 arg1 -- )
   dup FFFF and to cursorH 10 >> drop                    ( arg6 arg5 arg4 arg3 arg2 )
   dup FFFF and to cursorW 10 >> to cursorPixelSize      ( arg6 arg5 arg4 arg3 )
   dup FFFF and to cursorY 10 >> d# 1000 swap / to cursorDelay ( arg6 arg5 arg4 )
   dup FFFF and to cursorX 10 >> to cursorFrames         ( arg6 arg5 )
   to cursorAddr                                         ( arg6 )
   to screenIH                                           ( )
   ['] slw_spin to spin ;

So it looks like this code depends on arg3 being non-zero. I assume the arguments are coming from right-to-left, which is backwards from other C-to-Forth implementations I've seen (usually the C call is procedure(arg1, arg2, arg3, ...)), but that would not seem to fit here, since arg5 appears to require a pointer. Note that not all architectures use 0 == NULL; some use other values, so I wonder if this is a bug originating from some such usage combined with argument-order confusion.
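
To make the zero check concrete, here is a minimal C model of just the cursorY/cursorDelay line. It is a sketch, not BootX code: the names unpack_arg3 and struct spin_arg3 are made up for illustration, the cell is assumed to be 32 bits, and storing 0 as the delay when the divisor is zero is an arbitrary choice.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical C model of one line of slw_spin_init:
 *   dup FFFF and to cursorY 10 >> d# 1000 swap / to cursorDelay
 * FFFF and 10 are hex in the Forth source, so the shift is 16 bits.
 */
struct spin_arg3 {
    uint32_t cursor_y;     /* low 16 bits of the argument */
    uint32_t cursor_delay; /* 1000 / (high 16 bits of the argument) */
};

/* Returns false, without dividing, when the high 16 bits are zero. */
bool unpack_arg3(uint32_t arg, struct spin_arg3 *out)
{
    uint32_t divisor = arg >> 16;

    out->cursor_y = arg & 0xFFFFu;
    if (divisor == 0) {
        out->cursor_delay = 0; /* assumption: 0 stands for "no delay set" */
        return false;
    }
    out->cursor_delay = 1000u / divisor;
    return true;
}
```

With an argument of 0, which is presumably what the all-zero SpinInit() call ends up passing, this takes the early-return path instead of dividing.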

The better fix would be to check for zero in slw_spin_init. Or are you saying that both the call to SpinInit() and the slw_spin_init Forth code are in BootX, code we aren't allowed to touch? If that's the case, then indeed, a program which deliberately sets up a divide by zero and expects it to work has to be hacked around. To quote from that PPC 8360 article:

"I've inherited legacy code like this before & I feel your pain. You want to shake your fist at the people who installed such bone-headed behavior, but right now shaking your fist doesn't help you ship product. You need a solution. Good luck."
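
If hacking around it is the only option, one shape the workaround could take on the firmware side is a division primitive that returns a defined value for a zero divisor instead of raising an exception. The C below is only a sketch of that idea: tolerant_div is a hypothetical name, 32-bit cells are assumed, and returning 0 for a zero divisor is an arbitrary choice, not something BootX is known to depend on.

```c
#include <stdint.h>

/*
 * Hypothetical tolerant division for a Forth "/" primitive:
 * never traps, whatever the operands are.
 */
int32_t tolerant_div(int32_t dividend, int32_t divisor)
{
    if (divisor == 0)
        return 0; /* assumption: any fixed value would do here */

    /* INT32_MIN / -1 overflows in C, so return the wrapped result. */
    if (dividend == INT32_MIN && divisor == -1)
        return INT32_MIN;

    return dividend / divisor;
}
```

Note that C's / truncates toward zero, while a Forth kernel may use floored division; this sketch does not model that difference.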