Hello all,
the appended patch is single handedly able to enforce a successful compilation for the mainboard msi/ms6119, given that dump_spd_registers is activated in ms6119/auto.c. Without the patch, romcc aborts compilation due to lack of free processor registers.
A similar patch and some furher reductions have been applied to my work on ms6147, and there produced a bootable image.
I did try to introduce other reductions further into the code base, but I was not successful in getting compilable code until I removed the assignment
device = ctrl->channel0[0];
in favour of explicit calculations in this very file i440bx/debug.c.
Instead of the present patch, I also tried some variations in using a macro to perform the replacement of the original variable 'device' (meant to improve readablility), but some of those caused romcc to get stuck in a seemingly infinite loop, so there is something wrong with the parsing performed by romcc.
Now, I would like to know, if the kind of code that the present patch introduces is at all acceptable by the project. In particular, it is that repeated 'ctrl->channel0[i]' that comes to my mind. The additional if-clause that I have introduced, but not fully activated, does produce a further reduction of register use. Any final patch will also have to touch the code pertaining to the case of two memory channels, a case which is presently deactivated in the repository code base.
Best regards,
Mats Erik Andersson
---
Index: ../src/northbridge/intel/i440bx/debug.c =================================================================== --- ../src/northbridge/intel/i440bx/debug.c (revision 3547) +++ ../src/northbridge/intel/i440bx/debug.c (arbetskopia) @@ -4,29 +4,40 @@ int i; print_debug("\r\n"); for(i = 0; i < 4; i++) { - unsigned device; - device = ctrl->channel0[i]; - if (device) { + /* Repeated calculation of device value + * reduces the need of processor registers. */ + if (ctrl->channel0[i]) { int j; print_debug("dimm: "); print_debug_hex8(i); print_debug(".0: "); - print_debug_hex8(device); + print_debug_hex8(ctrl->channel0[i]); for(j = 0; j < 256; j++) { +#if 1 int status; - unsigned char byte; if ((j & 0xf) == 0) { print_debug("\r\n"); print_debug_hex8(j); print_debug(": "); } - status = spd_read_byte(device, j); + status = spd_read_byte(ctrl->channel0[i], j); if (status < 0) { print_debug("bad device\r\n"); break; } - byte = status & 0xff; - print_debug_hex8(byte); + print_debug_hex8(status & 0xff); +#else /* Further reduced use of processor registers */ + if ((j & 0xf) == 0) { + print_debug("\r\n"); + print_debug_hex8(j); + print_debug(": "); + } + if (spd_read_byte(ctrl->channel0[i], j) < 0) { + print_debug("bad device\r\n"); + break; + } + print_debug_hex8(spd_read_byte(ctrl->channel0[i], j) & 0xff); +#endif print_debug_char(' '); } print_debug("\r\n");
Mats Erik Andersson wrote:
I did try to introduce other reductions further into the code base, but I was not successful in getting compilable code until I removed the assignment
device = ctrl->channel0[0];
in favour of explicit calculations in this very file i440bx/debug.c.
Instead of the present patch, I also tried some variations in using a macro to perform the replacement of the original variable 'device' (meant to improve readablility), but some of those caused romcc to get stuck in a seemingly infinite loop, so there is something wrong with the parsing performed by romcc.
There are a couple of debug defines you can activate in romcc. But keep in mind that romcc indeed takes forever because it can not spill registers to any memory. I've seen romcc run 10 or more minutes before it failed/succeeded compilation.
Now, I would like to know, if the kind of code that the present patch introduces is at all acceptable by the project. In particular, it is that repeated 'ctrl->channel0[i]' that comes to my mind.
There's nothing principally wrong with it, but I suggest that for a chipset as simple as the 440BX, you drop the ctrl structure completely. The original author tried to mimic a scenario that is required on the AMD K8 where you have several memory controllers and the SPD roms are not assigned to fixed addresses. Instead of getting the actual values via indirections and arrays, just assume they're at 0xa0 + 2 * i and you'll be fine.
Also, did you evaluate the possibility of using cache as ram on the system? That would get rid of the register pressure completely because you can use the processor cache as your stack.