Dave Ashley <linuxbios@xdr.com> writes:
These discussions on romcc running out of registers, optimizations, inline and all are surprising. Isn't the point of romcc to just get the system ram up right at the beginning?
Yes.
Then once that is done any compiler (gcc) can be used to write code. Because DRAM configuration, including interpreting the SPD data from the DRAM, is so complicated, people would rather code that in C than in x86 asm. So that's the purpose of romcc?
Yes.
The most common problem is that someone writes a piece of code that romcc cannot fit into the limited set of registers available. That is fairly rare at the moment, but it does happen.
To reduce register pressure, romcc inlines all functions, so there is no need to store a return address anywhere.
Inlining everything means the code generated by romcc can be 3x larger than hand-coded assembly for a similar problem. Last I checked, when using both the SSE and MMX registers on the Opteron port, I had about 8 free registers most of the time, and my average call depth is less than 8. So it looks reasonable to actually store a return address and cut down on register pressure.
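To make the arithmetic above concrete, here is a minimal sketch in C of one way to read it: a handful of spare registers treated as a tiny return-address stack. This is a model only, not romcc code. The names `reg_stack`, `call_enter`, and `call_leave` are hypothetical, and `NUM_SPARE_REGS` is set to 8 simply to mirror the "about 8 free registers" figure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not romcc code: spare registers used as a tiny
   return-address stack. 8 mirrors the "about 8 free registers" figure. */
#define NUM_SPARE_REGS 8

struct reg_stack {
    unsigned long regs[NUM_SPARE_REGS]; /* stand-ins for spare SSE/MMX registers */
    size_t depth;                       /* current call depth */
};

/* Record a return address on entry to a call.
   Returns false when every spare register is already in use. */
static bool call_enter(struct reg_stack *rs, unsigned long return_addr)
{
    if (rs->depth >= NUM_SPARE_REGS)
        return false;
    rs->regs[rs->depth++] = return_addr;
    return true;
}

/* Fetch the most recently stored return address on exit from a call. */
static unsigned long call_leave(struct reg_stack *rs)
{
    assert(rs->depth > 0);
    return rs->regs[--rs->depth];
}
```

In this model a call chain nested up to 8 deep succeeds, and the 9th nested `call_enter` reports failure, which is the case a compiler would have to reject or handle by inlining instead.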
The reason register allocation is hard is that it is an NP-complete problem, so a perfect algorithm would take exponential time in the size of the program if I were to implement one. Given that an O(N^2) algorithm already takes a full minute, using the perfect algorithm is unreasonable, especially since my current heuristic works almost every time. Taking any longer would mess up a programmer's productivity. There is one known corner case where the current heuristic fails that I would like very much to improve.
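For readers who have not seen the problem framed this way, here is a rough sketch in C of the textbook formulation: values that are live at the same time "interfere" and must get different registers, which is graph coloring. This greedy pass is only an illustration of a polynomial-time heuristic, not romcc's actual allocator. The names `greedy_allocate`, `interferes`, and `MAX_VALUES` are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_VALUES 16 /* illustrative limit, not a romcc constant */

/* interferes[a][b] is true when values a and b are live at the same time
   and therefore cannot share a register.
   Assigns each value the lowest-numbered register not used by an
   already-assigned interfering value; reg_of[v] is -1 if none is free.
   Returns the number of values left without a register. */
static int greedy_allocate(int n, bool interferes[MAX_VALUES][MAX_VALUES],
                           int num_regs, int reg_of[MAX_VALUES])
{
    int unallocated = 0;

    assert(n >= 0 && n <= MAX_VALUES);
    for (int v = 0; v < n; v++) {
        reg_of[v] = -1;
        for (int r = 0; r < num_regs; r++) {
            bool taken = false;
            /* Only earlier values have been assigned so far. */
            for (int u = 0; u < v; u++) {
                if (interferes[v][u] && reg_of[u] == r) {
                    taken = true;
                    break;
                }
            }
            if (!taken) {
                reg_of[v] = r;
                break;
            }
        }
        if (reg_of[v] < 0)
            unallocated++;
    }
    return unallocated;
}
```

A fixed-order greedy pass like this is fast but can miss an assignment that exists. With four values where 0 interferes with 2, 2 with 3, and 3 with 1, two registers are enough (0 and 3 share one, 1 and 2 share the other), yet this pass gives values 0 and 1 the same register and then finds none left for value 3. That is the general flavor of a heuristic failing in a corner case; it is not a claim about which case trips up romcc.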
But these are the only problems, and once I have finished exploring them romcc will be pretty much done. I would not work quite as much on them except that I find solving these problems quite fun :)
Eric