OK. Non-inline function calls are trickier than I expected. My intuition said I just needed to tweak the generated intermediate code and I could reuse all of my optimizing back end. Instead, in almost all cases my back end looks at the generated code, declares it spaghetti, and refuses to compile it.
However, I did manage a lot of good general cleanups. I believe I have fixed the problem with reading ints instead of bytes from structures. I have also optimized the code somewhat, so sub-word-size loads can now go into any of the 8 general-purpose registers.
I have converted practically all the compile-time options into run-time switches, so the code is a little cleaner and more configurable.
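As a rough illustration of that kind of conversion (a sketch only: the option names, the switch spellings, and the `parse_options` helper are made up for this example and are not the compiler's actual switches), a setting that used to be an `#ifdef` can become a field in a struct that is filled in from the command line:

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical option set; the names are illustrative only. */
struct options {
    bool optimize;    /* previously something like #ifdef OPTIMIZE   */
    bool debug_dump;  /* previously something like #ifdef DEBUG_DUMP */
};

/*
 * Fill in *opts from the command line.
 * Returns 0 on success, -1 if an unknown switch is seen.
 */
int parse_options(int argc, char **argv, struct options *opts)
{
    /* Defaults stand in for what the old compile-time build selected. */
    opts->optimize = true;
    opts->debug_dump = false;

    /* argv[0] is the program name, so start at 1. */
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--no-optimize") == 0) {
            opts->optimize = false;
        } else if (strcmp(argv[i], "--debug-dump") == 0) {
            opts->debug_dump = true;
        } else {
            fprintf(stderr, "unknown switch: %s\n", argv[i]);
            return -1;
        }
    }
    return 0;
}
```

The rest of the code then tests `opts.optimize` with an ordinary `if` instead of being wrapped in `#ifdef` blocks, so one binary covers every configuration.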
Before I return to working on non-inlining, I need to step back and think some more about which technique will work best: either improving my code analysis algorithms, or compiling the functions separately and then inserting calls. There are pluses and minuses to both techniques.
Hopefully I will be a little more active now that I have come to a stopping place.
Eric