* Segher Boessenkool segher@chello.nl [020628 15:39]:
block byte per byte and use cell size copying for the rest. Don't know whether it's worth the overhead for the typical amount of data copied with move in an OF implementation.
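For reference, a minimal C sketch of the "bytes first, cells for the rest" idea mentioned above. The names `cell_t` and `copy_forward` are made up for illustration, a cell is assumed to be pointer-sized, and the regions are assumed not to overlap; this is not taken from any existing implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uintptr_t cell_t;   /* assumption: one Forth cell is pointer-sized */

/* Hypothetical helper: forward copy of two non-overlapping regions. */
static void copy_forward(unsigned char *dst, const unsigned char *src,
                         size_t len)
{
    assert(len == 0 || (dst != NULL && src != NULL));

    /* Head: single bytes until dst sits on a cell boundary. */
    while (len > 0 && (uintptr_t)dst % sizeof(cell_t) != 0) {
        *dst++ = *src++;
        len--;
    }

    /* Body: one cell per iteration. The fixed-size memcpy() calls keep
     * this well defined even when src is not cell-aligned; compilers
     * usually reduce them to a plain load and store. */
    while (len >= sizeof(cell_t)) {
        cell_t tmp;
        memcpy(&tmp, src, sizeof tmp);
        memcpy(dst, &tmp, sizeof tmp);
        dst += sizeof(cell_t);
        src += sizeof(cell_t);
        len -= sizeof(cell_t);
    }

    /* Tail: whatever is left over, byte by byte. */
    while (len > 0) {
        *dst++ = *src++;
        len--;
    }
}
```

For short blocks the head and tail loops do most of the work, which is the overhead concern raised above.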
Most probably it's worth the effort. Not too much effort, either:
MOVE  --> memmove()
MOVE> --> memcpy()
which just moves the implementation down a layer and speeds things up for host execution. If we want to get this thing into flash, we have to do it ourselves anyway, no matter whether we write it in Forth or use a C/asm-optimized memmove/memcpy.
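A rough sketch of what handing MOVE down to the C library could look like in a C-hosted engine. The data stack, `push`, `pop` and `prim_move` are hypothetical names invented for this example; only the standard stack effect of MOVE ( src dst len -- ) and memmove() itself are given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uintptr_t cell_t;            /* assumption: a cell holds a host pointer */

#define STACK_DEPTH 64
static cell_t dstack[STACK_DEPTH];   /* hypothetical data stack */
static int dsp = 0;                  /* number of items currently on it */

static void push(cell_t v)
{
    assert(dsp < STACK_DEPTH);
    dstack[dsp++] = v;
}

static cell_t pop(void)
{
    assert(dsp > 0);
    return dstack[--dsp];
}

/* MOVE ( src dst len -- ): overlap-safe copy, delegated to memmove(). */
static void prim_move(void)
{
    size_t len = (size_t)pop();
    void *dst = (void *)pop();
    const void *src = (const void *)pop();

    memmove(dst, src, len);
}
```

On the host this costs nothing to write; on a flash target the same primitive needs a freestanding memmove() behind it, which is the point made above.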
Graphics drivers need to be (partly) written in C or asm anyway (esp. the "blitter" parts), so as not to make the system feel sluggish.
Which was the reason why I proposed writing the unaligned words in C.
Is it legal for an FCode program to overload words defined within the lower space (i.e. 0x000-0x5ff)? If not, we can keep this list static and split the array into two parts: one for the static data, and one for the dynamic table that can be changed by user packages (i.e. containing FCode numbers above 0x800). This would really speed up initialization when creating a new package, or when executing an FCode program using byte-load in general.
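To make the split concrete, here is a small C sketch under some assumptions: FCode numbers are 12 bits wide, the boundary is placed at 0x800 (so 0x600-0x7ff is lumped into the static part here), and all names (`xt_t`, `static_table`, `fcode_dynamic_t`, `builtin_a`, `user_word`, ...) as well as the slot 0x010 are invented for illustration. It is not how paflof does it.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*xt_t)(void);          /* hypothetical execution-token type */

#define FCODE_MAX     0x1000u        /* FCode numbers run from 0x000 to 0xfff */
#define FCODE_DYNAMIC 0x800u         /* assumed start of program-defined range */

static void builtin_a(void) {}       /* placeholder for a predefined word */
static void user_word(void) {}       /* placeholder for a package-defined word */

/* Static part: filled in at compile time and shared by all packages.
 * Slot 0x010 is arbitrary; unlisted slots are NULL. */
static const xt_t static_table[FCODE_DYNAMIC] = {
    [0x010] = builtin_a,
};

/* Dynamic part: one per package, covering only the upper range. */
typedef struct {
    xt_t entry[FCODE_MAX - FCODE_DYNAMIC];
} fcode_dynamic_t;

/* Per-package setup touches only the dynamic half of the table. */
static void dynamic_init(fcode_dynamic_t *t)
{
    for (size_t i = 0; i < FCODE_MAX - FCODE_DYNAMIC; i++)
        t->entry[i] = NULL;
}

/* Look up an FCode number in whichever half it belongs to. */
static xt_t fcode_lookup(const fcode_dynamic_t *t, unsigned fcode)
{
    if (fcode >= FCODE_MAX)
        return NULL;
    if (fcode < FCODE_DYNAMIC)
        return static_table[fcode];
    return t->entry[fcode - FCODE_DYNAMIC];
}

/* Define a new FCode; the static range is read-only in this scheme. */
static int fcode_define(fcode_dynamic_t *t, unsigned fcode, xt_t xt)
{
    if (fcode < FCODE_DYNAMIC || fcode >= FCODE_MAX)
        return -1;
    t->entry[fcode - FCODE_DYNAMIC] = xt;
    return 0;
}
```

This only works if overloading the lower range is indeed not allowed, since `fcode_define` rejects it outright. How much time it saves depends on where the per-package initialization time actually goes.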
The FCode table is just (number of bits per cell) x 0.5kB big, and is only needed during package load, so I don't see the problem here?
The "problem" is that it takes paflof about 1.2 seconds per package on a 667MHz 21264 to initialize this table. Not really fatal, but I still want to see that compressed dictionary dumps are smaller than FCode drivers before I consider them the universal solution.
Stefan