On 11.01.2008 02:38, Peter Stuge wrote:
On Sat, Jan 05, 2008 at 08:52:49PM -0800, ron minnich wrote:
On Jan 5, 2008 5:11 PM, Peter Stuge <peter@stuge.se> wrote:
On Sat, Jan 05, 2008 at 02:44:55PM -0800, ron minnich wrote:
NOT finding a file should be efficient.
Yes.
Where does the delay come from? Can anyone measure the LPC?
I will try to measure for you but remember it is running uncached. Each and every access is a full LPC cycle, which is not really fast.
True. It is definitely LPC access time causing this.
Yes.
On Sun, Jan 06, 2008 at 05:17:31PM +0100, Carl-Daniel Hailfinger wrote:
IIRC the PCEngines alix.1c has parallel flash
Nope. Both the soldered-on flash and obviously the LPC port are LPC.
Even LPC alone is faster than LPC-to-slow-SPI-to-LPC.
For all boards with LPC-to-SPI translation through an IT8716F Super I/O chip, the following applies: My calculations suggest that for optimal reads (4 bytes at a 4 byte boundary) with the IT8716F SPI translation function, we need (1 byte opcode + 3 bytes addr + 4 bytes data) * 8 = 64 clock cycles on the SPI bus alone. With a maximum of 33 MHz for the SPI clock on that particular chip (MUST verify that the chip is set to 33 MHz and not 16 MHz!), this optimal read takes 2 usec. Smaller reads take at least 1.25 usec. On top of the pure SPI bus time we have to add the time to transmit and receive that data over LPC.
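The SPI bus estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the message: the function name and its parameters are made up for illustration, and it counts SPI clocks only, not the LPC overhead. With these numbers a 4-byte read is 64 cycles (about 1.94 usec at 33 MHz, 4 usec at 16 MHz) and a 1-byte read is 40 cycles (about 1.2 usec at 33 MHz, close to the 1.25 usec quoted).

```python
# Sketch of the SPI bus time estimate. The function name and parameters are
# illustrative and not taken from any IT8716F documentation.

def spi_read_time_us(data_bytes, spi_clock_mhz=33.0):
    """Return (clock cycles, microseconds) for one SPI read command."""
    opcode_bytes = 1   # read opcode
    addr_bytes = 3     # 24-bit address
    cycles = (opcode_bytes + addr_bytes + data_bytes) * 8  # 8 clocks per byte
    return cycles, cycles / spi_clock_mhz

# Compare the two possible clock settings mentioned in the message.
for clock in (33.0, 16.0):
    for n in (1, 4):
        cycles, usec = spi_read_time_us(n, clock)
        print(f"{n} byte(s) at {clock:.0f} MHz: {cycles} cycles, {usec:.2f} usec")
```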
Say 5 us per byte. That's rather slow for us, yes.
Hopefully that's the worst case. Still horrible if we use 1 MByte ROMs - 5 seconds for reading the whole ROM is unacceptable.
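For scale, here is a rough Python sketch of the whole-ROM read time. It uses the 5 us per byte guess from the reply above and, as a lower bound, the pure SPI bus time for aligned 4-byte reads at 33 MHz. The function names are illustrative, and 1 MByte is taken as 1024 * 1024 bytes, which is an assumption. Under these assumptions the pessimistic figure is about 5.2 seconds, and the SPI clocks alone already cost about half a second.

```python
# Rough whole-ROM read time under the figures discussed in the thread.
ROM_BYTES = 1024 * 1024   # assumed meaning of "1 MByte"
US_PER_BYTE = 5.0         # pessimistic per-byte guess from the thread

def rom_read_seconds(rom_bytes=ROM_BYTES, us_per_byte=US_PER_BYTE):
    """Time to read every byte once at a fixed cost per byte."""
    return rom_bytes * us_per_byte / 1e6

def spi_only_seconds(rom_bytes=ROM_BYTES, spi_clock_mhz=33.0):
    """Lower bound: SPI clocks only, using aligned 4-byte reads."""
    reads = rom_bytes // 4
    cycles_per_read = (1 + 3 + 4) * 8   # opcode + address + data, 8 clocks each
    return reads * cycles_per_read / (spi_clock_mhz * 1e6)

print(f"5 us/byte estimate: {rom_read_seconds():.2f} s")
print(f"SPI bus time only:  {spi_only_seconds():.2f} s")
```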
On Mon, Jan 07, 2008 at 10:40:02AM +0100, Carl-Daniel Hailfinger wrote:
Well, we can't use CPU cache, but we can abuse CAR memory as cache
I like this.
I'll try to code up some generic LAR caching extension over the weekend unless someone beats me to it. (Hint, hint!)
- Explicit "stop searching here" marker.
Maybe. But I do not think this is very clean.
Not very clean, but extremely flash-friendly. And in contrast to a "skip n bytes" marker you can add a new LAR member in its place without erasing the marker (at least that's how my implementation performs).
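One way to read "flash-friendly" here: NOR flash programming can only clear bits (1 to 0), and only an erase sets them back to 1. A marker can therefore be replaced in place by a member header only if the header needs no bit that the marker has already cleared. The Python sketch below illustrates that rule. The marker encoding is purely hypothetical (the thread does not describe it), and the 8-byte magic value is used only as an example header prefix.

```python
# Illustration of overwrite-without-erase on NOR flash.
# END_MARKER is a hypothetical encoding, NOT the one from the implementation
# mentioned in the thread.

def can_program_without_erase(old: bytes, new: bytes) -> bool:
    """True if 'new' can be written over 'old' by only clearing bits."""
    return len(old) == len(new) and all((o & n) == n for o, n in zip(old, new))

ERASED = b"\xff" * 8
MAGIC = b"LARCHIVE"                           # example header prefix
END_MARKER = bytes(b | 0x80 for b in MAGIC)   # hypothetical: magic with top bit set

print(can_program_without_erase(ERASED, END_MARKER))  # marker onto erased flash
print(can_program_without_erase(END_MARKER, MAGIC))   # header over the marker
print(can_program_without_erase(MAGIC, END_MARKER))   # the reverse needs an erase
```

A "skip n bytes" marker, by contrast, would have to stay intact for the skip to remain valid, so the space it occupies cannot simply be reused.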
On Mon, Jan 07, 2008 at 08:40:31AM -0800, ron minnich wrote:
I would rather not walk all of empty space ever at any time.
I think the cost of one walk is ok for the flexibility gained.
We have to make sure it is never more than one full walk. Everything else is embarrassing.
I like the end marker best.
I think the runtime-created index is much more fun. It also keeps the lar format very clean.
We can do both. It seems that at least for the runtime-created index/cache we all agree.
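To make the runtime-created index concrete, here is a minimal Python sketch. It is not the LAR code: SlowFlash, find_by_walking and build_index are hypothetical names, and the "flash" is just a list that counts how many member headers were touched. The point is that a failed lookup by walking costs one slow header read per member every time, while the index costs one walk in total and answers all later lookups from RAM.

```python
# Hypothetical model of a LAR-like archive on slow flash. All names are illustrative.

class SlowFlash:
    """Stand-in for uncached flash that counts member-header reads."""
    def __init__(self, member_names):
        self.member_names = list(member_names)  # members in archive order
        self.header_reads = 0

    def walk(self):
        """Yield (name, position) for each member, one slow read per header."""
        for position, name in enumerate(self.member_names):
            self.header_reads += 1
            yield name, position

def find_by_walking(flash, name):
    """Look up a member by scanning the headers on flash."""
    for member_name, position in flash.walk():
        if member_name == name:
            return position
    return None

def build_index(flash):
    """Walk the archive once and keep name -> position in RAM."""
    return {name: position for name, position in flash.walk()}

flash = SlowFlash(["bootblock", "normal/stage2", "normal/payload"])
find_by_walking(flash, "no/such/file")          # touches every header
index = build_index(flash)                      # one more full walk
print(index.get("no/such/file"), flash.header_reads)
```

After build_index, further lookups (hits or misses) no longer change header_reads.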
In practice, the marker is the most efficient method, but it is a bit of an abomination IMO. :p
Big question: Is v3 a matter of pride or speed or user-friendliness? I haven't decided yet. ;-)
How do we handle the integrity issue when a single flash block that contains the start of a lar file is erased at runtime? (Thus breaking a link in the list.)
If we really do erases at LB runtime, the person waiting for the erase surely has the time to rebuild the index/cache (which will probably need a fraction of the time needed for erase).
That re-introduces the walking cost if we want to be able to find any files beyond the erased block.
Maybe both run-time indexing and marker is good? But then there's no point in the marker. So only build the index when a hole is encountered.
Avoiding LPC cycles is always a reason to build an index/cache. L2 cache is orders of magnitude faster than LPC. For n LAR members, a failed lookup will either cost you ~n*20 LPC transactions with 32 bits each or n*2 L2 cache accesses. I'd claim a factor of 1000 speed improvement.
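The claim can be split into two factors. Per member, 20 LPC transactions versus 2 cache accesses is a factor of 10 in access count, so an overall factor of 1000 implies that a single LPC transaction is roughly 100 times slower than a single cache access. The Python sketch below only restates that arithmetic; the function names are illustrative and no latencies were measured.

```python
# Arithmetic behind the "factor of 1000" estimate. No measured latencies here.

def failed_lookup_accesses(n_members, accesses_per_member):
    """Accesses needed to conclude that a file is not in the archive."""
    return n_members * accesses_per_member

def implied_latency_ratio(claimed_speedup=1000.0, lpc_per_member=20, cache_per_member=2):
    """Per-access LPC/cache latency ratio needed to reach the claimed speedup."""
    access_ratio = lpc_per_member / cache_per_member
    return claimed_speedup / access_ratio

n = 10  # example member count, chosen arbitrarily
print(failed_lookup_accesses(n, 20), "LPC transactions vs",
      failed_lookup_accesses(n, 2), "cache accesses")
print("implied per-access latency ratio:", implied_latency_ratio())
```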
Regards, Carl-Daniel