Hi,
the slow boot Jason is seeing can be explained by our incomplete and inefficient MTRR setup.
We never use subtractive MTRRs, so if a range is not power-of-two sized, we build it from subranges which are power-of-two sized. For large ranges which are only a little smaller than a power of two, this wastes MTRRs, and in some cases we even run out of them. That's not something to be proud of.
Example: We want to cache 0MB - (2G-64M-64k).
Current setup:
reg00: base=0x00000000 (   0MB), size=1024MB: write-back, count=1
reg01: base=0x40000000 (1024MB), size= 512MB: write-back, count=1
reg02: base=0x60000000 (1536MB), size= 256MB: write-back, count=1
reg03: base=0x70000000 (1792MB), size= 128MB: write-back, count=1
reg04: base=0x78000000 (1920MB), size=  32MB: write-back, count=1
reg05: base=0x7a000000 (1952MB), size=  16MB: write-back, count=1
reg06: base=0x7b000000 (1968MB), size=   8MB: write-back, count=1
reg07: base=0x7b800000 (1976MB), size=   4MB: write-back, count=1
--Here we run out of MTRRs, additionally needed MTRRs follow--
reg08: base=0x7bc00000 (1980MB), size=2048kB: write-back, count=1
reg09: base=0x7be00000 (1982MB), size=1024kB: write-back, count=1
reg10: base=0x7bf00000 (1983MB), size= 512kB: write-back, count=1
reg11: base=0x7bf80000 (1983MB), size= 256kB: write-back, count=1
reg12: base=0x7bfc0000 (1983MB), size= 128kB: write-back, count=1
reg13: base=0x7bfe0000 (1983MB), size=  64kB: write-back, count=1
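For illustration only, the additive split in this example can be reproduced with a short Python sketch. The function name additive_split is made up here and is not existing code; the sketch assumes the cached range starts at address 0, so every power-of-two piece is naturally aligned to its own size.

```python
def additive_split(end):
    """Split the cached range [0, end) into power-of-two sized,
    naturally aligned ranges, largest first.

    Returns a list of (base, size) pairs, one per MTRR.
    """
    ranges = []
    base = 0
    # Start at the highest set bit of the range size and walk down.
    bit = 1 << (end.bit_length() - 1) if end else 0
    while bit:
        if end & bit:
            ranges.append((base, bit))
            base += bit
        bit >>= 1
    return ranges


# Example from this mail: cache 0 .. 2G - 64M - 64k.
end = (2 << 30) - (64 << 20) - (64 << 10)
regs = additive_split(end)
for i, (base, size) in enumerate(regs):
    print(f"reg{i:02d}: base=0x{base:08x}, size={size >> 10}kB: write-back")
```

The number of entries equals the number of set bits in the range size, which is why a range just below a power of two is so expensive with this method.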
We could achieve the same effect with a subtractive setup:
reg00: base=0x00000000 (   0MB), size=2048MB: write-back, count=1
reg01: base=0x7c000000 (1984MB), size=  64MB: uncached, count=1
reg02: base=0x7bff0000 (1983MB), size=  64kB: uncached, count=1
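A rough Python sketch of how such a subtractive layout could be derived, again for illustration only. The name subtractive_split is hypothetical. The sketch assumes the cached range starts at 0 and relies on uncached taking precedence where an uncached MTRR overlaps a write-back one.

```python
def subtractive_split(end):
    """Describe the cached range [0, end) as one write-back range rounded
    up to a power of two, plus uncached holes covering [end, limit).

    Returns a list of (base, size, type) tuples. Requires end > 0.
    """
    limit = 1 << (end - 1).bit_length()  # smallest power of two >= end
    regs = [(0, limit, "write-back")]
    gap = limit - end
    top = limit
    # Carve the holes downward from the power-of-two boundary, largest
    # first, so every hole stays aligned to its own size.
    bit = 1 << (gap.bit_length() - 1) if gap else 0
    while bit:
        if gap & bit:
            top -= bit
            regs.append((top, bit, "uncached"))
        bit >>= 1
    return regs


# Example from this mail: cache 0 .. 2G - 64M - 64k.
end = (2 << 30) - (64 << 20) - (64 << 10)
for i, (base, size, kind) in enumerate(subtractive_split(end)):
    print(f"reg{i:02d}: base=0x{base:08x}, size={size >> 10}kB: {kind}")
```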
However, a subtractive setup is not always more efficient. That means we have to select the best setup type. I devised a slightly tricky algorithm to do that:
1. Check if there are multiple disjoint cached areas in a given power-of-two sized area.
1a. If no, go to step 2.
1b. If yes, stop here. Need advanced setup not described here.
2. additive_count=bitcount(top_cached_addr+1)
3. subtractive_count=bitcount(rounduptonextpowerof2(top_cached_addr)-(top_cached_addr+1))
4. if (additive_count>subtractive_count) go to subtractive_method else go to additive_method
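Steps 2 to 4 can be sketched in Python as follows. This is only an illustration under a few assumptions: top_cached_addr is taken to be the address of the last cached byte, rounduptonextpowerof2 is read as "the next power of two strictly above top_cached_addr", and choose_method is a made-up name. Step 1 (the check for disjoint cached areas) is not covered.

```python
def bitcount(x):
    """Number of set bits in x."""
    return bin(x).count("1")


def choose_method(top_cached_addr):
    """Pick the setup type for a cached range 0 .. top_cached_addr.

    Returns (method, additive_count, subtractive_count).
    """
    size = top_cached_addr + 1
    # Assumed reading of rounduptonextpowerof2(): the smallest power of
    # two strictly greater than top_cached_addr.
    limit = 1 << top_cached_addr.bit_length()
    additive_count = bitcount(size)              # step 2
    subtractive_count = bitcount(limit - size)   # step 3
    if additive_count > subtractive_count:       # step 4
        return "subtractive", additive_count, subtractive_count
    return "additive", additive_count, subtractive_count


# Example from this mail: last cached byte is 2G - 64M - 64k - 1.
print(choose_method(0x7bfeffff))
```

Note that subtractive_count only counts the uncached holes. The subtractive setup needs one more MTRR for the write-back range that covers everything, so in the example above it uses 3 registers in total. Because step 4 switches to the subtractive method only when additive_count is strictly larger, the subtractive setup is chosen only when it needs no more registers than the additive one.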
Regards,
Carl-Daniel