Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net writes:
On 30.10.2008 23:33, Marc Jones wrote:
Stefan Reinauer wrote:
On 30.10.2008, at 15:03, Marc Jones marc.jones@amd.com wrote:
I think that this is in all K8 and fam10 disable_car code. The copy to memory is probably good. Running from the ROM and decompressing from the ROM is going to thrash. It may affect some CPU/chipset combinations more than others.
Marc
So what's the reason for that?
off the top of my head:
1. ROM access speed: LPC and/or SPI may be run at different speeds (faster on some platforms).
2. Domains crossed: HT -> maybe PCI/PCIe -> PCI/PCIe -> LPC (fewer is better).
3. Prefetch and code alignment: being a few bytes off can cause prefetching to thrash.
It shouldn't thrash as badly if the ROM is being cached, but I am not sure that it is, or that the caching is set up correctly.
A system using SPI for its ROM at the usual speed without caching has a per-byte latency of ~3 microseconds. That translates to per-instruction latencies of roughly 6 microseconds for simple instructions accessing data in the ROM. That's roughly 500 times slower than an instruction in I-Cache accessing data in RAM.
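The arithmetic behind those figures can be sketched as follows. This is only a rough model, not a measurement: it assumes a simple instruction touches about two uncached ROM bytes (one fetched, one read as data), which is one reading that matches the ~6 microsecond figure. The function names are made up for illustration.

```python
# Rough latency model using the figures quoted in the message above.
# ASSUMPTION: a "simple instruction" touches 2 uncached ROM bytes
# (one instruction byte fetched, one data byte read).

SPI_BYTE_LATENCY_US = 3.0  # ~3 microseconds per uncached byte over SPI


def rom_instruction_latency_us(rom_bytes_touched, byte_latency_us=SPI_BYTE_LATENCY_US):
    """Latency of one instruction when every ROM byte is read uncached."""
    return rom_bytes_touched * byte_latency_us


def implied_cached_latency_ns(uncached_us, slowdown):
    """Cached latency implied by an uncached latency and a slowdown factor."""
    return uncached_us * 1000.0 / slowdown


uncached_us = rom_instruction_latency_us(2)              # 6.0 microseconds
cached_ns = implied_cached_latency_ns(uncached_us, 500)  # 12.0 nanoseconds
print(f"uncached: {uncached_us} us, implied cached: {cached_ns} ns")
```

Under these assumptions, the "500 times slower" figure implies roughly 12 ns for the same instruction running from I-cache against RAM.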
Historically all of this was handled by setting up an MTRR over the ROM chip. I think the type was write-through, and I think the config option was uncompressed ROM size.
So if you are seeing slow decompression performance, I'm guessing you can fix it with just a few config fiddles. But look at the early MTRR setup to be certain.
Eric
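For reference, the values a variable-range MTRR pair would need in order to cover the ROM can be worked out as below. This is a sketch under stated assumptions, not the actual coreboot code: it assumes the ROM is a power-of-two size mapped so that it ends at 4 GiB, a 40-bit physical address width, and the write-through type mentioned above. The 1 MiB size and the function name are hypothetical.

```python
# Compute MTRRphysBase/MTRRphysMask values for a ROM mapped just below 4 GiB.
# ASSUMPTIONS: power-of-two ROM size, ROM ends at 4 GiB, 40-bit physical
# addresses. rom_mtrr_pair() is an illustrative helper, not a coreboot function.

MTRR_TYPE_WRTHROUGH = 4          # x86 memory type encoding for write-through
MTRR_PHYS_MASK_VALID = 1 << 11   # "valid" bit in MTRRphysMask


def rom_mtrr_pair(rom_size, phys_addr_bits=40, mem_type=MTRR_TYPE_WRTHROUGH):
    """Return (phys_base, phys_mask) covering the top rom_size bytes below 4 GiB."""
    if rom_size < 4096 or rom_size & (rom_size - 1):
        raise ValueError("rom_size must be a power of two and at least 4 KiB")
    base = (1 << 32) - rom_size              # ROM ends at the 4 GiB boundary
    addr_mask = (1 << phys_addr_bits) - 1    # limit mask to implemented bits
    phys_base = base | mem_type              # type lives in the low byte
    phys_mask = (~(rom_size - 1) & addr_mask) | MTRR_PHYS_MASK_VALID
    return phys_base, phys_mask


base, mask = rom_mtrr_pair(0x100000)  # hypothetical 1 MiB ROM
print(hex(base), hex(mask))
```

If the configured ROM size is smaller than the region actually executed or decompressed from, part of the ROM stays uncached, which is worth checking when looking at the early MTRR setup.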