I did some local optimizations and have some more info.
I disabled keyboard init in coreboot-v2 - saves 400ms.
I disabled ATA reset in SeaBIOS on first boot.
I exit out of a CBFS search in SeaBIOS if the file signature is zero (CBFS is zero padding instead of ff padding).
This brings the boot time on reset to 1.7 seconds (reset press to grub launch - doesn't include 2.5 seconds at boot menu). Boot from powerup takes a bit longer - 8.9s (7 seconds is for drive spinup).
Another optimization I could make would be to reduce the 90ms it takes to copy the vga rom from flash. Using 4 byte reads instead of 1 byte reads will likely help. However, enabling an mtrr for the flash would likely lead to the best results.
I'm not sure if I'll commit these changes - it is something to think about though.
Logs attached. This setup, for whatever reason, sends a null byte over serial on target machine power-up and on reset press (but not on reset release, so the reset timings are off a little bit). I'm using this to time the rest of the results.
-Kevin
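The early exit from the CBFS search described above could look roughly like the sketch below. It is not the SeaBIOS code. The 64-byte alignment, the "LARCHIVE" signature, and the name `find_next_file` are assumptions made for illustration, and the scan walks fixed alignment steps rather than following file lengths as a real implementation would.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CBFS_ALIGN 64          /* assumed header alignment */
#define CBFS_MAGIC "LARCHIVE"  /* assumed 8-byte file signature */

/* Hypothetical helper: return the offset of the first file header at or
 * after `start` (assumed to be aligned), or -1 if the scan hits an
 * all-zero signature or runs off the end of the image. */
long find_next_file(const uint8_t *flash, size_t size, size_t start)
{
    static const uint8_t zero[8] = {0};

    for (size_t off = start; off + 8 <= size; off += CBFS_ALIGN) {
        if (memcmp(flash + off, CBFS_MAGIC, 8) == 0)
            return (long)off;
        /* Zero signature: treat the rest as padding and stop reading
         * from slow flash. 0xff bytes do not trigger this exit. */
        if (memcmp(flash + off, zero, 8) == 0)
            return -1;
    }
    return -1;
}
```

With zero padding the scan stops at the first empty slot instead of reading every remaining header position from flash.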
On Sun, Apr 19, 2009 at 8:38 AM, Kevin O'Connor <kevin@koconnor.net> wrote:
> I did some local optimizations and have some more info.
> I disabled keyboard init in coreboot-v2 - saves 400ms.
> I disabled ATA reset in SeaBIOS on first boot.
> I exit out of a CBFS search in SeaBIOS if the file signature is zero (CBFS is zero padding instead of ff padding).
You could also only read it once and save the devid and address of ROMs if there are any.
> This brings the boot time on reset to 1.7 seconds (reset press to grub launch - doesn't include 2.5 seconds at boot menu). Boot from powerup takes a bit longer - 8.9s (7 seconds is for drive spinup).
> Another optimization I could make would be to reduce the 90ms it takes to copy the vga rom from flash. Using 4 byte reads instead of 1 byte reads will likely help. However, enabling an mtrr for the flash would likely lead to the best results.
We have been discussing this in another thread. The 4-byte read will make the biggest improvement, and the cache only helps much if the ROM is compressed. You are still bound by the width and speed of the ROM access. LPC/FWH is typically set up to be very slow.
Marc
On Mon, Apr 20, 2009 at 10:17:43AM -0600, Marc Jones wrote:
> On Sun, Apr 19, 2009 at 8:38 AM, Kevin O'Connor <kevin@koconnor.net> wrote:
> > Another optimization I could make would be to reduce the 90ms it takes to copy the vga rom from flash. Using 4 byte reads instead of 1 byte reads will likely help. However, enabling an mtrr for the flash would likely lead to the best results.
> We have been discussing this in another thread. The 4-byte read will make the biggest improvement, and the cache only helps much if the ROM is compressed. You are still bound by the width and speed of the ROM access. LPC/FWH is typically set up to be very slow.
I implemented a 4-byte copy on Sunday night - it dropped the copy time from 90ms to 50ms. It's in the latest SeaBIOS git.
I was hoping that caching flash would allow the processor to do cache-line sized reads. But, you have a good point about the LPC/FWH being a limit. Does anyone know if 50ms is the lower limit on reading 63488 bytes from flash? It still seems high.
-Kevin
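For illustration, a copy loop that issues 32-bit reads instead of byte reads might look like the sketch below. This is not the code committed to SeaBIOS. The name `copy_dwords` and the requirement that both pointers be 4-byte aligned are assumptions of the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy `len` bytes from `src` to `dst`, reading the source 32 bits at a
 * time. Assumes (hypothetically) that both pointers are 4-byte aligned.
 * `volatile` keeps the compiler from merging or splitting the reads. */
void copy_dwords(void *dst, const volatile void *src, size_t len)
{
    uint32_t *d = dst;
    const volatile uint32_t *s = src;
    size_t n = len / 4;

    while (n--)
        *d++ = *s++;

    /* Copy the 0-3 leftover bytes one at a time. */
    uint8_t *db = (uint8_t *)d;
    const volatile uint8_t *sb = (const volatile uint8_t *)s;
    for (size_t i = 0; i < len % 4; i++)
        db[i] = sb[i];
}
```

Each 32-bit read still has to be served by the flash bus, so this reduces per-access overhead on the CPU side without removing the bus limit Marc describes.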
Kevin O'Connor wrote:
> But, you have a good point about the LPC/FWH being a limit. Does anyone know if 50ms is the lower limit on reading 63488 bytes from flash? It still seems high.
Memory Read is the cycle used by LPC chips, 1 byte at a time. Firmware Memory Read is used by FWH chips, 1/2/4/128 bytes at a time. Often only 1.
1 byte LPC needs 21 LPC clocks, 630 ns * 63488 = 39997440 ns = 40 ms.
1 byte FWH needs 19 LPC clocks, 570 ns * 63488 = 36188160 ns = 36 ms.
I think you're as close as you will get without multibyte Firmware Memory Read:
128 byte FWH needs 273 LPC clocks = 8.27 us * (63488/128) = 4.1 ms.
//Peter
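The arithmetic above can be reproduced with a small helper. This is a sketch that assumes a 30 ns LPC clock period (33 MHz, rounded), which is what the 630 ns and 570 ns figures imply. The name `flash_read_ns` is made up for the example. With exactly 30 ns per clock, the 128-byte cycle comes to 8.19 us rather than 8.27 us (the latter corresponds to dividing by 33 MHz exactly), and both round to about 4.1 ms in total.

```c
#include <stdint.h>

#define LPC_CLOCK_NS 30u  /* assumed: 33 MHz LPC clock, rounded to 30 ns */

/* Total time in nanoseconds to read `bytes` bytes when each bus cycle
 * moves `bytes_per_cycle` bytes and costs `clocks_per_cycle` LPC clocks. */
uint64_t flash_read_ns(uint32_t bytes, uint32_t bytes_per_cycle,
                       uint32_t clocks_per_cycle)
{
    /* Round up so a partial final cycle is still counted. */
    uint64_t cycles = (bytes + bytes_per_cycle - 1) / bytes_per_cycle;
    return cycles * clocks_per_cycle * LPC_CLOCK_NS;
}
```

For example, `flash_read_ns(63488, 1, 21)` gives the 1-byte LPC case and `flash_read_ns(63488, 128, 273)` the 128-byte FWH case.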
On Tue, Apr 21, 2009 at 04:00:08AM +0200, Peter Stuge wrote:
> Kevin O'Connor wrote:
> > But, you have a good point about the LPC/FWH being a limit. Does anyone know if 50ms is the lower limit on reading 63488 bytes from flash? It still seems high.
> Memory Read is the cycle used by LPC chips, 1 byte at a time. Firmware Memory Read is used by FWH chips, 1/2/4/128 bytes at a time. Often only 1.
> 1 byte LPC needs 21 LPC clocks, 630 ns * 63488 = 39997440 ns = 40 ms.
> 1 byte FWH needs 19 LPC clocks, 570 ns * 63488 = 36188160 ns = 36 ms.
> I think you're as close as you will get without multibyte Firmware Memory Read:
> 128 byte FWH needs 273 LPC clocks = 8.27 us * (63488/128) = 4.1 ms.
Thanks Peter.
I don't think my epia-cn has FWH. With FWH, will a cache-line fill result in a 128-byte flash read, or does one have to implement special instructions to make use of it?
As a side note, things will also improve a little with lzma compression of the roms.
-Kevin
Kevin O'Connor wrote:
> I don't think my epia-cn has FWH. With FWH, will a cache-line fill result in a 128-byte flash read, or does one have to implement special instructions to make use of it?
> As a side note, things will also improve a little with lzma compression of the roms.
Much to my surprise, the VIA EPIAs tend to use FWH mode flash devices. Take a look at your flash part number. However, there is no way that I am aware of to read the pin-strap on the VIA vt8237 to see whether it is set for LPC or FWH if your device supports dual mode.
-Bari
Kevin O'Connor wrote:
> > I think you're as close as you will get without multibyte Firmware Memory Read:
> Thanks Peter.
> I don't think my epia-cn has FWH.
You're right. EPIA-CN uses LPC flash.
> With FWH, will a cache-line fill result in a 128-byte flash read, or does one have to implement special instructions to make use of it?
That would be at the discretion of the LPC bus master, i.e. the southbridge.
I speculate that no southbridge actually does this.
> As a side note, things will also improve a little with lzma compression of the roms.
Yep, that's a good point.
//Peter