Hi,
I'm using flashrom to program an N25Q128E flash with an FT2232H chip (from Linux in a virtual machine; I hope the virtual machine does not distort the results too much).
As I found the programming time quite long, I did some research.
Here are some suggestions that I think are worth discussing (I started with a programming time of 5:15 min and got down to 1:44 min):
1) Proposal: in ft2232_spi.c change the FTDI write chunk size to 269
###### CODE Start #####
	//Simon Buhrow: changed from 256 to 269 (1 Byte Flash CMD + 3 Byte Address + 256 Byte Flash-Data + 9 Byte FTDI-CMD)
	if (ftdi_write_data_set_chunksize(ftdic, 269)) {
		msg_perr("Unable to set chunk size (%s).\n", ftdi_get_error_string(ftdic));
	}
###### CODE End #####
Background: Measurements with the oscilloscope show that the transmission of one 256-byte page-write packet takes about 4 ms.
Please see spi_rpd_flashrom_bad_page_write_01.png for that (blue = SCK, green = nCS). The upper trace shows the whole measurement, the lower one a zoom.
The zoomed part first shows a 2000-bit (= 250-byte) burst, followed by a 10-byte burst (together 260 bytes = 1 byte flash cmd + 3 bytes address + 256 bytes flash data).
The chunk size set with ftdi_write_data_set_chunksize() applies to all bytes sent to the FTDI chip, not only to the raw flash data. So only the first 256 of the actual 269 bytes
(6 bytes FTDI start cmd + 1 byte flash cmd + 3 bytes address + 256 bytes flash data + 3 bytes FTDI cmd) are sent to the FTDI chip in the first packet.
That is why only 250 bytes go out in the first SPI burst (the 6 bytes of FTDI commands come first, followed by the flash cmd, address and raw data) and the remaining 10 bytes in a second SPI burst.
With the chunk size changed as proposed above, the transmission of a 256-byte page-write packet takes only about 80 µs.
Please see spi_rpd_flashrom_good_page_write.png for validation.
I did not check the USB communication (for efficiency of packet sizes).
But as you can see, at the SPI level it looks much better, and the total programming time was (in my case) about 30 seconds shorter.
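To make the byte accounting above explicit, here is a small sketch in C. The byte counts are the ones given above; the enum and function names are made up for illustration and are not part of flashrom or libftdi.

```c
#include <assert.h>
#include <stddef.h>

/* Byte counts as described in the text above. */
enum {
	FTDI_START_CMD_BYTES = 6,   /* FTDI commands sent before the SPI payload */
	FLASH_CMD_BYTES      = 1,   /* page program opcode */
	FLASH_ADDR_BYTES     = 3,   /* 24-bit address */
	FLASH_PAGE_BYTES     = 256, /* raw flash data of one page */
	FTDI_END_CMD_BYTES   = 3,   /* FTDI commands sent after the SPI payload */
};

/* Total number of bytes handed to the FTDI chip for one page program. */
static size_t ftdi_bytes_per_page_write(void)
{
	return (size_t)FTDI_START_CMD_BYTES + FLASH_CMD_BYTES + FLASH_ADDR_BYTES +
	       FLASH_PAGE_BYTES + FTDI_END_CMD_BYTES;
}

/*
 * Number of SPI payload bytes (flash cmd + address + data) that fit into
 * the first chunk, given that the FTDI start commands come first.
 */
static size_t spi_bytes_in_first_chunk(size_t chunksize)
{
	const size_t spi_total = (size_t)FLASH_CMD_BYTES + FLASH_ADDR_BYTES +
				 FLASH_PAGE_BYTES;
	size_t room;

	assert(chunksize > (size_t)FTDI_START_CMD_BYTES);
	room = chunksize - (size_t)FTDI_START_CMD_BYTES;
	return room < spi_total ? room : spi_total;
}
```

With a chunk size of 256 this gives 250 SPI bytes in the first chunk, leaving 10 for the second burst as seen on the scope. With 269 all 260 SPI bytes fit into one chunk.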
2) Proposal: add a flashrom CLI parameter to select which block_erasers entry is used
Maybe a similar functionality already exists. But as far as I can see, the different block_erasers given in flashchips.c are always tried in order: if the first entry works, it is used and that's it.
In my case the second erase function is the fastest (as I do not write the whole flash). Of course the workaround is to change the order in flashchips.c, but maybe a more comfortable solution could be implemented?
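A rough sketch of how such a selection could look, in C. Everything here is hypothetical: struct eraser_entry is a simplified stand-in and not flashrom's real block_erasers struct, and the CLI parameter that would supply the requested index does not exist yet.

```c
#include <assert.h>
#include <stddef.h>

#define ERASER_NOT_REQUESTED (-1) /* user gave no index on the command line */
#define ERASER_NONE          (-1) /* no usable eraser found */

/* Simplified stand-in for one block_erasers entry (hypothetical). */
struct eraser_entry {
	unsigned int block_size;         /* erase granularity in bytes */
	int (*erase)(unsigned int addr); /* NULL marks an unused slot */
};

/*
 * Pick the eraser to use. If the user requested an index, use exactly that
 * one or fail. Otherwise keep the behaviour described above and take the
 * first usable entry.
 */
static int select_eraser(const struct eraser_entry *erasers, size_t count,
			 int requested)
{
	size_t i;

	assert(erasers != NULL || count == 0);

	if (requested != ERASER_NOT_REQUESTED) {
		if (requested >= 0 && (size_t)requested < count &&
		    erasers[requested].erase != NULL)
			return requested;
		return ERASER_NONE; /* invalid request: report, do not guess */
	}

	for (i = 0; i < count; i++)
		if (erasers[i].erase != NULL)
			return (int)i;
	return ERASER_NONE;
}

/* Demo data only: a dummy erase function and a three-slot table. */
static int demo_erase(unsigned int addr)
{
	(void)addr;
	return 0;
}

static const struct eraser_entry demo_erasers[] = {
	{ 4096, demo_erase },
	{ 65536, demo_erase },
	{ 0, NULL },
};
```

With the demo table, select_eraser(demo_erasers, 3, 1) picks the second entry, while passing ERASER_NOT_REQUESTED falls back to the first one.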
3) Wish: speed up spi_read_status_register() after programming data
Please look at spi_rpd_flashrom_bad_page_write_01.png again to see that reading the status register takes quite a long time as well (at the beginning 1 byte is sent to the flash and 2 bytes are read from the flash, and then there is just idle time).
As it is called very often, it really matters for the timing.
Looking a bit deeper, I found that libusb_bulk_transfer(), which is called from ftdi_read_data(), is the bottleneck (simply measured with prints).
I did not dig deeper. So I can't say whether the FTDI chip is badly configured or too slow, or whether libusb is not that efficient. A good solution for that would be nice.
As a simple workaround for a shorter programming time I propose, for program operations, not to call spi_read_status_register() but just to wait a short time (1 ms), maybe configurable with a CLI parameter or via data in flashchips.c.
Why? I think spi_read_status_register(), or rather its caller spi25.c:spi_poll_wip(), does not have any benefit in my case.
It only checks for WIP. But as you can see on the scope image, the check happens several ms after the page program. At least my flash takes much less than 1 ms for a page program.
Replacing the spi_poll_wip() call for all program operations (not erase operations, as they take longer) with a 1 ms programmer_delay() reduced my programming time by well over a minute!
Here is what I did: I added a parameter to spi_write_cmd() to indicate whether spi_poll_wip() shall be called or not:
###### CODE Start #####
static int spi_write_cmd(..., const unsigned int ignore_spi_poll_wip)
{
	...
	if (ignore_spi_poll_wip == 0) {
		status = spi_poll_wip(flash, poll_delay);
	} else {
		programmer_delay(1000);
	}
}
...
spi_nbyte_program(struct flashctx *flash, unsigned int addr, const uint8_t *bytes, unsigned int len)
{
	...
	return spi_write_cmd(..., 1); // Simon Buhrow: added ,1 to indicate to not call read_status_register
}
###### CODE End #####
Maybe there are some ideas on how to make this more general and cleaner?
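One possible direction, purely as a sketch in C: replace the 0/1 flag with a named wait policy plus a delay value, so that the caller (or flashchips.c, or a CLI parameter) states how to wait. All names below are hypothetical, and the two callbacks only stand in for spi_poll_wip() and programmer_delay() so that the sketch does not depend on flashrom internals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: how to wait for a program/erase command to finish. */
enum spi_wait_policy {
	SPI_WAIT_POLL_WIP,    /* poll the status register until WIP clears */
	SPI_WAIT_FIXED_DELAY, /* sleep for a fixed time, no status read */
};

struct spi_wait_cfg {
	enum spi_wait_policy policy;
	unsigned int delay_us; /* poll interval or fixed delay, in microseconds */
};

/* Stand-ins for spi_poll_wip() and programmer_delay(). */
struct spi_wait_ops {
	int (*poll_wip)(void *ctx, unsigned int poll_delay_us);
	void (*delay)(void *ctx, unsigned int usecs);
};

/* Wait according to the configured policy; returns 0 on success. */
static int spi_wait_done(const struct spi_wait_cfg *cfg,
			 const struct spi_wait_ops *ops, void *ctx)
{
	assert(cfg != NULL && ops != NULL);

	switch (cfg->policy) {
	case SPI_WAIT_POLL_WIP:
		return ops->poll_wip(ctx, cfg->delay_us);
	case SPI_WAIT_FIXED_DELAY:
		ops->delay(ctx, cfg->delay_us);
		return 0;
	}
	return -1; /* unknown policy */
}

/* Demo stubs that only record what was called. */
struct demo_log {
	unsigned int polls;
	unsigned int slept_us;
};

static int demo_poll_wip(void *ctx, unsigned int poll_delay_us)
{
	struct demo_log *log = ctx;

	(void)poll_delay_us;
	log->polls++;
	return 0;
}

static void demo_delay(void *ctx, unsigned int usecs)
{
	struct demo_log *log = ctx;

	log->slept_us += usecs;
}

/* Run one wait with the demo stubs and return what was recorded. */
static struct demo_log demo_run(enum spi_wait_policy policy,
				unsigned int delay_us)
{
	struct demo_log log = { 0, 0 };
	const struct spi_wait_cfg cfg = { policy, delay_us };
	const struct spi_wait_ops ops = { demo_poll_wip, demo_delay };

	(void)spi_wait_done(&cfg, &ops, &log);
	return log;
}
```

Page program would then use SPI_WAIT_FIXED_DELAY with 1000 µs, while erase keeps SPI_WAIT_POLL_WIP, which matches the change described above without a bare 0/1 argument.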
What do you think?
Regards,
Simon
Signed-off-by: Simon Buhrow <simon.buhrow(a)sieb-meyer.de>