Hi Sirs,
Thank you for your support. I have a serious problem with flashrom 0.9.2 on our server systems. We always have to run a third-party manager application that calls flashrom for BIOS upgrades. However, flashrom fails at the “Verifying flash” step because some of the data programmed into the flash part differs from the golden ROM image file at random offsets. This defect occurs on Red Hat 5 x64, SLES 10 x64 and SLES 11 x64. If I run flashrom manually without the third-party application, it always flashes the SPI part successfully. Because the third-party application is confidential, I cannot provide it to you for reproducing the defect.
Q1: Could the data that flashrom writes to the mapped virtual memory space be overwritten or disturbed by another process of the third-party application before the SPI host controller transfers it to the SPI ROM?
Q2: Could the data that flashrom writes to the mapped virtual memory space not yet be synced to physical memory when the SPI host controller transfers it to the SPI ROM?
Q3: To prevent other processes from interfering with /dev/mem, the mapped virtual memory space or the physical memory space that flashrom uses, do you have any ideas or code snippets for locking them during flashrom operation?
Q4: Do you have any idea how to trace the root cause of the interference with flashrom?
Please share your professional opinion. Thanks!
Verbose message:
Programming flash
done.
COMPLETE.
Verifying flash... VERIFY FAILED at 0x00017ca4! Expected=0xff, Read=0x25, failed byte count from 0x00000000-0x003fffff: 0x3a
Your flash chip is in an unknown state.
Get help on IRC at irc.freenode.net (channel #flashrom) or
mail flashrom@flashrom.org!
-------------------------------------------------------------------------------
DO NOT REBOOT OR POWEROFF!
Syntax:
./flashrom -w 24a.rom
Hardware configuration:
MB: AMD platform solution
South Bridge: AMD SP5100
Flash part: ST M25P32
NOS:
Red Hat 5 x86_64
SLES 10 x86_64
SLES 11 x86_64
Best Regards, Hony Chiang
Hi Hony,
we will help you.
On 11.08.2010 04:27, Hony.Chiang@mic.com.tw wrote:
I have a serious problem with flashrom 0.9.2 on our server systems. We always have to run a third-party manager application that calls flashrom for BIOS upgrades.
Can you tell us the name of the manager application you are using?
However, flashrom fails at the “Verifying flash” step because some of the data programmed into the flash part differs from the golden ROM image file at random offsets. This defect occurs on Red Hat 5 x64, SLES 10 x64 and SLES 11 x64. If I run flashrom manually without the third-party application, it always flashes the SPI part successfully.
This is good. It means that flashrom works fine if nothing else accesses the flash chip.
Because the third-party application is confidential, I cannot provide it to you for reproducing the defect.
Q1: Could the data that flashrom writes to the mapped virtual memory space be overwritten or disturbed by another process of the third-party application before the SPI host controller transfers it to the SPI ROM?
Q2: Could the data that flashrom writes to the mapped virtual memory space not yet be synced to physical memory when the SPI host controller transfers it to the SPI ROM?
flashrom does not write to the mapped memory space of the flash chip on SP5100. flashrom uses only the SPI host controller registers to read/write. If any other software accesses (read or write) the mapped memory space of the flash chip or the SPI host controller registers at the same time, the SPI host controller registers will change in unexpected ways. This can lead to corruption of the flash chip or corruption of reads.
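To make the failure mode described above concrete, here is a minimal sketch in C that uses a plain struct in place of the real memory-mapped controller. The struct layout, the `run_transaction` helper and the `other_program` intruder are hypothetical and only illustrate the idea; the real driver also adds an off-by-one correction to the read count, which is left out here.

```c
#include <stdint.h>
#include <stdio.h>

/* Simulated register file standing in for the memory-mapped SPI controller. */
struct fake_spi_regs {
	uint8_t opcode;	/* command byte */
	uint8_t count;	/* read count in the high nibble, write count in the low nibble */
};

/* Hook that plays the role of "other software" touching the controller. */
typedef void (*intruder_fn)(struct fake_spi_regs *regs);

/* Pack the read/write byte counts into one register value. */
static uint8_t pack_counts(unsigned int readcnt, unsigned int writecnt)
{
	return (uint8_t)(((readcnt & 0x0f) << 4) | (writecnt & 0x0f));
}

/*
 * Program one transaction and report whether the count register still holds
 * our value afterwards. Returns 0 if undisturbed, -1 if it changed.
 */
static int run_transaction(struct fake_spi_regs *regs, uint8_t opcode,
			   unsigned int readcnt, unsigned int writecnt,
			   intruder_fn intruder)
{
	uint8_t expected = pack_counts(readcnt, writecnt);

	regs->count = expected;
	regs->opcode = opcode;

	if (intruder)
		intruder(regs);	/* concurrent access in the middle of our transaction */

	if (regs->count != expected) {
		fprintf(stderr, "count register changed: wrote 0x%02x, read 0x%02x\n",
			expected, regs->count);
		return -1;
	}
	return 0;
}

/* Hypothetical intruder: sets up its own one-byte read. */
static void other_program(struct fake_spi_regs *regs)
{
	regs->count = pack_counts(1, 0);
	regs->opcode = 0x05;
}
```

Calling `run_transaction(&regs, 0x9f, 3, 0, NULL)` returns 0, while passing `other_program` as the last argument returns -1 because the count register no longer matches what was written. A check like this can only notice interference that leaves a visible trace in the register, so it will miss some cases.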
Q3: To prevent other processes from interfering with /dev/mem, the mapped virtual memory space or the physical memory space that flashrom uses, do you have any ideas or code snippets for locking them during flashrom operation?
Locking under most operating systems is not mandatory, and this means even if flashrom asks for a lock, all other applications can ignore the lock. I don't know any application which does locking on /dev/mem because that might interfere with X.org graphics and other software.
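The point about non-mandatory locking can be demonstrated with a small C sketch. It uses `flock()` on a temporary file rather than on /dev/mem (an assumption made so the example is harmless to run), and the function name `write_despite_lock` is hypothetical. A second opener that asks for the lock is refused, but a second opener that just writes is not stopped.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Take an exclusive advisory lock on a temporary file, then act as a second
 * program on a separate descriptor.
 * Returns 1 if the second descriptor could write while the lock was held,
 * 0 if the write failed, and -1 on setup errors.
 * *cooperative_refused is set to 1 if a polite lock request was refused.
 */
static int write_despite_lock(int *cooperative_refused)
{
	char path[] = "/tmp/advisory_lock_demo_XXXXXX";
	int holder, other, result;

	holder = mkstemp(path);
	if (holder < 0)
		return -1;

	if (flock(holder, LOCK_EX) != 0) {
		close(holder);
		unlink(path);
		return -1;
	}

	/* A separate open() yields an independent open file description. */
	other = open(path, O_WRONLY);
	if (other < 0) {
		close(holder);
		unlink(path);
		return -1;
	}

	/* A cooperating program asks for the lock and is turned away ... */
	*cooperative_refused =
		(flock(other, LOCK_EX | LOCK_NB) == -1 && errno == EWOULDBLOCK);

	/* ... but a program that ignores the lock can still write. */
	result = (write(other, "x", 1) == 1) ? 1 : 0;

	close(other);
	close(holder);
	unlink(path);
	return result;
}
```

The same reasoning applies to any advisory scheme: it only protects against programs that choose to check the lock.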
Q4: Do you have any idea how to trace the root cause of the interference with flashrom?
I wrote a patch which should be able to detect interference from other applications. I will send it later once it is tested.
Verbose message: Programming flash done. COMPLETE. Verifying flash... VERIFY FAILED at 0x00017ca4! Expected=0xff, Read=0x25, failed byte count from 0x00000000-0x003fffff: 0x3a
Syntax: ./flashrom -w 24a.rom
Hardware configuration: MB: AMD platform solution South Bridge: AMD SP5100 Flash part: ST M25P32
NOS: Red Hat 5 x86_64 SLES 10 x86_64 SLES 11 x86_64
Could you please send the output of "flashrom -V" so I can check the contents of some SP5100 registers? Thanks.
Regards, Carl-Daniel
Hi Daniel,
Thanks for your quick response. Please see my answers inline below; I also want to share the results of today's experiments. I used an AMD HDT hardware ICE to halt the CPU and examine the data at physical addresses 0xFEC10000 ~ 0xFEC100F, which are used by the SPI host controller registers. They change while flashrom is running, and I can also capture SPI opcodes at address 0xFEC10000. However, they stay fixed while only the third-party application is running, which means the registers are not overwritten by this application. I have also added code to flashrom to set its process priority to -20, but the failure still occurs.
Best regards, Hony
________________________________
From: Carl-Daniel Hailfinger [mailto:c-d.hailfinger.devel.2006@gmx.net] Sent: 2010/8/11 [Wednesday] 09:05 PM To: hony.chiang (江昆仲 - MIC) Cc: flashrom@flashrom.org Subject: Re: [flashrom] Urgent: Flash part would be destroied by flashrom 0.9.2 once manager application is running.
Can you tell us the name of the manager application you are using?
[Hony] UpdateXpress System Pack Installer, used through its graphical user interface (GUI). http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp?topic=/uxsp...
I wrote a patch which should be able to detect interference from other applications. I will send it later once it is tested.
[Hony] Thank you. I will try it and give you more input.
Could you please send the output of "flashrom -V" so I can check the contents of some SP5100 registers? Thanks.
[Hony] I will capture the output with the -V parameter in my lab tomorrow and send it to you.
Regards, Carl-Daniel
Hi Daniel,
Here is verbose log about SPI registers for reference.
Best Regards,
Hony
Hi Hony,
my suspicion is that either the third party application may access flash regions between 0xFFC00000 and 0xFFFFFFFF or that the builtin management engine in the southbridge may be asked by the third party application to access those flash regions or run those commands.
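The address range above follows from the chip size: the verify output reports offsets 0x00000000-0x003fffff, i.e. a 4 MiB part, and the range quoted ends at the top of the 32-bit address space. A short C sketch of that arithmetic, with hypothetical helper names, also shows where the failing offset from the log would sit in that window and lets you test whether some other mapping overlaps it.

```c
#include <stdint.h>

#define TOP_OF_4G 0x100000000ULL

/* Physical start address of a flash chip mapped so that it ends at 4 GiB. */
static uint32_t flash_window_base(uint32_t chip_size)
{
	return (uint32_t)(TOP_OF_4G - chip_size);
}

/* Translate an offset inside the chip to its physical address in the window. */
static uint32_t flash_offset_to_phys(uint32_t chip_size, uint32_t offset)
{
	return flash_window_base(chip_size) + offset;
}

/* Return 1 if the range [start, start + len) overlaps the flash window. */
static int range_touches_flash(uint32_t chip_size, uint64_t start, uint64_t len)
{
	uint64_t base = flash_window_base(chip_size);

	return len != 0 && start < TOP_OF_4G && start + len > base;
}
```

For a 4 MiB chip, `flash_window_base()` gives 0xFFC00000 and the failing offset 0x00017ca4 maps to 0xFFC17CA4. The SPI controller registers near 0xFEC10000 lie outside this window, so a program can disturb one without ever touching the other.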
You wrote that the problem does not exist if the third party application (IBM UpdateXpress System Pack Installer) is not running. I see three ways to handle this:
- Always terminate UpdateXpress before running flashrom.
- Have flashrom detect a running UpdateXpress instance and freeze it as long as it accesses flash.
- Ask IBM to modify UpdateXpress in a way that does not access flash unless explicitly requested.
As an alternative, we could add paranoid checks to the SB600 SPI driver and hope that those checks will help detect the issue. Once we detect the issue, we can print a warning and tell the user to stop all other programs accessing the flash.
I will send a SB600 paranoid checks patch as reply to this mail.
Regards, Carl-Daniel
On 12.08.2010 09:19, Hony.Chiang@mic.com.tw wrote:
Hi Daniel,
Here is verbose log about SPI registers for reference.
Best Regards,
Hony
Hi Hony,
here is the patch I talked about. If your mailer corrupts the patch, you can find it at the top of http://patchwork.coreboot.org/project/flashrom/list/ (look for "SB600 SPI paranoid checks"), click on the link and then select "download patch".
On 13.08.2010 03:25, Carl-Daniel Hailfinger wrote:
my suspicion is that either the third party application may access flash regions between 0xFFC00000 and 0xFFFFFFFF or that the builtin management engine in the southbridge may be asked by the third party application to access those flash regions or run those commands.
You wrote that the problem does not exist if the third party application (IBM UpdateXpress System Pack Installer) is not running. I see three ways to handle this:
- Always terminate UpdateXpress before running flashrom.
- Have flashrom detect a running UpdateXpress instance and freeze it as
long as it accesses flash.
- Ask IBM to modify UpdateXpress in a way that does not access flash
unless explicitly requested.
As an alternative, we could add paranoid checks to the SB600 SPI driver and hope that those checks will help detect the issue. Once we detect the issue, we can print a warning and tell the user to stop all other programs accessing the flash.
I will send a SB600 paranoid checks patch as reply to this mail.
Add paranoid checks for the essential readcnt/writecnt registers in the SB600/SB700/... SPI driver. This will detect some concurrent access, but not all.
If you want to check which application is accessing /dev/mem, please run (as root)
fuser -v /dev/mem
You will see Xorg (which is OK), but you probably will see another application as well. Can you please send the output of the above command?
It would be a good idea to strace that application from startup, and check the strace log for open(/dev/mem) and subsequent mmap calls on the fd for /dev/mem. Then we know exactly in which region the access is happening. A good way is to run it as
strace -ff -F -o strace.log /bin/thirdparty_application
and then strace.log.* will contain some useful info. grep for "open" and "mmap". This should help.
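As a rough aid for reading such strace logs, the following C sketch picks out the two kinds of lines that matter. It assumes the usual strace text format with the syscall name at the start of the line (as in per-process files written with -ff -o) and that the file offset is the last mmap argument; both helper names are hypothetical, and the sample lines in the usage note are made up for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if an strace output line records an open of /dev/mem. */
static int line_opens_dev_mem(const char *line)
{
	int is_open = strncmp(line, "open(", 5) == 0 ||
		      strncmp(line, "openat(", 7) == 0;

	return is_open && strstr(line, "\"/dev/mem\"") != NULL;
}

/*
 * Extract the file offset (last argument) from an strace mmap line.
 * Returns 0 and stores the offset in *offset on success, -1 otherwise.
 */
static int mmap_line_offset(const char *line, uint64_t *offset)
{
	const char *rparen, *p;
	char *end;
	unsigned long long value;

	if (strncmp(line, "mmap(", 5) != 0)
		return -1;

	rparen = strchr(line, ')');
	if (rparen == NULL)
		return -1;

	/* Walk back from the closing parenthesis to the last comma. */
	p = rparen;
	while (p > line && *p != ',')
		p--;
	if (*p != ',')
		return -1;

	/* Base 0 accepts both decimal and 0x-prefixed hexadecimal. */
	value = strtoull(p + 1, &end, 0);
	if (end == p + 1)
		return -1;

	*offset = value;
	return 0;
}
```

For a line such as `mmap(NULL, 4194304, PROT_READ, MAP_SHARED, 3, 0xffc00000) = 0x2b5e4c000000`, `mmap_line_offset()` stores 0xffc00000. When the descriptor in that call is the one returned by the /dev/mem open, that offset is the physical address being mapped.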
Please run flashrom with this patch while UpdateXpress is not running (should still work and print no warnings). Then please run flashrom a few times with this patch while UpdateXpress is running (should still have the same problems, but it should print warnings).
I would love to see the logs, and it would also be great if you could send a few logs and tell us about the typical addresses where writing fails or where the verify fails. Maybe there is a pattern.
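When looking for a pattern in the failing addresses, it can help to compare a read-back dump against the golden image outside of flashrom. The C sketch below is one simple way to do that under the assumption that both images are already loaded into memory; `compare_images` and `struct verify_result` are hypothetical names and not part of flashrom.

```c
#include <stddef.h>
#include <stdint.h>

struct verify_result {
	size_t failed_bytes;	/* number of differing bytes */
	size_t first_failure;	/* offset of the first difference, valid if failed_bytes > 0 */
};

/* Compare two buffers of equal length and summarise where they differ. */
static struct verify_result compare_images(const uint8_t *expected,
					   const uint8_t *readback, size_t len)
{
	struct verify_result r = { 0, 0 };
	size_t i;

	for (i = 0; i < len; i++) {
		if (expected[i] != readback[i]) {
			if (r.failed_bytes == 0)
				r.first_failure = i;
			r.failed_bytes++;
		}
	}
	return r;
}
```

Running this over several failed attempts and noting `first_failure` and `failed_bytes` for each gives the kind of data that can reveal whether the corruption clusters at particular offsets.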
Signed-off-by: Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net>

Index: flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c
===================================================================
--- flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c	(Revision 1137)
+++ flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c	(Arbeitskopie)
@@ -77,6 +77,7 @@
 	/* First byte is cmd which can not being sent through FIFO. */
 	unsigned char cmd = *writearr++;
 	unsigned int readoffby1;
+	unsigned char readwrite;
 
 	writecnt--;
 
@@ -102,7 +103,8 @@
 	 * It is unclear if the CS# line is set high too early as well.
 	 */
 	readoffby1 = (writecnt) ? 0 : 1;
-	mmio_writeb((readcnt + readoffby1) << 4 | (writecnt), sb600_spibar + 1);
+	readwrite = (readcnt + readoffby1) << 4 | (writecnt);
+	mmio_writeb(readwrite, sb600_spibar + 1);
 	mmio_writeb(cmd, sb600_spibar + 0);
 
 	/* Before we use the FIFO, reset it first. */
@@ -149,6 +151,14 @@
 		msg_pspew("[%02x]", *readarr);
 	}
 	msg_pspew("\n");
+	if (mmio_readb(sb600_spibar + 1) != readwrite) {
+		msg_perr("Unexpected change in SB600 read/write count! "
+			 "Something else is accessing the flash chip and "
+			 "causes random corruption. Please stop all "
+			 "applications and drivers which access the flash "
+			 "chip.\n");
+		/* FIXME: Abort here? */
+	}
 
 	return 0;
 }
@@ -158,6 +168,10 @@
 	struct pci_dev *smbus_dev;
 	uint32_t tmp;
 	uint8_t reg;
+	const char *speed_names[4] = {
+		"Reserved", "33", "22", "16.5"
+	};
+
 	/* Read SPI_BaseAddr */
 	tmp = pci_read_long(dev, 0xa0);
 	tmp &= 0xffffffe0;	/* remove bits 4-0 (reserved) */
@@ -183,15 +197,25 @@
 	msg_pdbg("PrefetchEnSPIFromIMC=%i, ", tmp);
 
 	tmp = pci_read_byte(dev, 0xbb);
+	/* FIXME: Set bit 3,6,7 if not already set.
+	 * Set bit 5, otherwise SPI accesses are pointless in LPC mode.
+	 * See doc 42413 AMD SB700/710/750 RPR.
+	 */
 	msg_pdbg("PrefetchEnSPIFromHost=%i, SpiOpEnInLpcMode=%i\n",
 		 tmp & 0x1, (tmp & 0x20) >> 5);
 	tmp = mmio_readl(sb600_spibar);
+	/* FIXME: If SpiAccessMacRomEn or SpiHostAccessRomEn are zero on
+	 * SB700 or later, reads and writes will be corrupted. Abort in this
+	 * case. Make sure to avoid this check on SB600.
+	 */
 	msg_pdbg("SpiArbEnable=%i, SpiAccessMacRomEn=%i, "
 		 "SpiHostAccessRomEn=%i, ArbWaitCount=%i, "
 		 "SpiBridgeDisable=%i, DropOneClkOnRd=%i\n",
 		 (tmp >> 19) & 0x1, (tmp >> 22) & 0x1,
 		 (tmp >> 23) & 0x1, (tmp >> 24) & 0x7,
 		 (tmp >> 27) & 0x1, (tmp >> 28) & 0x1);
+	tmp = (mmio_readb(sb600_spibar + 0xd) >> 4) & 0x3;
+	msg_pdbg("NormSpeed is %s MHz\n", speed_names[tmp]);
 
 	/* Look for the SMBus device. */
 	smbus_dev = pci_dev_find(0x1002, 0x4385);
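The NormSpeed debug line added by the patch decodes a two-bit field from the byte at offset 0xd. As a standalone illustration, the C sketch below performs the same decoding on a plain byte value instead of a register read; the function name `norm_speed_mhz` is hypothetical, while the table contents and the bit positions are taken from the patch.

```c
#include <stdint.h>

/* Decode bits 5:4 of the byte at controller offset 0xd into a speed string. */
static const char *norm_speed_mhz(uint8_t reg_0xd)
{
	static const char *const speed_names[4] = {
		"Reserved", "33", "22", "16.5"
	};

	return speed_names[(reg_0xd >> 4) & 0x3];
}
```

For example, a register byte of 0x20 selects index 2 and therefore "22". Masking with 0x3 keeps the index inside the four-entry table regardless of the other bits in the byte.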
Hi Matthias, hi Hony,
you are both seeing a very similar issue. This mail contains a patch which should add the debugging we need.
New patch, with additional error checks.
Add paranoid checks for the essential readcnt/writecnt registers in the SB600/SB700/... SPI driver. This will detect most concurrent access, but not all.
I would love to see the logs in double verbose mode for a read:
flashrom -VV -r broken_fifo1.bin >broken_fifo1.txt 2>&1
Signed-off-by: Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net>

Index: flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c
===================================================================
--- flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c	(Revision 1144)
+++ flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c	(Arbeitskopie)
@@ -77,6 +77,7 @@
 	/* First byte is cmd which can not being sent through FIFO. */
 	unsigned char cmd = *writearr++;
 	unsigned int readoffby1;
+	unsigned char readwrite;
 
 	writecnt--;
 
@@ -102,18 +103,22 @@
 	 * It is unclear if the CS# line is set high too early as well.
 	 */
 	readoffby1 = (writecnt) ? 0 : 1;
-	mmio_writeb((readcnt + readoffby1) << 4 | (writecnt), sb600_spibar + 1);
+	readwrite = (readcnt + readoffby1) << 4 | (writecnt);
+	mmio_writeb(readwrite, sb600_spibar + 1);
 	mmio_writeb(cmd, sb600_spibar + 0);
 
 	/* Before we use the FIFO, reset it first. */
 	reset_internal_fifo_pointer();
 
 	/* Send the write byte to FIFO. */
+	msg_pspew("Writing: ");
 	for (count = 0; count < writecnt; count++, writearr++) {
-		msg_pspew(" [%x]", *writearr);
+		msg_pspew("[%02x]", *writearr);
 		mmio_writeb(*writearr, sb600_spibar + 0xC);
 	}
 	msg_pspew("\n");
+	msg_pspew("The FIFO pointer after writing is %d, wanted %d\n",
+		  mmio_readb(sb600_spibar + 0xd) & 0x07, writecnt);
 
 	/*
 	 * We should send the data by sequence, which means we need to reset
@@ -137,18 +142,31 @@
 	reset_internal_fifo_pointer();
 
 	/* Skip the bytes we sent. */
+	msg_pspew("Skipping: ");
 	for (count = 0; count < writecnt; count++) {
 		cmd = mmio_readb(sb600_spibar + 0xC);
-		msg_pspew("[ %2x]", cmd);
+		msg_pspew("[%02x]", cmd);
 	}
+	msg_pspew("\n");
+	msg_pspew("The FIFO pointer after skipping is %d, wanted %d\n",
+		  mmio_readb(sb600_spibar + 0xd) & 0x07, writecnt);
 
-	msg_pspew("The FIFO pointer after skipping is %d.\n",
-		  mmio_readb(sb600_spibar + 0xd) & 0x07);
+	msg_pspew("Reading: ");
 	for (count = 0; count < readcnt; count++, readarr++) {
 		*readarr = mmio_readb(sb600_spibar + 0xC);
 		msg_pspew("[%02x]", *readarr);
 	}
 	msg_pspew("\n");
+	msg_pspew("The FIFO pointer after reading is %d, wanted %d\n",
+		  mmio_readb(sb600_spibar + 0xd) & 0x07, readcnt + writecnt);
+	if (mmio_readb(sb600_spibar + 1) != readwrite) {
+		msg_perr("Unexpected change in SB600 read/write count! "
+			 "Something else is accessing the flash chip and "
+			 "causes random corruption. Please stop all "
+			 "applications and drivers which access the flash "
+			 "chip.\n");
+		/* FIXME: Abort here? */
+	}
 
 	return 0;
 }
@@ -158,6 +176,10 @@
 	struct pci_dev *smbus_dev;
 	uint32_t tmp;
 	uint8_t reg;
+	const char *speed_names[4] = {
+		"Reserved", "33", "22", "16.5"
+	};
+
 	/* Read SPI_BaseAddr */
 	tmp = pci_read_long(dev, 0xa0);
 	tmp &= 0xffffffe0;	/* remove bits 4-0 (reserved) */
@@ -183,15 +205,25 @@
 	msg_pdbg("PrefetchEnSPIFromIMC=%i, ", tmp);
 
 	tmp = pci_read_byte(dev, 0xbb);
+	/* FIXME: Set bit 3,6,7 if not already set.
+	 * Set bit 5, otherwise SPI accesses are pointless in LPC mode.
+	 * See doc 42413 AMD SB700/710/750 RPR.
+	 */
 	msg_pdbg("PrefetchEnSPIFromHost=%i, SpiOpEnInLpcMode=%i\n",
 		 tmp & 0x1, (tmp & 0x20) >> 5);
 	tmp = mmio_readl(sb600_spibar);
+	/* FIXME: If SpiAccessMacRomEn or SpiHostAccessRomEn are zero on
+	 * SB700 or later, reads and writes will be corrupted. Abort in this
+	 * case. Make sure to avoid this check on SB600.
+	 */
 	msg_pdbg("SpiArbEnable=%i, SpiAccessMacRomEn=%i, "
 		 "SpiHostAccessRomEn=%i, ArbWaitCount=%i, "
 		 "SpiBridgeDisable=%i, DropOneClkOnRd=%i\n",
 		 (tmp >> 19) & 0x1, (tmp >> 22) & 0x1,
 		 (tmp >> 23) & 0x1, (tmp >> 24) & 0x7,
 		 (tmp >> 27) & 0x1, (tmp >> 28) & 0x1);
+	tmp = (mmio_readb(sb600_spibar + 0xd) >> 4) & 0x3;
+	msg_pdbg("NormSpeed is %s MHz\n", speed_names[tmp]);
 
 	/* Look for the SMBus device. */
 	smbus_dev = pci_dev_find(0x1002, 0x4385);
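For anyone reading the "flashrom -V" output by hand, the bit positions printed by the debug line in the patch can be decoded from a raw 32-bit value with a small C helper. The struct and function names below are hypothetical; the field names and shifts are copied from the msg_pdbg call in the patch.

```c
#include <stdint.h>

/* Fields of the 32-bit register at controller offset 0, as printed by the driver. */
struct spi_ctrl_fields {
	unsigned int spi_arb_enable;		/* bit 19 */
	unsigned int spi_access_mac_rom_en;	/* bit 22 */
	unsigned int spi_host_access_rom_en;	/* bit 23 */
	unsigned int arb_wait_count;		/* bits 26:24 */
	unsigned int spi_bridge_disable;	/* bit 27 */
	unsigned int drop_one_clk_on_rd;	/* bit 28 */
};

/* Split a raw register value into the individual fields. */
static struct spi_ctrl_fields decode_spi_ctrl(uint32_t reg)
{
	struct spi_ctrl_fields f;

	f.spi_arb_enable = (reg >> 19) & 0x1;
	f.spi_access_mac_rom_en = (reg >> 22) & 0x1;
	f.spi_host_access_rom_en = (reg >> 23) & 0x1;
	f.arb_wait_count = (reg >> 24) & 0x7;
	f.spi_bridge_disable = (reg >> 27) & 0x1;
	f.drop_one_clk_on_rd = (reg >> 28) & 0x1;
	return f;
}
```

According to the FIXME comment in the patch, a zero in SpiAccessMacRomEn or SpiHostAccessRomEn on SB700 or later would by itself explain corrupted reads and writes, so these two fields are worth checking first.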
Hi Matthias,
thanks for the test. Can you repeat the test with the following patch?
On 18.08.2010 13:04, Carl-Daniel Hailfinger wrote:
Hi Matthias, hi Hony,
you are both seeing a very similar issue. This mail contains a patch which should add the debugging we need.
New patch, with additional error checks.
Add paranoid checks for the essential readcnt/writecnt registers in the SB600/SB700/... SPI driver. This will detect most concurrent access, but not all.
I would love to see the logs in double verbose mode for a read:
flashrom -VV -r broken_fifo1.bin >broken_fifo1.txt 2>&1
Signed-off-by: Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net>

Index: flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c
===================================================================
--- flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c	(Revision 1144)
+++ flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c	(Arbeitskopie)
@@ -58,10 +58,22 @@
 {
 	mmio_writeb(mmio_readb(sb600_spibar + 2) | 0x10, sb600_spibar + 2);
 
+	/* FIXME: This loop makes no sense at all. */
 	while (mmio_readb(sb600_spibar + 0xD) & 0x7)
 		msg_pspew("reset\n");
 }
 
+static void reset_compare_internal_fifo_pointer(uint8_t want)
+{
+	uint8_t tmp;
+
+	tmp = mmio_readb(sb600_spibar + 0xd) & 0x07;
+	if ((want & 0x7) != tmp) {
+		msg_pdbg("The FIFO pointer is %d, wanted %d\n", tmp, want);
+	}
+	reset_internal_fifo_pointer();
+}
+
 static void execute_command(void)
 {
 	mmio_writeb(mmio_readb(sb600_spibar + 2) | 1, sb600_spibar + 2);
@@ -77,6 +89,7 @@
 	/* First byte is cmd which can not being sent through FIFO. */
 	unsigned char cmd = *writearr++;
 	unsigned int readoffby1;
+	unsigned char readwrite;
 
 	writecnt--;
 
@@ -102,15 +115,17 @@
 	 * It is unclear if the CS# line is set high too early as well.
 	 */
 	readoffby1 = (writecnt) ? 0 : 1;
-	mmio_writeb((readcnt + readoffby1) << 4 | (writecnt), sb600_spibar + 1);
+	readwrite = (readcnt + readoffby1) << 4 | (writecnt);
+	mmio_writeb(readwrite, sb600_spibar + 1);
 	mmio_writeb(cmd, sb600_spibar + 0);
 
 	/* Before we use the FIFO, reset it first. */
 	reset_internal_fifo_pointer();
 
 	/* Send the write byte to FIFO. */
+	msg_pspew("Writing: ");
 	for (count = 0; count < writecnt; count++, writearr++) {
-		msg_pspew(" [%x]", *writearr);
+		msg_pspew("[%02x]", *writearr);
 		mmio_writeb(*writearr, sb600_spibar + 0xC);
 	}
 	msg_pspew("\n");
@@ -119,7 +134,7 @@
 	 * We should send the data by sequence, which means we need to reset
 	 * the FIFO pointer to the first byte we want to send.
 	 */
-	reset_internal_fifo_pointer();
+	reset_compare_internal_fifo_pointer(writecnt);
 
 	execute_command();
 
@@ -134,22 +149,34 @@
 	 * the opcode, the FIFO already stores the response from the chip.
 	 * Usually, the chip will respond with 0x00 or 0xff.
 	 */
-	reset_internal_fifo_pointer();
+	reset_compare_internal_fifo_pointer(writecnt);
 
 	/* Skip the bytes we sent. */
+	msg_pspew("Skipping: ");
 	for (count = 0; count < writecnt; count++) {
 		cmd = mmio_readb(sb600_spibar + 0xC);
-		msg_pspew("[ %2x]", cmd);
+		msg_pspew("[%02x]", cmd);
 	}
+	msg_pspew("\n");
+	reset_compare_internal_fifo_pointer(writecnt);
 
-	msg_pspew("The FIFO pointer after skipping is %d.\n",
-		  mmio_readb(sb600_spibar + 0xd) & 0x07);
+	msg_pspew("Reading: ");
 	for (count = 0; count < readcnt; count++, readarr++) {
 		*readarr = mmio_readb(sb600_spibar + 0xC);
 		msg_pspew("[%02x]", *readarr);
 	}
 	msg_pspew("\n");
+	reset_compare_internal_fifo_pointer(readcnt + writecnt);
 
+	if (mmio_readb(sb600_spibar + 1) != readwrite) {
+		msg_perr("Unexpected change in SB600 read/write count! "
+			 "Something else is accessing the flash chip and "
+			 "causes random corruption. Please stop all "
+			 "applications and drivers which access the flash "
+			 "chip.\n");
+		/* FIXME: Abort here? */
+	}
+
 	return 0;
 }
 
@@ -158,6 +185,10 @@
 	struct pci_dev *smbus_dev;
 	uint32_t tmp;
 	uint8_t reg;
+	const char *speed_names[4] = {
+		"Reserved", "33", "22", "16.5"
+	};
+
 	/* Read SPI_BaseAddr */
 	tmp = pci_read_long(dev, 0xa0);
 	tmp &= 0xffffffe0;	/* remove bits 4-0 (reserved) */
@@ -183,15 +214,25 @@
 	msg_pdbg("PrefetchEnSPIFromIMC=%i, ", tmp);
 
 	tmp = pci_read_byte(dev, 0xbb);
+	/* FIXME: Set bit 3,6,7 if not already set.
+	 * Set bit 5, otherwise SPI accesses are pointless in LPC mode.
+	 * See doc 42413 AMD SB700/710/750 RPR.
+	 */
 	msg_pdbg("PrefetchEnSPIFromHost=%i, SpiOpEnInLpcMode=%i\n",
 		 tmp & 0x1, (tmp & 0x20) >> 5);
 	tmp = mmio_readl(sb600_spibar);
+	/* FIXME: If SpiAccessMacRomEn or SpiHostAccessRomEn are zero on
+	 * SB700 or later, reads and writes will be corrupted. Abort in this
+	 * case. Make sure to avoid this check on SB600.
+	 */
 	msg_pdbg("SpiArbEnable=%i, SpiAccessMacRomEn=%i, "
 		 "SpiHostAccessRomEn=%i, ArbWaitCount=%i, "
 		 "SpiBridgeDisable=%i, DropOneClkOnRd=%i\n",
 		 (tmp >> 19) & 0x1, (tmp >> 22) & 0x1,
 		 (tmp >> 23) & 0x1, (tmp >> 24) & 0x7,
 		 (tmp >> 27) & 0x1, (tmp >> 28) & 0x1);
+	tmp = (mmio_readb(sb600_spibar + 0xd) >> 4) & 0x3;
+	msg_pdbg("NormSpeed is %s MHz\n", speed_names[tmp]);
 
 	/* Look for the SMBus device. */
 	smbus_dev = pci_dev_find(0x1002, 0x4385);
Hi Matthias,
thanks, we're making a lot of progress.
On 18.08.2010 13:31, Carl-Daniel Hailfinger wrote:
Hi Matthias,
thanks for the test. Can you repeat the test with the following patch?
On 18.08.2010 13:04, Carl-Daniel Hailfinger wrote:
Hi Matthias, hi Hony,
you are both seeing a very similar issue. This mail contains a patch which should add the debugging we need.
New patch, with additional error checks.
Add paranoid checks for the essential readcnt/writecnt registers in the SB600/SB700/... SPI driver. This will detect most concurrent access, but not all.
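As a rough illustration of the check described above (not the patch itself): the driver packs the read and write counts into one byte, writes it to offset 1 of the SPI BAR, and reads it back after the transfer. The sketch below simulates the register window with a plain array. The names `fake_spibar`, `reg_readb`, `reg_writeb`, `pack_counts` and `counts_changed` are made up for this example; real code would go through flashrom's `mmio_readb()`/`mmio_writeb()` and would also add the `readoffby1` correction to the read count.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated register window standing in for sb600_spibar (assumption:
 * the real driver accesses memory-mapped hardware registers instead). */
static uint8_t fake_spibar[16];

static uint8_t reg_readb(unsigned int off)
{
	return fake_spibar[off];
}

static void reg_writeb(uint8_t val, unsigned int off)
{
	fake_spibar[off] = val;
}

/* Pack the counts the way the patch does: read count in the high nibble,
 * write count in the low nibble. */
static uint8_t pack_counts(unsigned int readcnt, unsigned int writecnt)
{
	assert(readcnt <= 0xF && writecnt <= 0xF);
	return (uint8_t)((readcnt << 4) | writecnt);
}

/* Return 1 if the count register no longer holds the value we wrote,
 * 0 if it is unchanged. */
static int counts_changed(uint8_t expected)
{
	return reg_readb(1) != expected;
}
```

For example, `pack_counts(3, 4)` yields `0x34`. Writing that with `reg_writeb()` and calling `counts_changed(0x34)` returns 0 until something else overwrites offset 1. This matches the "most, but not all" caveat: another accessor that writes the same value, or restores it before the readback, goes unnoticed.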
I would love to see the logs in double verbose mode for a read:
flashrom -VV -r broken_fifo1.bin >broken_fifo1.txt 2>&1
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
Index: flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c
===================================================================
--- flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c	(Revision 1144)
+++ flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c	(Arbeitskopie)
@@ -58,10 +58,36 @@
 {
 	mmio_writeb(mmio_readb(sb600_spibar + 2) | 0x10, sb600_spibar + 2);
 
+	/* FIXME: This loop makes no sense at all. */
 	while (mmio_readb(sb600_spibar + 0xD) & 0x7)
 		msg_pspew("reset\n");
 }
 
+static int compare_internal_fifo_pointer(uint8_t want)
+{
+	uint8_t tmp;
+
+	tmp = mmio_readb(sb600_spibar + 0xd) & 0x07;
+	if ((want & 0x7) != tmp) {
+		msg_perr("SB600 FIFO pointer corruption! Pointer is %d, wanted "
+			 "%d\nPlease make sure that nothing else accesses the "
+			 "flash chip at the same time.\n", tmp, want);
+		return 1;
+	} else {
+		msg_pspew("SB600 FIFO pointer is %d, wanted %d\n", tmp, want);
+		return 0;
+	}
+}
+
+static int reset_compare_internal_fifo_pointer(uint8_t want)
+{
+	int ret;
+
+	ret = compare_internal_fifo_pointer(want);
+	reset_internal_fifo_pointer();
+	return ret;
+}
+
 static void execute_command(void)
 {
 	mmio_writeb(mmio_readb(sb600_spibar + 2) | 1, sb600_spibar + 2);
@@ -77,6 +103,7 @@
 	/* First byte is cmd which can not being sent through FIFO. */
 	unsigned char cmd = *writearr++;
 	unsigned int readoffby1;
+	unsigned char readwrite;
 
 	writecnt--;
 
@@ -102,15 +129,17 @@
 	 * It is unclear if the CS# line is set high too early as well.
 	 */
 	readoffby1 = (writecnt) ? 0 : 1;
-	mmio_writeb((readcnt + readoffby1) << 4 | (writecnt), sb600_spibar + 1);
+	readwrite = (readcnt + readoffby1) << 4 | (writecnt);
+	mmio_writeb(readwrite, sb600_spibar + 1);
 	mmio_writeb(cmd, sb600_spibar + 0);
 
 	/* Before we use the FIFO, reset it first. */
 	reset_internal_fifo_pointer();
 
 	/* Send the write byte to FIFO. */
+	msg_pspew("Writing: ");
 	for (count = 0; count < writecnt; count++, writearr++) {
-		msg_pspew(" [%x]", *writearr);
+		msg_pspew("[%02x]", *writearr);
 		mmio_writeb(*writearr, sb600_spibar + 0xC);
 	}
 	msg_pspew("\n");
@@ -119,8 +148,9 @@
 	 * We should send the data by sequence, which means we need to reset
 	 * the FIFO pointer to the first byte we want to send.
 	 */
-	reset_internal_fifo_pointer();
+	reset_compare_internal_fifo_pointer(writecnt);
 
+	msg_pspew("Executing: \n");
 	execute_command();
 
 	/*
@@ -134,22 +164,34 @@
 	 * the opcode, the FIFO already stores the response from the chip.
 	 * Usually, the chip will respond with 0x00 or 0xff.
 	 */
-	reset_internal_fifo_pointer();
+	reset_compare_internal_fifo_pointer(writecnt + readcnt);
 
 	/* Skip the bytes we sent. */
+	msg_pspew("Skipping: ");
 	for (count = 0; count < writecnt; count++) {
 		cmd = mmio_readb(sb600_spibar + 0xC);
-		msg_pspew("[ %2x]", cmd);
+		msg_pspew("[%02x]", cmd);
 	}
+	msg_pspew("\n");
+	compare_internal_fifo_pointer(writecnt);
 
-	msg_pspew("The FIFO pointer after skipping is %d.\n",
-		  mmio_readb(sb600_spibar + 0xd) & 0x07);
+	msg_pspew("Reading: ");
 	for (count = 0; count < readcnt; count++, readarr++) {
 		*readarr = mmio_readb(sb600_spibar + 0xC);
 		msg_pspew("[%02x]", *readarr);
 	}
 	msg_pspew("\n");
+	reset_compare_internal_fifo_pointer(readcnt + writecnt);
 
+	if (mmio_readb(sb600_spibar + 1) != readwrite) {
+		msg_perr("Unexpected change in SB600 read/write count! "
+			 "Something else is accessing the flash chip and "
+			 "causes random corruption. Please stop all "
+			 "applications and drivers which access the flash "
+			 "chip.\n");
+		/* FIXME: Abort here? */
+	}
+
 	return 0;
 }
 
@@ -158,6 +200,10 @@
 	struct pci_dev *smbus_dev;
 	uint32_t tmp;
 	uint8_t reg;
+	const char *speed_names[4] = {
+		"Reserved", "33", "22", "16.5"
+	};
+
 	/* Read SPI_BaseAddr */
 	tmp = pci_read_long(dev, 0xa0);
 	tmp &= 0xffffffe0;	/* remove bits 4-0 (reserved) */
@@ -183,15 +229,25 @@
 	msg_pdbg("PrefetchEnSPIFromIMC=%i, ", tmp);
 
 	tmp = pci_read_byte(dev, 0xbb);
+	/* FIXME: Set bit 3,6,7 if not already set.
+	 * Set bit 5, otherwise SPI accesses are pointless in LPC mode.
+	 * See doc 42413 AMD SB700/710/750 RPR.
+	 */
 	msg_pdbg("PrefetchEnSPIFromHost=%i, SpiOpEnInLpcMode=%i\n",
 		 tmp & 0x1, (tmp & 0x20) >> 5);
 	tmp = mmio_readl(sb600_spibar);
+	/* FIXME: If SpiAccessMacRomEn or SpiHostAccessRomEn are zero on
+	 * SB700 or later, reads and writes will be corrupted. Abort in this
+	 * case. Make sure to avoid this check on SB600.
+	 */
 	msg_pdbg("SpiArbEnable=%i, SpiAccessMacRomEn=%i, "
 		 "SpiHostAccessRomEn=%i, ArbWaitCount=%i, "
 		 "SpiBridgeDisable=%i, DropOneClkOnRd=%i\n",
 		 (tmp >> 19) & 0x1, (tmp >> 22) & 0x1,
 		 (tmp >> 23) & 0x1, (tmp >> 24) & 0x7,
 		 (tmp >> 27) & 0x1, (tmp >> 28) & 0x1);
+	tmp = (mmio_readb(sb600_spibar + 0xd) >> 4) & 0x3;
+	msg_pdbg("NormSpeed is %s MHz\n", speed_names[tmp]);
 
 	/* Look for the SMBus device. */
 	smbus_dev = pci_dev_find(0x1002, 0x4385);
Hi Matthias,
this is the last version of the corruption detection. flashrom will now abort if it detects corruption instead of continuing.
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
If this aborts for you with an error message about corruption and if the read image is empty (or not created), the patch does what it should. In that case, please respond with
Acked-by: Your Name your@email
so I can check in the patch and work with you to have flashrom retry corrupted accesses.
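The retry idea mentioned above could look roughly like the following sketch. It is only an assumption about how such a wrapper might be shaped, not code from flashrom: `spi_attempt_fn`, `spi_retry` and `flaky_attempt` are hypothetical names, and only the value of `SPI_PROGRAMMER_ERROR` is taken from the patch below.

```c
#include <assert.h>
#include <stddef.h>

#define SPI_PROGRAMMER_ERROR (-6)	/* value from the spi.h hunk of the patch */

/* Hypothetical signature for one attempt at an SPI transaction. */
typedef int (*spi_attempt_fn)(void *ctx);

/* Repeat the attempt only while it reports detected corruption.
 * Any other result (success or a different error) is returned at once. */
static int spi_retry(spi_attempt_fn attempt, void *ctx, int max_tries)
{
	int ret = SPI_PROGRAMMER_ERROR;
	int i;

	assert(attempt != NULL && max_tries > 0);
	for (i = 0; i < max_tries; i++) {
		ret = attempt(ctx);
		if (ret != SPI_PROGRAMMER_ERROR)
			break;
	}
	return ret;
}

/* Stand-in for a transaction that is disturbed *ctx times and then
 * succeeds. It exists only to exercise spi_retry(). */
static int flaky_attempt(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return SPI_PROGRAMMER_ERROR;
	}
	return 0;
}
```

Retrying is most plausible for reads. Whether a write can be safely repeated after a corrupted access is a separate question that this sketch does not address.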
Index: flashrom-sb600_spi_paranoia_concurrent_access/spi.h
===================================================================
--- flashrom-sb600_spi_paranoia_concurrent_access/spi.h	(Revision 1144)
+++ flashrom-sb600_spi_paranoia_concurrent_access/spi.h	(Arbeitskopie)
@@ -124,5 +124,6 @@
 #define SPI_INVALID_ADDRESS	-3
 #define SPI_INVALID_LENGTH	-4
 #define SPI_FLASHROM_BUG	-5
+#define SPI_PROGRAMMER_ERROR	-6
 
 #endif /* !__SPI_H__ */
Index: flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c
===================================================================
--- flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c	(Revision 1144)
+++ flashrom-sb600_spi_paranoia_concurrent_access/sb600spi.c	(Arbeitskopie)
@@ -58,10 +58,40 @@
 {
 	mmio_writeb(mmio_readb(sb600_spibar + 2) | 0x10, sb600_spibar + 2);
 
+	/* FIXME: This loop makes no sense at all. */
 	while (mmio_readb(sb600_spibar + 0xD) & 0x7)
 		msg_pspew("reset\n");
 }
 
+static int compare_internal_fifo_pointer(uint8_t want)
+{
+	uint8_t tmp;
+
+	tmp = mmio_readb(sb600_spibar + 0xd) & 0x07;
+	want &= 0x7;
+	if (want != tmp) {
+		msg_perr("SB600 FIFO pointer corruption! Pointer is %d, wanted "
+			 "%d\n", tmp, want);
+		msg_perr("Something else is accessing the flash chip and "
+			 "causes random corruption.\nPlease stop all "
+			 "applications and drivers and IPMI which access the "
+			 "flash chip.\n");
+		return 1;
+	} else {
+		msg_pspew("SB600 FIFO pointer is %d, wanted %d\n", tmp, want);
+		return 0;
+	}
+}
+
+static int reset_compare_internal_fifo_pointer(uint8_t want)
+{
+	int ret;
+
+	ret = compare_internal_fifo_pointer(want);
+	reset_internal_fifo_pointer();
+	return ret;
+}
+
 static void execute_command(void)
 {
 	mmio_writeb(mmio_readb(sb600_spibar + 2) | 1, sb600_spibar + 2);
@@ -77,6 +107,7 @@
 	/* First byte is cmd which can not being sent through FIFO. */
 	unsigned char cmd = *writearr++;
 	unsigned int readoffby1;
+	unsigned char readwrite;
 
 	writecnt--;
 
@@ -102,15 +133,17 @@
 	 * It is unclear if the CS# line is set high too early as well.
 	 */
 	readoffby1 = (writecnt) ? 0 : 1;
-	mmio_writeb((readcnt + readoffby1) << 4 | (writecnt), sb600_spibar + 1);
+	readwrite = (readcnt + readoffby1) << 4 | (writecnt);
+	mmio_writeb(readwrite, sb600_spibar + 1);
 	mmio_writeb(cmd, sb600_spibar + 0);
 
 	/* Before we use the FIFO, reset it first. */
 	reset_internal_fifo_pointer();
 
 	/* Send the write byte to FIFO. */
+	msg_pspew("Writing: ");
 	for (count = 0; count < writecnt; count++, writearr++) {
-		msg_pspew(" [%x]", *writearr);
+		msg_pspew("[%02x]", *writearr);
 		mmio_writeb(*writearr, sb600_spibar + 0xC);
 	}
 	msg_pspew("\n");
@@ -119,8 +152,10 @@
 	 * We should send the data by sequence, which means we need to reset
 	 * the FIFO pointer to the first byte we want to send.
 	 */
-	reset_internal_fifo_pointer();
+	if (reset_compare_internal_fifo_pointer(writecnt))
+		return SPI_PROGRAMMER_ERROR;
 
+	msg_pspew("Executing: \n");
 	execute_command();
 
 	/*
@@ -134,22 +169,37 @@
 	 * the opcode, the FIFO already stores the response from the chip.
 	 * Usually, the chip will respond with 0x00 or 0xff.
 	 */
-	reset_internal_fifo_pointer();
+	if (reset_compare_internal_fifo_pointer(writecnt + readcnt))
+		return SPI_PROGRAMMER_ERROR;
 
 	/* Skip the bytes we sent. */
+	msg_pspew("Skipping: ");
 	for (count = 0; count < writecnt; count++) {
 		cmd = mmio_readb(sb600_spibar + 0xC);
-		msg_pspew("[ %2x]", cmd);
+		msg_pspew("[%02x]", cmd);
 	}
+	msg_pspew("\n");
+	if (compare_internal_fifo_pointer(writecnt))
+		return SPI_PROGRAMMER_ERROR;
 
-	msg_pspew("The FIFO pointer after skipping is %d.\n",
-		  mmio_readb(sb600_spibar + 0xd) & 0x07);
+	msg_pspew("Reading: ");
 	for (count = 0; count < readcnt; count++, readarr++) {
 		*readarr = mmio_readb(sb600_spibar + 0xC);
 		msg_pspew("[%02x]", *readarr);
 	}
 	msg_pspew("\n");
+	if (reset_compare_internal_fifo_pointer(readcnt + writecnt))
+		return SPI_PROGRAMMER_ERROR;
 
+	if (mmio_readb(sb600_spibar + 1) != readwrite) {
+		msg_perr("Unexpected change in SB600 read/write count!\n");
+		msg_perr("Something else is accessing the flash chip and "
+			 "causes random corruption.\nPlease stop all "
+			 "applications and drivers and IPMI which access the "
+			 "flash chip.\n");
+		return SPI_PROGRAMMER_ERROR;
+	}
+
 	return 0;
 }
 
@@ -158,6 +208,10 @@
 	struct pci_dev *smbus_dev;
 	uint32_t tmp;
 	uint8_t reg;
+	const char *speed_names[4] = {
+		"Reserved", "33", "22", "16.5"
+	};
+
 	/* Read SPI_BaseAddr */
 	tmp = pci_read_long(dev, 0xa0);
 	tmp &= 0xffffffe0;	/* remove bits 4-0 (reserved) */
@@ -183,15 +237,25 @@
 	msg_pdbg("PrefetchEnSPIFromIMC=%i, ", tmp);
 
 	tmp = pci_read_byte(dev, 0xbb);
+	/* FIXME: Set bit 3,6,7 if not already set.
+	 * Set bit 5, otherwise SPI accesses are pointless in LPC mode.
+	 * See doc 42413 AMD SB700/710/750 RPR.
+	 */
 	msg_pdbg("PrefetchEnSPIFromHost=%i, SpiOpEnInLpcMode=%i\n",
 		 tmp & 0x1, (tmp & 0x20) >> 5);
 	tmp = mmio_readl(sb600_spibar);
+	/* FIXME: If SpiAccessMacRomEn or SpiHostAccessRomEn are zero on
+	 * SB700 or later, reads and writes will be corrupted. Abort in this
+	 * case. Make sure to avoid this check on SB600.
+	 */
 	msg_pdbg("SpiArbEnable=%i, SpiAccessMacRomEn=%i, "
 		 "SpiHostAccessRomEn=%i, ArbWaitCount=%i, "
 		 "SpiBridgeDisable=%i, DropOneClkOnRd=%i\n",
 		 (tmp >> 19) & 0x1, (tmp >> 22) & 0x1,
 		 (tmp >> 23) & 0x1, (tmp >> 24) & 0x7,
 		 (tmp >> 27) & 0x1, (tmp >> 28) & 0x1);
+	tmp = (mmio_readb(sb600_spibar + 0xd) >> 4) & 0x3;
+	msg_pdbg("NormSpeed is %s MHz\n", speed_names[tmp]);
 
 	/* Look for the SMBus device. */
 	smbus_dev = pci_dev_find(0x1002, 0x4385);
@@ -230,6 +294,9 @@
 		return 0;
 	}
 
+	/* Bring the FIFO to a clean state. */
+	reset_internal_fifo_pointer();
+
 	buses_supported |= CHIP_BUSTYPE_SPI;
 	spi_controller = SPI_CONTROLLER_SB600;
 	return 0;
Acked-by: Matthias Kretz kretz@kde.org
On 18.08.2010 15:28, Matthias Kretz wrote:
On Wednesday 18 August 2010 14:26:39 Carl-Daniel Hailfinger wrote:
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
Acked-by: Matthias Kretz kretz@kde.org
Thanks, committed in r1145.
Regards, Carl-Daniel
Hi Daniel,
Thanks for your input. In any case, our process requires that this third-party application is the one that calls flashrom for the BIOS update, so I would like to fix the defect by enhancing flashrom.
Q1: Is that possible in your view?
Q2: I have examined that physical address space with a hardware ICE, and the data at these addresses does not change while the third-party application is running. Can I conclude that it does not access the physical address space used by the SPI controller?
Q3: Do you have any idea how to lock the mapped virtual addresses or /dev/mem to prevent access by other processes?
Best Regards, Hony
-----Original Message----- From: Carl-Daniel Hailfinger [mailto:c-d.hailfinger.devel.2006@gmx.net] Sent: Friday, August 13, 2010 9:26 AM To: hony.chiang (江昆仲 - MIC) Cc: flashrom@flashrom.org Subject: Re: [flashrom] Urgent: Flash part would be destroied by flashrom 0.9.2 once manager application is running.
Hi Hony,
my suspicion is that either the third party application may access flash regions between 0xFFC00000 and 0xFFFFFFFF or that the builtin management engine in the southbridge may be asked by the third party application to access those flash regions or run those commands.
You wrote that the problem does not exist if the third party application (IBM UpdateXpress System Pack Installer) is not running. I see three ways to handle this:
- Always terminate UpdateXpress before running flashrom.
- Have flashrom detect a running UpdateXpress instance and freeze it as long as it accesses flash.
- Ask IBM to modify UpdateXpress in a way that does not access flash unless explicitly requested.
As an alternative, we could add paranoid checks to the SB600 SPI driver and hope that those checks will help detect the issue. Once we detect the issue, we can print a warning and tell the user to stop all other programs accessing the flash.
I will send a SB600 paranoid checks patch as reply to this mail.
Regards, Carl-Daniel
On 12.08.2010 09:19, Hony.Chiang@mic.com.tw wrote:
Hi Daniel,
Here is verbose log about SPI registers for reference.
Best Regards,
Hony
From: hony.chiang (江昆仲 - MIC) Sent: Wednesday, August 11, 2010 11:23 PM To: Carl-Daniel Hailfinger Cc: flashrom@flashrom.org; hony.chiang (江昆仲 - MIC) Subject: RE: [flashrom] Urgent: Flash part would be destroied by flashrom 0.9.2 once manager application is running.
Hi Daniel,
Thanks for your quick response. Please see my answers inline below (marked in blue text), together with the results of my experiments today.
Today I used an AMD HDT hardware ICE to halt the CPU and examine the data at physical addresses 0xFEC10000 ~ 0xFEC100F, which are used by the SPI host controller registers. They change once flashrom is running, and I can also capture SPI opcode commands at address 0xFEC10000. However, they stay fixed even while the third-party application is running, which suggests that the registers are not overwritten by this application. I have also added code to flashrom to set its process priority to -20, but the failure is still there.
Best regards,
Hony
From: Carl-Daniel Hailfinger [mailto:c-d.hailfinger.devel.2006@gmx.net] Sent: 2010/8/11 [星期三] 下午 09:05 To: hony.chiang (江昆仲 - MIC) Cc: flashrom@flashrom.org Subject: Re: [flashrom] Urgent: Flash part would be destroied by flashrom 0.9.2 once manager application is running.
Hi Hony,
we will help you.
On 11.08.2010 04:27, Hony.Chiang@mic.com.tw wrote:
I get the serious problem on flashrom 0.9.2 on our server systems. We always have to run 3rd party manager application to call up flashrom for BIOS upgrade.
Can you tell us the name of the manager application you are using?
UpdateXpress System Pack Installer through a graphical user interface (GUI). http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp?topic=/uxsp...
However, flashrom would be failed on “Verifying flash” step because some of programmed data in flash part is different with ones in golden ROM image file on random offset addresses. This defect occurs on Red Hat 5 x64, SLES 10 x64 and SLES 11 x64. If I run flashrom tool manually without this 3rd application, it would always flash SPI part successfully.
This is good. It means that flashrom works fine if nothing else accesses the flash chip.
Due to the 3rd party application is confidential, it can not be provided to you for defect reproduction.
Q1: May I suspect that expected data that is written to mapped virtual memory space by flashrom is overwritten or interfered by other process from 3rd application before SPI host controller operates these data to SPI ROM?
Q2: May I suspect that expected data that is written to mapped virtual memory space by flashrom is not sync to physical memory space immediately when SPI host controller operates these data to SPI ROM?
flashrom does not write to the mapped memory space of the flash chip on SP5100. flashrom uses only the SPI host controller registers to read/write. If any other software accesses (read or write) the mapped memory space of the flash chip or the SPI host controller registers at the same time, the SPI host controller registers will change in unexpected ways. This can lead to corruption of the flash chip or corruption of reads.
Q3: In order to prevent other process to interfere /dev/mem, mapped virtual memory space or physical memory space that flashrom uses, would you have any idea or some slice codes to lock them during flashrom operation?
Locking under most operating systems is not mandatory, and this means even if flashrom asks for a lock, all other applications can ignore the lock. I don't know any application which does locking on /dev/mem because that might interfere with X.org graphics and other software.
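To illustrate the point about non-mandatory locking, here is a minimal sketch using an ordinary scratch file rather than /dev/mem. It assumes Linux or another system that provides BSD `flock()`; the function name `write_ignores_advisory_lock` and the path passed to it are made up for the example. One descriptor takes an exclusive lock. A second descriptor that asks for the lock is refused, but a plain `write()` on that second descriptor, which never asks, still succeeds.

```c
#define _DEFAULT_SOURCE		/* for flock() under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <unistd.h>

/* Returns 1 if a write went through on a second descriptor while the
 * first one held an exclusive flock() on the same file, 0 if it did
 * not, and -1 on a setup error. The scratch file is removed afterwards. */
static int write_ignores_advisory_lock(const char *path)
{
	int locked_fd, other_fd;
	int result = -1;

	assert(path != NULL);
	locked_fd = open(path, O_RDWR | O_CREAT, 0600);
	if (locked_fd < 0)
		return -1;
	other_fd = open(path, O_RDWR);
	if (other_fd < 0) {
		close(locked_fd);
		unlink(path);
		return -1;
	}

	if (flock(locked_fd, LOCK_EX) == 0) {
		/* A cooperating program asks for the lock and is refused. */
		int polite_refused = (flock(other_fd, LOCK_EX | LOCK_NB) != 0);
		/* A program that never asks can write anyway. */
		int rude_wrote = (write(other_fd, "x", 1) == 1);

		result = polite_refused && rude_wrote;
		flock(locked_fd, LOCK_UN);
	}

	close(other_fd);
	close(locked_fd);
	unlink(path);
	return result;
}
```

The lock only constrains programs that check it, which is why a lock taken by flashrom would not stop an application that accesses the hardware without asking.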
Q4: Would you have any idea to trace the root cause interfered flashrom?
I wrote a patch which should be able to detect interference from other applications. I will send it later once it is tested.
Thank you. I will try it and give you more input.
Verbose message:
Programming flash
done.
COMPLETE.
Verifying flash... VERIFY FAILED at 0x00017ca4! Expected=0xff, Read=0x25, failed
byte count from 0x00000000-0x003fffff: 0x3a
Syntax:
./flashrom -w 24a.rom
Hardware configuration:
MB: AMD platform solution
South Bridge: AMD SP5100
Flash part: ST M25P32
NOS:
Red Hat 5 X86_X64
SLES 10 X86_X64
SLES 11 X86_X64
Could you please send the output of "flashrom -V" so I can check the contents of some SP5100 registers? Thanks.
I will capture the output with the -V parameter in my lab tomorrow and send it to you.
Regards, Carl-Daniel
flashrom mailing list flashrom@flashrom.org http://www.flashrom.org/mailman/listinfo/flashrom
Hi Hony,
Q1: If the third party application must call flashrom, then the only way to fix this is to have flashrom kill the third party application with SIGSTOP and send SIGCONT once flashrom has finished. That should be safe, but it also means that the third party application will be completely frozen while flashrom runs.
Q2: If you look at 0xFFC00000-0xFFFFFFFF with the hardware ICE, can you check that nothing accesses (read/write) the region while flashrom is running? flashrom will not access the region, and the third party application should not access the region because it will result in corruption.
Q3: I think safe locking for /dev/mem is impossible without a new kernel driver. If you can lock /dev/mem completely, Xorg will stop working and you will not have a graphical interface any more.
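The SIGSTOP/SIGCONT approach from Q1 can be sketched as follows. This is an illustration under assumptions, not flashrom code: `critical_fn`, `run_while_stopped`, `run_while_parent_stopped` and `demo_work` are hypothetical names, and using `getppid()` assumes that the application to freeze is the direct parent of the flashrom process, which may not hold if a wrapper script sits in between.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback type for the work done while the other
 * process is frozen (for example, the flash write and verify). */
typedef int (*critical_fn)(void *ctx);

/* Freeze process pid, run work(ctx), then resume pid.
 * Returns -1 if a signal could not be sent, 0 otherwise. The result of
 * work() is stored in *work_ret. SIGCONT is sent even if work() fails. */
static int run_while_stopped(pid_t pid, critical_fn work, void *ctx,
			     int *work_ret)
{
	assert(pid > 0 && work != NULL && work_ret != NULL);
	if (kill(pid, SIGSTOP) != 0)
		return -1;
	*work_ret = work(ctx);
	if (kill(pid, SIGCONT) != 0)
		return -1;
	return 0;
}

/* Convenience wrapper for the case where the caller is our parent. */
static int run_while_parent_stopped(critical_fn work, void *ctx,
				    int *work_ret)
{
	return run_while_stopped(getppid(), work, ctx, work_ret);
}

/* Stand-in for the real flash operation: it only counts its calls. */
static int demo_work(void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return 0;
}
```

Two caveats apply. Signal delivery is asynchronous, so a careful implementation would confirm that the target has actually stopped before touching the flash. And if the process doing the work dies between the two `kill()` calls, the other application stays frozen until someone sends it SIGCONT.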
Does UpdateXpress use flashrom by default, or did you supply your own configuration and tell it to use flashrom?
Please also see my mail with subject: [flashrom] [PATCH] SB600 SPI paranoid checks
It has a patch and some testing instructions.
Regards, Carl-Daniel
On 13.08.2010 04:04, Hony.Chiang@mic.com.tw wrote:
Hi Daniel,
Thanks for your input. Whatever, it must be to run this 3rd application to call up flashrom for BIOS update according to our norm. I would like to do flashrom enhancement for defect fix. Q1:Is it possible according to your view? Q2: I have examined that physical address space with hardware ICE, but data on these addresses is not changed once 3rd application is running. May I ensure that it does not access this physical address space used by SPI controller? Q3: Would you have any idea to lock mapped virtual address or /dev/mem to prevent access by other process?
Best Regards, Hony
-----Original Message----- From: Carl-Daniel Hailfinger [mailto:c-d.hailfinger.devel.2006@gmx.net] Sent: Friday, August 13, 2010 9:26 AM To: hony.chiang (江昆仲 - MIC) Cc: flashrom@flashrom.org Subject: Re: [flashrom] Urgent: Flash part would be destroied by flashrom 0.9.2 once manager application is running.
Hi Hony,
my suspicion is that either the third party application may access flash regions between 0xFFC00000 and 0xFFFFFFFF or that the builtin management engine in the southbridge may be asked by the third party application to access those flash regions or run those commands.
You wrote that the problem does not exist if the third party application (IBM UpdateXpress System Pack Installer) is not running. I see three ways to handle this:
- Always terminate UpdateXpress before running flashrom.
- Have flashrom detect a running UpdateXpress instance and freeze it as
long as it accesses flash.
- Ask IBM to modify UpdateXpress in a way that does not access flash
unless explicitly requested.
As an alternative, we could add paranoid checks to the SB600 SPI driver and hope that those checks will help detect the issue. Once we detect the issue, we can print a warning and tell the user to stop all other programs accessing the flash.
I will send a SB600 paranoid checks patch as reply to this mail.
Regards, Carl-Daniel
On 12.08.2010 09:19, Hony.Chiang@mic.com.tw wrote:
Hi Daniel,
Here is verbose log about SPI registers for reference.
Best Regards,
Hony
From: hony.chiang (江昆仲 - MIC) Sent: Wednesday, August 11, 2010 11:23 PM To: Carl-Daniel Hailfinger Cc: flashrom@flashrom.org; hony.chiang (江昆仲 - MIC) Subject: RE: [flashrom] Urgent: Flash part would be destroied by flashrom 0.9.2 once manager application is running.
Hi Daniel,
Thanks for your quick response. Please kindly see my answers with blue texts as below and share you my experiments today.
I use AMD HDT hardware ICE to interrupt CPU for examining data on physical address 0xFEC10000 ~ 0xFEC100F used by SPI host controller registers today. They are changed once flashrom is running, and I can also capture SPI opcode commands on 0xFEC10000 address. However, they are fixed even though 3rd application is running. It means that registers are not overwritten by this application. I have added code to set process prority -20 in flashrom, but the failure is still there.
Best regards,
Hony
From: Carl-Daniel Hailfinger [mailto:c-d.hailfinger.devel.2006@gmx.net] Sent: 2010/8/11 [星期三] 下午 09:05 To: hony.chiang (江昆仲 - MIC) Cc: flashrom@flashrom.org Subject: Re: [flashrom] Urgent: Flash part would be destroied by flashrom 0.9.2 once manager application is running.
Hi Hony,
we will help you.
On 11.08.2010 04:27, Hony.Chiang@mic.com.tw wrote:
I get the serious problem on flashrom 0.9.2 on our server systems. We always have to run 3rd party manager application to call up flashrom for BIOS upgrade.
Can you tell us the name of the manager application you are using?
UpdateXpress System Pack Installer through a graphical user interface (GUI). http://publib.boulder.ibm.com/infocenter/toolsctr/v1r0/index.jsp?topic=/uxsp...
However, flashrom would be failed on “Verifying flash” step because some of programmed data in flash part is different with ones in golden ROM image file on random offset addresses. This defect occurs on Red Hat 5 x64, SLES 10 x64 and SLES 11 x64. If I run flashrom tool manually without this 3rd application, it would always flash SPI part successfully.
This is good. It means that flashrom works fine if nothing else accesses the flash chip.
Due to the 3rd party application is confidential, it can not be provided to you for defect reproduction.
Q1: May I suspect that expected data that is written to mapped virtual memory space by flashrom is overwritten or interfered by other process from 3rd application before SPI host controller operates these data to SPI ROM?
Q2: May I suspect that expected data that is written to mapped virtual memory space by flashrom is not sync to physical memory space immediately when SPI host controller operates these data to SPI ROM?
flashrom does not write to the mapped memory space of the flash chip on SP5100. flashrom uses only the SPI host controller registers to read/write. If any other software accesses (read or write) the mapped memory space of the flash chip or the SPI host controller registers at the same time, the SPI host controller registers will change in unexpected ways. This can lead to corruption of the flash chip or corruption of reads.
Q3: To prevent other processes from interfering with /dev/mem, the mapped virtual memory space or the physical memory space that flashrom uses, do you have any idea or code snippet for locking them during flashrom operation?
Locking under most operating systems is not mandatory, and this means even if flashrom asks for a lock, all other applications can ignore the lock. I don't know any application which does locking on /dev/mem because that might interfere with X.org graphics and other software.
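To illustrate the point about non-mandatory locking, here is a minimal sketch in C. It is not part of flashrom and uses an ordinary scratch file instead of /dev/mem; the function name and the file path are made up for the example. One descriptor takes an exclusive flock() lock, and a second descriptor that never asks for the lock can still write to the same file.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not flashrom code).
 * Returns 1 if a second descriptor could write to a file while the first
 * descriptor held an exclusive flock() on it, 0 if that write failed,
 * and -1 if the scratch file or the lock could not be set up.
 */
int write_succeeds_despite_lock(void)
{
	char path[] = "/tmp/advisory_lock_demo_XXXXXX";
	int locked_fd = mkstemp(path);	/* create and open a scratch file */
	int other_fd = -1;
	int result = -1;

	if (locked_fd < 0)
		return -1;

	if (flock(locked_fd, LOCK_EX) == 0) {
		/* A "non-cooperating" writer: opens the file, never calls flock(). */
		other_fd = open(path, O_WRONLY);
		if (other_fd >= 0)
			result = (write(other_fd, "x", 1) == 1) ? 1 : 0;
	}

	if (other_fd >= 0)
		close(other_fd);
	close(locked_fd);	/* closing the descriptor also drops the lock */
	unlink(path);
	return result;
}
```

On a typical Linux system the function should return 1, because flock() only coordinates processes that also call flock(). A lock taken by flashrom would therefore not stop an application that simply ignores it.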
Q4: Do you have any idea how to trace the root cause of the interference with flashrom?
I wrote a patch which should be able to detect interference from other applications. I will send it later once it is tested.
Thank you. I will try it and give you more input.
Verbose message:
Programming flash
done.
COMPLETE.
Verifying flash... VERIFY FAILED at 0x00017ca4! Expected=0xff, Read=0x25, failed byte count from 0x00000000-0x003fffff: 0x3a
Syntax:
./flashrom -w 24a.rom
Hardware configuration:
MB: AMD platform solution
South Bridge: AMD SP5100
Flash part: ST M25P32
NOS:
Red Hat 5 X86_X64
SLES 10 X86_X64
SLES 11 X86_X64
Could you please send the output of "flashrom -V" so I can check the contents of some SP5100 registers? Thanks.
I will capture the output with the -V parameter in my lab tomorrow and send it to you.
Regards, Carl-Daniel
flashrom mailing list flashrom@flashrom.org http://www.flashrom.org/mailman/listinfo/flashrom
Hi Daniel,
Sorry to trouble you again.
A1: This application handles the whole update process, including capturing the return code and messages from flashrom, and finally reports the update result in its UI for the user. So this idea cannot be implemented.
A2: I only looked at 0xFEC10000 ~ 0xFEC1000F (SPI host base address). The application does not access this region; only flashrom does. As far as I know, the SPI ROM only accepts SPI commands and data for programming through the SPI host base address, so even writing to the 0xFFC00000-0xFFFFFFFF space should not affect the data in the SPI ROM. Am I right? My SLES 11 OS sometimes crashes when flashrom and the application run at the same time, and the BIOS part is also damaged.
A3: We integrated flashrom into this application ourselves; it is not called by default.
How can I see your mail with subject: [flashrom] [PATCH] SB600 SPI paranoid checks?
Best Regards, Hony
-----Original Message-----
From: Carl-Daniel Hailfinger [mailto:c-d.hailfinger.devel.2006@gmx.net]
Sent: Friday, August 13, 2010 10:22 AM
To: hony.chiang (江昆仲 - MIC)
Cc: flashrom@flashrom.org
Subject: Re: [flashrom] Urgent: Flash part would be destroied by flashrom 0.9.2 once manager application is running.
Hi Hony,
Q1: If the third party application must call flashrom, then the only way to fix this is to have flashrom kill the third party application with SIGSTOP and send SIGCONT once flashrom has finished. That should be safe, but it also means that the third party application will be completely frozen while flashrom runs.
Q2: If you look at 0xFFC00000-0xFFFFFFFF with the hardware ICE, can you check that nothing accesses (read/write) the region while flashrom is running? flashrom will not access the region, and the third party application should not access the region because it will result in corruption.
Q3: I think safe locking for /dev/mem is impossible without a new kernel driver. If you can lock /dev/mem completely, Xorg will stop working and you will not have a graphical interface any more.
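For anyone watching the mapped window with the ICE, the address arithmetic can be sketched as follows. This assumes the 4 MiB chip is mapped top-aligned so that its last byte sits at 0xFFFFFFFF, which matches the 0xFFC00000-0xFFFFFFFF range quoted above; the function name is made up for the example and is not flashrom code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: physical address of a flash offset, assuming the
 * chip is mapped so that it ends at the top of the 32-bit address space.
 */
uint32_t flash_offset_to_phys(uint32_t chip_size, uint32_t offset)
{
	/* Unsigned wrap-around gives 4 GiB - chip_size, e.g. 0xFFC00000 for 4 MiB. */
	uint32_t window_base = (uint32_t)0 - chip_size;

	return window_base + offset;
}
```

Under that assumption, the failing offset 0x00017ca4 from the verify message corresponds to physical address 0xFFC17CA4 (chip_size = 0x400000), so an access breakpoint near that address might show who touches the region.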
Does UpdateXpress use flashrom by default, or did you supply your own configuration and tell it to use flashrom?
Please also see my mail with subject: [flashrom] [PATCH] SB600 SPI paranoid checks
It has a patch and some testing instructions.
Regards, Carl-Daniel
On 13.08.2010 04:04, Hony.Chiang@mic.com.tw wrote:
Hi Daniel,
Thanks for your input. In any case, our procedure requires that this 3rd party application is running and calls flashrom for the BIOS update. I would like to enhance flashrom to fix the defect.
Q1: Is that possible in your view?
Q2: I have examined that physical address space with the hardware ICE, but the data at these addresses does not change while the 3rd party application is running. Can I conclude that it does not access the physical address space used by the SPI controller?
Q3: Do you have any idea how to lock the mapped virtual address range or /dev/mem to prevent access by other processes?
Best Regards, Hony
-----Original Message-----
From: Carl-Daniel Hailfinger [mailto:c-d.hailfinger.devel.2006@gmx.net]
Sent: Friday, August 13, 2010 9:26 AM
To: hony.chiang (江昆仲 - MIC)
Cc: flashrom@flashrom.org
Subject: Re: [flashrom] Urgent: Flash part would be destroied by flashrom 0.9.2 once manager application is running.
Hi Hony,
my suspicion is that either the third party application may access flash regions between 0xFFC00000 and 0xFFFFFFFF or that the builtin management engine in the southbridge may be asked by the third party application to access those flash regions or run those commands.
You wrote that the problem does not exist if the third party application (IBM UpdateXpress System Pack Installer) is not running. I see three ways to handle this:
- Always terminate UpdateXpress before running flashrom.
- Have flashrom detect a running UpdateXpress instance and freeze it as
long as it accesses flash.
- Ask IBM to modify UpdateXpress in a way that does not access flash
unless explicitly requested.
As an alternative, we could add paranoid checks to the SB600 SPI driver and hope that those checks will help detect the issue. Once we detect the issue, we can print a warning and tell the user to stop all other programs accessing the flash.
I will send a SB600 paranoid checks patch as reply to this mail.
Regards, Carl-Daniel
On 12.08.2010 09:19, Hony.Chiang@mic.com.tw wrote:
Hi Daniel,
Here is the verbose log with the SPI registers for reference.
Best Regards,
Hony
From: hony.chiang (江昆仲 - MIC)
Sent: Wednesday, August 11, 2010 11:23 PM
To: Carl-Daniel Hailfinger
Cc: flashrom@flashrom.org; hony.chiang (江昆仲 - MIC)
Subject: RE: [flashrom] Urgent: Flash part would be destroied by flashrom 0.9.2 once manager application is running.
Hi Daniel,
Thanks for your quick response. Please see my answers in blue text below, along with the experiments I ran today.
Hi Hony,
A1: The "kill" command would only send a STOP signal to freeze the third-party application; it would not kill it completely. When flashrom is done, it would send a CONT signal with the "kill" command and the third-party application would unfreeze.
A2: If you read or write 0xFFC00000-0xFFFFFFFF, SPI registers will change automatically. If SPI registers change while flashrom reads/writes the flash chip, the result will be corrupt.
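The freeze and unfreeze sequence from A1 could look roughly like the sketch below. It is only an illustration, not the code of any flashrom patch: the function names are made up, and the self-check forks a child process as a stand-in for the third-party application, whereas flashrom started by the application would more likely target its parent (getppid()).

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Suspend a process. SIGSTOP cannot be caught or ignored by the target. */
int freeze_process(pid_t pid)
{
	return kill(pid, SIGSTOP);
}

/* Let a previously suspended process run again. */
int resume_process(pid_t pid)
{
	return kill(pid, SIGCONT);
}

/*
 * Self-check: fork a child standing in for the other application, freeze
 * it, confirm that the kernel reports it as stopped, then resume it.
 * Returns 0 if every step behaved as expected, -1 otherwise.
 */
int demo_freeze_and_resume(void)
{
	int status = 0;
	int ok = 0;
	pid_t child = fork();

	if (child < 0)
		return -1;
	if (child == 0) {
		for (;;)
			pause();	/* child idles until it is killed */
	}

	if (freeze_process(child) == 0 &&
	    waitpid(child, &status, WUNTRACED) == child &&
	    WIFSTOPPED(status)) {
		/* ... the flash read/write would happen here ... */
		if (resume_process(child) == 0)
			ok = 1;
	}

	kill(child, SIGKILL);		/* clean up the stand-in process */
	waitpid(child, &status, 0);
	return ok ? 0 : -1;
}
```

Note that waitpid() only works here because the stand-in is a child of the caller. A process that freezes its own parent cannot wait for it this way and would need another method to confirm that the parent has really stopped.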
Do you know if any other firmware, IPMI or baseboard management software runs at the same time as flashrom? That could be a problem as well.
A crash is also possible if SMI/SMM is active while flashing and if SMM code lives in ROM.
I sent the mail with subject: [flashrom] [PATCH] SB600 SPI paranoid checks to your e-mail address, but you can also download the patch at http://patchwork.coreboot.org/patch/1737/raw/
Good luck!
Regards, Carl-Daniel
Hi Hony,
can you please update flashrom to the latest version from svn and then run flashrom -V -r test.rom while the 3rd party application is running?
This should detect the corruption you were seeing. The corruption is not fixed yet, but detecting it is a good first step. Thanks!
Regards, Carl-Daniel