Hi, Once upon a time, testing flashrom was simple: Read, erase, write, verify, done.
Over the past few months, we've introduced some very important features to Flashrom. Unfortunately, testing is no longer quite so simple. Partial writes require that we test various patterns (and we do with a newly introduced torture script), and we do that to some extend with the torture test that we recently checked in. Embedded controllers introduce a whole other world of pain. Long story short, a better testing strategy is needed.
To get the ball rolling, I have a work-in-progress patch for the Chromium.org branch of Flashrom right now to introduce a unit testing framework and would appreciate thoughts / comments: http://codereview.chromium.org/5136001 . You can download the unified diff from there (as to avoid duplication here). It's written in shell script (tested on dash and bash) since I felt that's the lowest common denominator and the lowest barrier-to-entry for folks wanting to add/modify the tests. Feel free to post comments in this thread, since the code review will be closed once the code is checked in.
The idea is simple: One master script does generic setup, backs up the firmware image, runs unit tests, and unconditionally restores the firmware image at the end. The user sets a few (hopefully no more than two or three) environment variables and runs the master script, specifying desired unit tests as arguments on the command line. Each unit test should be independent, and should have its own documentation in case it needs special environment variables or something. Due to the variety of tests which may be performed, it's up to the user to decide which tests to run on their hardware.
If you're curious to try the patch, here is a simple invocation that will test partial writes for an x86 BIOS chip: ./do_tests.sh partial_writes_x86_bios.sh
I've attached some sample console output for a test run that does partial writes for x86 BIOS chip and some write protection commands. In this example, I had to pass in some env variables and specify multiple tests as arguments. The invocation: FLASHROM="../flashrom" FLASHROM_PARAM="-p internal:bus=spi" sh do_tests.sh partial_writes_x86_bios.sh chip_size.sh wp-toggle.sh wp-range.sh
If I were to test some EC stuff, I'd need to add additional parameters since my EC test script requires a special layout file and an secondary EC firmware image: LAYOUT_FILE=/tmp/ec_layout.txt ALT_EC_IMAGE=/tmp/alt_ec_image.bin FLASHROM="../flashrom" FLASHROM_PARAM="-p internal:bus=lpc" ./do_tests.sh partial_writes_ec.sh
Hi David,
On 23.11.2010 01:31, David Hendricks wrote:
Once upon a time, testing flashrom was simple: Read, erase, write, verify, done.
Over the past few months, we've introduced some very important features to Flashrom. Unfortunately, testing is no longer quite so simple. Partial writes require that we test various patterns (and we do with a newly introduced torture script), and we do that to some extend with the torture test that we recently checked in. Embedded controllers introduce a whole other world of pain. Long story short, a better testing strategy is needed.
Indeed. The biggest problem with any tests right now is that you can either write them for the flashrom-chromium tree or for mainline flashrom because the flashrom messages are incompatible/different.
To get the ball rolling, I have a work-in-progress patch for the Chromium.org branch of Flashrom right now to introduce a unit testing framework and would appreciate thoughts / comments: http://codereview.chromium.org/5136001 . You can download the unified diff from there (as to avoid duplication here). It's written in shell script (tested on dash and bash) since I felt that's the lowest common denominator and the lowest barrier-to-entry for folks wanting to add/modify the tests. Feel free to post comments in this thread, since the code review will be closed once the code is checked in.
"Code review will be closed" means the comments will be unaccessible after checkin?
The idea is simple: One master script does generic setup, backs up the firmware image, runs unit tests, and unconditionally restores the firmware image at the end.
Agreed. Here is an interesting question: Given that the initial backup and the final restore are done with the system flashrom binary, shouldn't we try to restore the backup directly after making the backup? If that fails, we can't restore the image at the end, and we're better off detecting this problem before we mess with the chip.
The user sets a few (hopefully no more than two or three) environment variables and runs the master script, specifying desired unit tests as arguments on the command line. Each unit test should be independent, and should have its own documentation in case it needs special environment variables or something. Due to the variety of tests which may be performed, it's up to the user to decide which tests to run on their hardware.
In this case we should warn the user that each of the tests has the potential to brick his device. AFAICS the partial write torture test can/will result in a hanging EC if the EC code lives in the bottom 128 kB of the flash chip.
If you're curious to try the patch, here is a simple invocation that will test partial writes for an x86 BIOS chip: ./do_tests.sh partial_writes_x86_bios.sh
I've attached some sample console output for a test run that does partial writes for x86 BIOS chip and some write protection commands. In this example, I had to pass in some env variables and specify multiple tests as arguments. The invocation: FLASHROM="../flashrom" FLASHROM_PARAM="-p internal:bus=spi" sh do_tests.sh partial_writes_x86_bios.sh chip_size.sh wp-toggle.sh wp-range.sh
Why the "sh" after FLASHROM_PARAM="..."?
If I were to test some EC stuff, I'd need to add additional parameters since my EC test script requires a special layout file and an secondary EC firmware image: LAYOUT_FILE=/tmp/ec_layout.txt ALT_EC_IMAGE=/tmp/alt_ec_image.bin FLASHROM="../flashrom" FLASHROM_PARAM="-p internal:bus=lpc" ./do_tests.sh partial_writes_ec.sh
The big question for the size and protection tests is what you actually intend to test and whether the test will work with any chips besides the one you wrote the tests for. This is not a complaint directed at the RFC, it is rather a hint that writing good tests is hard (but you already know that).
Overall, I think that more tests are good.
Regards, Carl-Daniel
Hi Carl-Daniel,
On Tue, Nov 23, 2010 at 11:50 AM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
Agreed. Here is an interesting question: Given that the initial backup and the final restore are done with the system flashrom binary, shouldn't we try to restore the backup directly after making the backup? If that fails, we can't restore the image at the end, and we're better off detecting this problem before we mess with the chip.
Good point! I think there's no big point in restoring it right away, because if that fails we have an inconsistent state anyways, so we won't mess much up when running the tests. But we could try to restore it again with the flashrom under test binary if a restore with the system binary fails. It might be that the system binary is old enough and the tested binary is at least able to get the flash part back into a non-bricked state (even if some of the tests might fail)
In this case we should warn the user that each of the tests has the potential to brick his device.
Absolutely! Maybe the do_test script should ask for manual confirmation if not run with a --non-interactive parameter.
AFAICS the partial write torture test can/will result in a hanging EC if the EC code lives in the bottom 128 kB of the flash chip.
Is there some way we can detect this? Or have a black list with devices that have this issue?
arguments. The invocation: FLASHROM="../flashrom" FLASHROM_PARAM="-p internal:bus=spi" sh do_tests.sh partial_writes_x86_bios.sh chip_size.sh wp-toggle.sh wp-range.sh
Why the "sh" after FLASHROM_PARAM="..."?
Assuming this is because util/ or . is not in the users path, I guess ./do_tests.sh would do the job, too.
The big question for the size and protection tests is what you actually intend to test and whether the test will work with any chips besides the one you wrote the tests for. This is not a complaint directed at the RFC, it is rather a hint that writing good tests is hard (but you already know that).
The test should _work_ with any flash chip unless there is a bug in flashrom. That does not necessarily imply that it tests something useful though.
Overall, I think that more tests are good.
Do you have suggestions what kinds of tests would be needed?
Does it make sense to add instrumentation (code coverage) support so we can figure out the test depth? gcc in user space makes this kind of trivial.
On Tue, Nov 23, 2010 at 11:50 AM, Carl-Daniel Hailfinger < c-d.hailfinger.devel.2006@gmx.net> wrote:
Hi David,
On 23.11.2010 01:31, David Hendricks wrote:
Once upon a time, testing flashrom was simple: Read, erase, write,
verify,
done.
Over the past few months, we've introduced some very important features
to
Flashrom. Unfortunately, testing is no longer quite so simple. Partial writes require that we test various patterns (and we do with a newly introduced torture script), and we do that to some extend with the
torture
test that we recently checked in. Embedded controllers introduce a whole other world of pain. Long story short, a better testing strategy is
needed.
Indeed. The biggest problem with any tests right now is that you can either write them for the flashrom-chromium tree or for mainline flashrom because the flashrom messages are incompatible/different.
Ah yes... Flashrom messages have been a pain point for us. The upstream messages are too verbose, annoy developers, and confuse scripts. The messages we have now were designed to work around issues as we encountered them. I hope to work on a long-term fix soon...
On Tue, Nov 23, 2010 at 11:50 AM, Carl-Daniel Hailfinger < c-d.hailfinger.devel.2006@gmx.net> wrote:
To get the ball rolling, I have a work-in-progress patch for the Chromium.org branch of Flashrom right now to introduce a unit testing framework and would appreciate thoughts / comments: http://codereview.chromium.org/5136001 . You can download the unified
diff
from there (as to avoid duplication here). It's written in shell script (tested on dash and bash) since I felt that's the lowest common
denominator
and the lowest barrier-to-entry for folks wanting to add/modify the tests. Feel free to post comments in this thread, since the code review
will
be closed once the code is checked in.
"Code review will be closed" means the comments will be unaccessible after checkin?
No. You can still make comments after code is submitted, but new patches will have a different URL. Either way, I figure it's best to consolidate general discussion here so that we don't have the discussion spread across multiple code reviews.
On Tue, Nov 23, 2010 at 11:50 AM, Carl-Daniel Hailfinger < c-d.hailfinger.devel.2006@gmx.net> wrote:
The idea is simple: One master script does generic setup, backs up the firmware image, runs unit tests, and unconditionally restores the
firmware
image at the end.
Agreed. Here is an interesting question: Given that the initial backup and the final restore are done with the system flashrom binary, shouldn't we try to restore the backup directly after making the backup? If that fails, we can't restore the image at the end, and we're better off detecting this problem before we mess with the chip.
Good point! However, before we implement that, we'll need to modify flashrom to force rewrite all blocks. Otherwise, all blocks will simply be skipped.
On Tue, Nov 23, 2010 at 11:50 AM, Carl-Daniel Hailfinger < c-d.hailfinger.devel.2006@gmx.net> wrote:
The user sets a few (hopefully no more than two or three) environment variables and runs the master script, specifying desired unit tests as arguments on the command line. Each unit test should be independent, and should have its own documentation in case it needs
special
environment variables or something. Due to the variety of tests which may
be
performed, it's up to the user to decide which tests to run on their hardware.
In this case we should warn the user that each of the tests has the potential to brick his device. AFAICS the partial write torture test can/will result in a hanging EC if the EC code lives in the bottom 128 kB of the flash chip.
An interesting thought, but I think people get used to ignoring warnings. What if we created two (or more) sub-directories to categorize tests and keep dangerous ones separate? For starters, let's call them default_tests/ and unsafe_tests/.
default_tests/ would contain tests that any normal user can run relatively safely to test for simple regressions. We'd put a command on the wiki page for random interested people to run all of these tests. Maybe even have a machine @ coreboot.org automatically run them on patches sent to the mailing list and e-mail a response if it failed.
unsafe_tests/* would have the other crazy shit like EC tests. Users would need to be much more careful about hardware compatibility, special environment variables that need to be set, etc.
On Tue, Nov 23, 2010 at 11:50 AM, Carl-Daniel Hailfinger < c-d.hailfinger.devel.2006@gmx.net> wrote:
If you're curious to try the patch, here is a simple invocation that will test partial writes for an x86 BIOS chip: ./do_tests.sh partial_writes_x86_bios.sh
I've attached some sample console output for a test run that does partial writes for x86 BIOS chip and some write protection commands. In this example, I had to pass in some env variables and specify multiple tests
as
arguments. The invocation: FLASHROM="../flashrom" FLASHROM_PARAM="-p internal:bus=spi" sh do_tests.sh partial_writes_x86_bios.sh chip_size.sh wp-toggle.sh wp-range.sh
Why the "sh" after FLASHROM_PARAM="..."?
No reason. I often run with sh -x for debugging purposes, that is probably just a remnant.
On Tue, Nov 23, 2010 at 11:50 AM, Carl-Daniel Hailfinger < c-d.hailfinger.devel.2006@gmx.net> wrote:
If I were to test some EC stuff, I'd need to add additional parameters
since
my EC test script requires a special layout file and an secondary EC firmware image: LAYOUT_FILE=/tmp/ec_layout.txt ALT_EC_IMAGE=/tmp/alt_ec_image.bin FLASHROM="../flashrom" FLASHROM_PARAM="-p internal:bus=lpc" ./do_tests.sh partial_writes_ec.sh
The big question for the size and protection tests is what you actually intend to test and whether the test will work with any chips besides the one you wrote the tests for. This is not a complaint directed at the RFC, it is rather a hint that writing good tests is hard (but you already know that).
Yes, indeed! There are numerous assumptions made about the size and capabilities of the chips that need to be abstracted better. I think your SPI chip emulator patch will help tremendously here.
On Tue, Nov 23, 2010 at 2:29 PM, David Hendricks dhendrix@google.com wrote:
On Tue, Nov 23, 2010 at 11:50 AM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote: Ah yes... Flashrom messages have been a pain point for us. The upstream messages are too verbose, annoy developers, and confuse scripts. The messages we have now were designed to work around issues as we encountered them. I hope to work on a long-term fix soon...
They're also a pain point for users. While "Error: WIP bit after WRSR never cleared" is nice for everyone who knows the code, I can imagine quite some people would find "Could not disable Write Protection" more useful.
Agreed. Here is an interesting question: Given that the initial backup and the final restore are done with the system flashrom binary, shouldn't we try to restore the backup directly after making the backup? If that fails, we can't restore the image at the end, and we're better off detecting this problem before we mess with the chip.
Good point! However, before we implement that, we'll need to modify flashrom to force rewrite all blocks. Otherwise, all blocks will simply be skipped.
By the time we figured out that we can not restore, the chip is already messed up, no?
In this case we should warn the user that each of the tests has the potential to brick his device. AFAICS the partial write torture test can/will result in a hanging EC if the EC code lives in the bottom 128 kB of the flash chip.
An interesting thought, but I think people get used to ignoring warnings.
Hence asking them "Do you want to proceed" and expecting them to enter "I consciously do" or some such. That will get their attention (or keep everyone from running tests)
What if we created two (or more) sub-directories to categorize tests and keep dangerous ones separate? For starters, let's call them default_tests/ and unsafe_tests/.
Are not all tests kind of unsafe? (except just reading the chip or looking at it's size, but that's unsafe too if we "recover" the image at the end.)
Yes, indeed! There are numerous assumptions made about the size and capabilities of the chips that need to be abstracted better. I think your SPI chip emulator patch will help tremendously here.
Hm.. can we use the chip emulator with some kind of error injection emulation and see how well flashrom detects certain errors?
Stefan
On 23.11.2010 23:47, Stefan Reinauer wrote:
On Tue, Nov 23, 2010 at 2:29 PM, David Hendricks dhendrix@google.com wrote:
Yes, indeed! There are numerous assumptions made about the size and capabilities of the chips that need to be abstracted better. I think your SPI chip emulator patch will help tremendously here.
Hm.. can we use the chip emulator with some kind of error injection emulation and see how well flashrom detects certain errors?
Sure. I posted such a patch a few weeks ago and I'm waiting for an ack.
http://patchwork.coreboot.org/patch/2246/
Regards, Carl-Daniel