Julius Werner has posted comments on this change. ( https://review.coreboot.org/c/coreboot/+/39894 )
Change subject: tests: Add build subsystem and example for unit tests ......................................................................
Patch Set 1:
(4 comments)
https://review.coreboot.org/c/coreboot/+/39894/1/tests/Makefile.inc File tests/Makefile.inc:
https://review.coreboot.org/c/coreboot/+/39894/1/tests/Makefile.inc@26 PS1, Line 26: # Minimal required set for simplicity sake
Understand. […]
If this is included as part of coreboot's toplevel Makefile then yes, you should be able to just reference CFLAGS_common and CPPFLAGS_common (from $(top)/Makefile.inc).
Getting the full flags is actually somewhat complicated... CFLAGS_common is added to CFLAGS_<stage> in toolchain.inc, together with stage-specific flags. There's also generic-c-ccopts and <stage>-c-ccopts which get added to the final build flags as well. (There's no good reason why we have two ways of defining stage-specific flags, I think, IIRC that was just two people trying to implement the same thing at the same time. Could also be a good opportunity for cleanup.) Maybe we can get away with only using CFLAGS_common, since the stage-specific stuff should rarely affect compilation. There is of course code that behaves differently for different stages, but we may want to find a separate solution to mock that out anyway (to give individual tests the ability to change the values of ENV_BOOTBLOCK, ENV_RAMSTAGE, etc.).
Maybe it would be better to keep fully separate test flags after all and just manually copy what the tests need. I think -std=gnu11, -Os, -ffunction-sections, -fdata-sections and -Wl,--gc-sections are definitely important. All the warning options are probably unimportant. For things like -nostdlib, -nostdinc or -ffreestanding, you'll have to figure out how to best make it work (because you do want to generate a normal userspace program with a main() function, but the coreboot code still needs to be able to work the way it was designed to).
At lot of issues we're probably only really going to see when we try adding more complicated tests.
https://review.coreboot.org/c/coreboot/+/39894/1/tests/Makefile.inc@37 PS1, Line 37: TEST_LDFLAGS = -lcmocka You're not using any of the actual function mocking features of cmocka in your example, btw. Once you do, you'll need more linker flags here to allow the __wrap_ stuff to work.
https://review.coreboot.org/c/coreboot/+/39894/1/tests/Makefile.inc@50 PS1, Line 50: $$(addprefix $$(SRC_DIR),$$(subst -test,.c,$$*))
If this constraint cannot be met, it means that code is not well designed (or cannot be tested). Of course, we can adjust tests so that they will pass, but in my opinion this should go another way around - code should be modified to be testable and to pass all the tests.
Well, I would be careful with this mindset. We have to remember that we are dealing with firmware code and not any odd userspace tool. In userspace it may often be acceptable to design for testability first and performance/size/whatever later, and it's no big deal to add another function call boundary here and there or artificially overcomplicate an interface to make it easier to mock. But for firmware code, restrictions may be more serious. I think we all want to add tests, but we do not want to make coreboot any worse for it.
If there's really no way to test something without changes to coreboot we can evaluate whether those changes are acceptable, but needing to link more than one file doesn't really sound complicated enough to warrant that. It just sounds like a pretty arbitrary restriction to me.
Can you think of and point me to the example, where we would need to incorporate multiple files in the unit test binary? Sounds rather like an integration testing, not unit testing.
Well... hmmm... I guess I'm mostly worried about common helper stuff that's used everywhere. Say you want to test fmap_locate_area() in src/lib/fmap.c. A reasonable unit test would be that you have a hardcoded FMAP blob in the test and you try to check that you can find all areas in there. That basically only involves functions from that file, so it should work... except that it's also using the rdev API (src/commonlib/region.c) which is a an API used all over coreboot to interface with data sources (like SPI flash in actual firmware, or a mock buffer in memory in this case). The rdev API already supports in-memory buffers with mem_region_device so it would be very easy to just use that to mock out boot_device_ro() in my test and return a device representing my mocked FMAP buffer. But in order to do that, we have to be able to link src/commonlib/region.c somehow. (Of course there'd be another unit test covering the rdev API itself, so that when we use it to test fmap.c we can already be sure it is internally correct.)
When talking about splitting a function in two - does this happen frequently for functions available externally?
Yes, I think changing underlying interfaces while keeping the API the same is a common thing. We refactor and modernize stuff in coreboot all the time. I'm trying to come up with a good example... hmm... let's say the difference between these two patch sets that I recently reviewed: https://review.coreboot.org/c/coreboot/+/39002/6..7/src/lib/edid_fill_fb.c . Patrick originally just wrote this with a implicit linked list hardcoded in the structure. Then I mentioned we have src/lib/list.c instead which can be used for that. In this case this change already happened during review, but let's say patch set 6 got committed as is (including unit tests) and then some time later we decided to switch to a common list framework throughout coreboot. Suddenly all the unit tests for files that used their own list structures like this before would become invalid because of this otherwise transparent change. Does this mean the test that still tests the same thing suddenly became an integration test, just because the list framework got factored out? Regardless of the ideological answer to that question, I think it is nonsensical from a practical standpoint to throw away or rewrite all those tests in such a situation. They would still be perfectly fine tests -- in fact, you *want* to run them after that refactoring change to make sure it behaves with the new list implementation exactly the same as before, that's the main point of having tests! What we don't want is create a barrier to helpful refactoring and cleanup by creating an artificial requirement to rewrite a disproportionate amount of tests from a small, otherwise transparent change.
I think the most practical solution here would be to have a Makefile.inc in every test subdirectory (same as we do for the normal source tree) and then allow every test to specify which files it needs to link. With the right Makefile magic you should be able to make that look decent enough... for example you could have a test/lib/Makefile.inc with something like:
fmap-test += src/commonlib/region.c fmap-test += test/helpers/rdev_mock_helper.c # whether test/lib/fmap-test.c and src/lib/fmap.c need to be added explicitly or get deduced automatically is something we can decide
# could also do something like this for the header mocking I mentioned in the other patch fmap-test-includes += test/include/mocks/arch/cache_mocks
# would be extensible to provide other stuff we may find we need later, too... fmap-test-stage = verstage fmap-test-config += CONFIG_NO_FMAP_CACHE=y
https://review.coreboot.org/c/coreboot/+/39894/1/tests/include/config.h File tests/include/config.h:
PS1:
?Good question. I'm proposing to use qemu-x86 . […]
No, I definitely didn't mean that users should manually adjust .config when running certain tests. I meant that certain tests may need to specify extra configs they want to be built with that deviate from the normal qemu-x86 base config. So we'll need some way in the Makefile to specify that when it happens (see my suggestion in the other comment). Often we'll probably even want to test the same file with two different Kconfigs -- one where the option that changes that file's behavior is enabled and one where it's disabled. Since Kconfigs are compile-time constant (and that's probably not something we'd want to try to change, even for tests, it would probably break too many things) that would probably mean we want two test files for a single source file under test. (Or maybe just a single test file with #ifdefs that is built and run twice? Would have to see what works better in practice.)