[coreboot] r3777 build service

Carl-Daniel Hailfinger c-d.hailfinger.devel.2006 at gmx.net
Sun Nov 30 01:51:45 CET 2008


On 29.11.2008 11:53, Stefan Reinauer wrote:
> Carl-Daniel Hailfinger wrote:
>   
>> I think I can fix that serialization problem,
>>     
> Aha? Mind to share the insight?
>   

Sure. You found the bug. Really. Remember what you wrote earlier? Let me
quote you:

> The problem is that we end up with a valid option_table.o file (not
> truncated) but it has no symbols and no data in there.
> The file is 900 something bytes, which is exactly the size of an elf
> object created by
> touch foo.c
> gcc -c foo.c -o foo.o

And that's exactly what was happening. How?

Look at how option_table.c and option_table.o are generated.
util/options/build_opt_tbl.c is doing that and it seems to be doing this
in a way that confuses make.
How can creating a simple file confuse make? GCC does it all the time.
The answer is that gcc does it differently.
Make depends entirely on timestamps. It also assumes that if a file is
present, it is usable. (Reread the last sentence, it is important.)
The only way to guarantee that a file is usable the instant it appears
(and so avoid race conditions) is to make creating and writing the whole
file one atomic operation. There is no way to do that directly with the
standard fopen/fwrite/fclose.
There is one way out: Use an atomic operation to make the whole file
available. Rename is atomic. Create(open)/write/close a file with a
temporary unique name, and after that is done, rename it to the file you
wanted to create in the first place.
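gcc does it that way and make is happy. build_opt_tbl creates the files
directly and has a HUGE race between fopen and fclose. We're hitting
that race condition: if make picks up option_table.c inside that window,
gcc presumably compiles a still-empty file, which gives exactly the
symbol-free object you described.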
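
To make the write-then-rename idea concrete, here is a minimal sketch in
C. It is not taken from build_opt_tbl.c: the helper name
write_file_atomically, its signature, and the ".XXXXXX" temporary suffix
are assumptions made for illustration. It assumes a POSIX system, where
mkstemp() is available and rename() replaces the target atomically as
long as both names are on the same filesystem.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of build_opt_tbl.c): write `len` bytes
 * of `data` so that `final_path` only ever names a complete file.
 * Returns 0 on success, -1 on failure.
 */
int write_file_atomically(const char *final_path, const char *data, size_t len)
{
	char tmp_path[4096];
	FILE *fp;
	int fd, n;

	assert(final_path != NULL && data != NULL);

	/* Temporary name next to the target, so rename() stays on one filesystem. */
	n = snprintf(tmp_path, sizeof(tmp_path), "%s.XXXXXX", final_path);
	if (n < 0 || (size_t)n >= sizeof(tmp_path))
		return -1;

	/* mkstemp() fills in the XXXXXX and creates the file (mode 0600). */
	fd = mkstemp(tmp_path);
	if (fd < 0)
		return -1;

	fp = fdopen(fd, "wb");
	if (fp == NULL) {
		close(fd);
		unlink(tmp_path);
		return -1;
	}

	/* All writing happens under the temporary name, invisible to make. */
	if (fwrite(data, 1, len, fp) != len) {
		fclose(fp);
		unlink(tmp_path);
		return -1;
	}
	if (fclose(fp) != 0) {
		unlink(tmp_path);
		return -1;
	}

	/* The only step that makes the file appear under its final name. */
	if (rename(tmp_path, final_path) != 0) {
		unlink(tmp_path);
		return -1;
	}
	return 0;
}
```

A generator would call this once per output file instead of opening the
final name directly. Note that mkstemp() creates the file with mode
0600, so a real tool may want to adjust the permissions before the
rename.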

How do we solve it? Two ways are possible:
1. Fix build_opt_tbl.c to use fopen/fwrite/fclose/rename.
2. Perform that logic in the makefile.
Fixing build_opt_tbl.c is IMHO the preferred course because it avoids
hacks in makefiles.

I'd provide a patch, but my right hand is injured and I am typing with
my left hand only (no worries ;-)).


>> but it seems r3777 makes
>> the situation a bit better than it was before. That alone is an improvement.
>> Stefan, you are right about the time/result tradeoff. If the failure
>> rate stays low enough, we might want to leave your fix in place and
>> simply accept the occasional failure.
>>     
> Oh, we do want to leave my fix in place, even if you come up with
> another fix on top of that.
>   

Absolutely. (Technically, the problem will be fixed completely by fixing
build_opt_tbl.c even if r3777 is reverted.) Improving make rules for
better readability and less overhead is something I'd call a requirement
as well, so r3777 is a keeper.


> Having two rules for the same set of files is quite unhealthy.
>   

Yes, that's true. We had a similar problem some time back in v2. v3 also
has such rules, but the problems are not showing up yet. I should resend
my v3 dependency fix RFC, maybe it will gather more interest this time.


Regards,
Carl-Daniel

-- 
http://www.hailfinger.org/




