Rudolf, thanks a lot for challenging me and clarifying the problem!
ron minnich wrote:
> Rudolf's point is crucial: "Challenge accepted. They aren't [self defining] because they are defined with ABI/compiler:"
> As Rudolf points out, we are defining a binary layout with a c compiler. That's known not to work.
Ron, I now understand what you mean by self-describing.
But I think we should focus only on the lack of explicit serialization.
I don't think self-describing is important; it would in fact be redundant and thus a source of possibly contradictory information which I absolutely do not want at boot. And it would make parsing a lot more complex without real benefit.
I think the key requirement is that we must use a well-defined, explicit serialization, even if simply "never any padding" aka "packed".
When we consider tables in memory not as structs but, using Ron's framing, as "a packet" transferred somewhere else, then serializing the fields without padding is not unreasonable at all!
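To make the "packet" view concrete, here is a minimal sketch in C of explicit serialization without padding. Everything in it is an assumption for illustration only: the record and its fields (struct example_record, tag, flags, base), the helper names, and the choice of little-endian byte order, which this mail does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record, for illustration only; not a real cbtable entry. */
struct example_record {
	uint32_t tag;
	uint8_t  flags;
	uint64_t base;	/* a compiler may pad before this; how much depends on the ABI */
};

/* Serialized size: tag(4) + size(4) + flags(1) + base(8), no padding. */
enum { EXAMPLE_WIRE_SIZE = 17 };

/* Write a 32-bit value byte by byte, least significant byte first (assumed order). */
static uint8_t *put_le32(uint8_t *p, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		*p++ = (uint8_t)(v >> (8 * i));
	return p;
}

/* Write a 64-bit value byte by byte, least significant byte first (assumed order). */
static uint8_t *put_le64(uint8_t *p, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		*p++ = (uint8_t)(v >> (8 * i));
	return p;
}

/*
 * Serialize field by field instead of copying the struct, so the layout
 * does not depend on the compiler. Returns the number of bytes written,
 * or 0 if the buffer is too small.
 */
size_t example_serialize(const struct example_record *r, uint8_t *buf, size_t len)
{
	assert(r != NULL && buf != NULL);
	if (len < EXAMPLE_WIRE_SIZE)
		return 0;

	uint8_t *p = buf;
	p = put_le32(p, r->tag);
	p = put_le32(p, EXAMPLE_WIRE_SIZE);
	*p++ = r->flags;
	p = put_le64(p, r->base);
	return (size_t)(p - buf);
}
```

The struct is only the in-memory convenience; the byte sequence written by example_serialize() is the actual format, and it comes out the same on any compiler.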
> I'm not saying we can use this, but if you use this string to generate an array of uint8_t, then you package the string with the array, you now have a self-describing structure, I believe.
The current cbtable tag-and-size system makes parsing a lot easier and quicker than a pack pattern would; we should not give that up.
I however agree completely that we should define serialization!
But I see no reason to introduce a pack pattern.
The tag already implies which specific members are contained and we can add tags when needed.
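As a rough sketch of why tag-and-size parsing stays simple, a consumer only needs to read two header fields per record and skip by the size. The code below assumes each record starts with a 32-bit tag followed by a 32-bit size that covers the whole record, both stored little-endian without padding; find_record() and REC_HEADER_SIZE are hypothetical names, not existing coreboot or Linux symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header: 32-bit tag + 32-bit size, no padding. */
#define REC_HEADER_SIZE 8u

/* Assemble a 32-bit value from single bytes, so alignment never matters. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Walk the table record by record and return a pointer to the first
 * record carrying wanted_tag, or NULL if there is none or the table is
 * malformed. Unknown tags are skipped using their size alone.
 */
const uint8_t *find_record(const uint8_t *table, size_t table_len,
			   uint32_t wanted_tag, uint32_t *size_out)
{
	assert(table != NULL);
	size_t off = 0;

	while (table_len - off >= REC_HEADER_SIZE) {
		uint32_t tag = get_le32(table + off);
		uint32_t size = get_le32(table + off + 4);

		/* A record must hold its own header and fit in the table. */
		if (size < REC_HEADER_SIZE || size > table_len - off)
			return NULL;

		if (tag == wanted_tag) {
			if (size_out != NULL)
				*size_out = size;
			return table + off;
		}
		off += size;
	}
	return NULL;
}
```

Once the record is found, the tag alone tells the consumer which members follow the header, so no per-record description string has to be parsed.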
> Again, what we have today is not self-describing,
Not a problem.
> not portable to non-gcc toolchains or other kernels, and not portable even across kernel and compiler versions
This is the problem, and we should fix it for sure!
> Further, because coreboot depends on gcc features,
As an aside, I disagree with Julius on this; even if we do so in some places we shouldn't do so frivolously. But let's stay on topic.
> Consider the case of a 10y old coreboot, with a modern kernel (Linux) booting from it. How does linux parse the structures?
Existing tags will always have to mean "GCC ABI" alignment, but we can invest some time into explicitly defining the alignments that GCC has output up until now and explicitly serialize to those, in order to ensure forward compatibility with old kernels.
Then there are at least never any further variations, and being explicit is always better than letting the compiler pick.
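One way to pin down an existing layout is to write the offsets into the source and serialize to them, padding included. The sketch below is only an illustration: the record is hypothetical, the offsets are assumed to be what a 32-bit x86 GCC build would typically produce for it, and little-endian byte order is assumed as well. Real definitions would have to be checked against what coreboot has actually been emitting.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * ASSUMED legacy layout of a hypothetical record
 *   struct { uint32_t tag; uint32_t size; uint8_t flags; uint64_t base; }
 * with the offsets written down instead of left to the compiler.
 */
enum {
	LEGACY_OFF_TAG   = 0,
	LEGACY_OFF_SIZE  = 4,
	LEGACY_OFF_FLAGS = 8,
	LEGACY_OFF_BASE  = 12,	/* bytes 9..11 are padding */
	LEGACY_SIZE      = 20,
};

/* Store a 32-bit value least significant byte first (assumed order). */
static void put_le32(uint8_t *p, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/* Store a 64-bit value least significant byte first (assumed order). */
static void put_le64(uint8_t *p, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/* Emit the legacy record at its documented offsets. */
void legacy_serialize(uint8_t out[LEGACY_SIZE], uint32_t tag,
		      uint8_t flags, uint64_t base)
{
	assert(out != NULL);
	memset(out, 0, LEGACY_SIZE);	/* padding bytes become deterministic zeros */
	put_le32(out + LEGACY_OFF_TAG, tag);
	put_le32(out + LEGACY_OFF_SIZE, LEGACY_SIZE);
	out[LEGACY_OFF_FLAGS] = flags;
	put_le64(out + LEGACY_OFF_BASE, base);
}
```

With the offsets spelled out like this, a different compiler or ABI can no longer change the bytes that old consumers see.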
More importantly though, once we have defined a "coreboot ABI" serialization we should add new tags which only ever use that, while also continuing to output GCC ABI tags until they demonstrably break all possible consumers, which I expect will be never.
Modern Linux then learns to prefer coreboot ABI tags with well-defined serialization and could output warnings about GCC-ABI tags and optionally try its best to parse them as it always has.
We can't magically fix 10y old coreboot from within Linux; that machine then deserves a firmware update. But at least we can fix this while maintaining both backwards and forward compatibility.
I propose that coreboot ABI serializes table members without padding.
What issues do you see there? x86 can access single bytes anywhere (really only relevant until RAM is available, right?) but would we have a problem on other architectures?
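For what it's worth, a consumer in C does not have to depend on the CPU tolerating unaligned loads. A minimal sketch, assuming the field is stored in host byte order (read_u64_unaligned() is a hypothetical helper name):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 64-bit field that may sit at any byte offset in a packed table.
 * Copying through memcpy() avoids dereferencing a misaligned pointer; the
 * compiler is free to turn this into a single load where that is safe.
 */
uint64_t read_u64_unaligned(const uint8_t *p)
{
	uint64_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v));
	return v;
}
```

Casting the byte pointer to uint64_t * and dereferencing it would be the thing to avoid, since that is where alignment trouble would come from on stricter architectures.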
(This btw also seems like a good example of a change that could and should benefit every single board that we've ever had code for but could require duplicated engineering to apply on diverged branches.)
Kind regards
//Peter