Hi,
I got an interesting report today from a customer who has problems building LinuxBIOS and the payload with different compilers.
The problem is that different compilers handle structure alignment differently, i.e. 2.95.x and 3.x have fundamental differences here:
Adding __attribute__ ((packed)) to the structures helps:
struct lb_memory_range {
        uint64_t start;
        uint64_t size;
        uint32_t type;
#define LB_MEM_RAM       1  /* Memory anyone can use */
#define LB_MEM_RESERVED  2  /* Don't use this memory region */
#define LB_MEM_TABLE     16 /* Ram configuration tables are kept in */
} __attribute__ ((packed));

struct lb_memory {
        uint32_t tag;
        uint32_t size;
        struct lb_memory_range map[0];
} __attribute__ ((packed));
Since this is a table that is passed in memory, we do want it to be exactly as it is defined, with no extra padding of any kind to make it reliable information. So I consider adding __attribute__ ((packed)) a good solution for the problem.
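To make the effect concrete, here is a minimal sketch that compares the two layouts on whatever compiler builds it. The names range_natural, range_packed and print_layouts are made up for this illustration and are not part of the LinuxBIOS sources; the packed variant assumes a compiler that accepts the GCC attribute syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Natural layout: the compiler is free to pad the tail so that every
 * element of an array of these keeps its uint64_t members aligned. */
struct range_natural {
        uint64_t start;
        uint64_t size;
        uint32_t type;
};

/* Packed layout (GCC extension): no padding anywhere. */
struct range_packed {
        uint64_t start;
        uint64_t size;
        uint32_t type;
} __attribute__ ((packed));

/* Print both layouts so that differences between toolchains show up. */
void print_layouts(void)
{
        /* 8 + 8 + 4 bytes with no padding, by construction. */
        assert(sizeof(struct range_packed) == 20);

        printf("natural: sizeof=%lu offsetof(type)=%lu\n",
               (unsigned long)sizeof(struct range_natural),
               (unsigned long)offsetof(struct range_natural, type));
        printf("packed:  sizeof=%lu offsetof(type)=%lu\n",
               (unsigned long)sizeof(struct range_packed),
               (unsigned long)offsetof(struct range_packed, type));
}
```

Calling print_layouts() from main() on each toolchain in question shows whether the natural size differs from the packed one. The field offsets are the same in both variants; only the trailing padding, and therefore the stride of map[], can change.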
If I get no good reasons against adding this, I will check it in later.
Stefan
On Fri, 26 Nov 2004, Stefan Reinauer wrote:
The problem is that different compilers handle structure alignment differently, i.e. 2.95.x and 3.x have fundamental differences here:
stepan, from the point of view of Plan 9, these are the same compiler.
Adding __attribute__ ((packed)) to the structures helps:
Sadly, Plan 9 compilers do not support this attribute for very good technical reasons. The problem then is that these tables will be hard for Plan 9 to deal with.
struct lb_memory_range {
        uint64_t start;
        uint64_t size;
        uint32_t type;
#define LB_MEM_RAM       1  /* Memory anyone can use */
#define LB_MEM_RESERVED  2  /* Don't use this memory region */
#define LB_MEM_TABLE     16 /* Ram configuration tables are kept in */
} __attribute__ ((packed));
There are two problems here.
The first problem is the use of binary structures, followed by the use of gcc attributes to make the structures have a certain "shape". I've just had a big tussle with this in Xen and Plan 9. Xen kind of assumes a gcc world and things fall apart badly when your compiler is not gcc. This is going to happen with these binary structures when our payloads are not built with gcc compatible compilers -- which is already the case with Plan 9 payloads.
Second problem is, what happens if this or other LinuxBIOS tables need to change at some future point? We're going to have different versions. Intel and Microsoft solve this versioning problem by putting a version number in binary tables. Hence in the _MP_ table there is a version flag. This versioning of binary tables is a headache; what if you have to boot a very old OS that realizes it can't parse table version 2, only table version 1? Oops. Trouble, that's what.
I think the big problem is the use of binary data structures. It shows how smart the Open Boot guys were to use strings, and they figured this out 16 years ago!
I think we should look at having linuxbios create strings of data for tables, not binary tables. Long term, the binary tables are going to cause us trouble, as they have already: having to use non-portable compiler options/attributes is a recipe for disaster. You only parse them once to turn them into internal binary data structures in the OS; performance is not an issue here.
In Plan 9, the parameters are passed in as keyword-value pairs, viz:
totalmem=0x100000
This is easy to generate, and both Linux and Plan 9 and other OSes have more than enough functions in the kernel already to parse these. This removes special needs for alignment, packing, and so on.
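As a rough sketch of how little code such a lookup needs, the function below scans a whitespace-separated list of key=value pairs and converts one value to an integer. The function name kv_get_u64, the separator set, and the sample keys other than totalmem are assumptions made for this example; they are not taken from Plan 9 or LinuxBIOS.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up `key` in a whitespace-separated list of key=value pairs and
 * parse its value as an unsigned integer (decimal, 0x hex, or 0 octal).
 * Returns 0 on success, -1 if the key is missing or the value is not a
 * plain number. Pairs with other keys are skipped, so a newer producer
 * can add keys without breaking an older consumer.
 */
int kv_get_u64(const char *text, const char *key, uint64_t *out)
{
        const char *p = text;
        size_t klen;

        assert(text != NULL && key != NULL && out != NULL);
        klen = strlen(key);

        while (*p != '\0') {
                size_t toklen;

                /* Skip separators between pairs. */
                p += strspn(p, " \t\n");
                if (*p == '\0')
                        break;

                toklen = strcspn(p, " \t\n");
                if (toklen > klen && strncmp(p, key, klen) == 0 && p[klen] == '=') {
                        const char *val = p + klen + 1;
                        char *end;
                        unsigned long long v;

                        /* Reject signs, blanks and empty values up front. */
                        if (*val < '0' || *val > '9')
                                return -1;
                        v = strtoull(val, &end, 0);
                        /* The number must fill the whole token. */
                        if (end != p + toklen)
                                return -1;
                        *out = (uint64_t)v;
                        return 0;
                }
                p += toklen;
        }
        return -1;
}
```

A payload would call it as kv_get_u64(table, "totalmem", &mem) and fall back to a default when it returns -1.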
If you want you can generate Forth tables, which are also simple, but in the end, I think we need to avoid the problems of binary tables.
My real preference is for S-expressions, as they are totally character-oriented tables that can still provide structure such as trees and tables, but I am not sure people will like S-expressions.
ron
Hi, I'm new to this group and I have a VDR with an EPIA-M board. Is there someone with a working BIOS for my board? I would like to boot my Debian from hard disk. If someone has a good working BIOS, please send it to me :-) My email is ben.sommer at berlin.de
Greetings from Berlin (Germany)
Ben Sommer
On Nov 28, 2004, at 9:34 PM, Ronald G. Minnich wrote:
I agree, we need to get away from binary structures. Much as I like S-expressions, strings containing key/value pairs are probably the most common and easily understandable way to do it.
Greg
"Ronald G. Minnich" rminnich@lanl.gov writes:
On Fri, 26 Nov 2004, Stefan Reinauer wrote:
I think the big problem is the use of binary data structures. It shows how smart the Open Boot guys were to use strings, and they figured this out 16 years ago!
Open Boot provides a single function you can call, and as I recall everything is done in terms of Forth words, 32-bit or 64-bit. My memory may be faulty, but that does not sound like a string-based interface.
I think we should look at having linuxbios create strings of data for tables, not binary tables. Long term, the binary tables are going to cause us trouble, as they have already: having to use non-portable compiler options/attributes is a recipe for disaster. You only parse them once to turn them into internal binary data structures in the OS; performance is not an issue here.
Performance is not an issue; code size can be. Check out ACPI for the nasty version of needing too much parsing to get the data you need.
In Plan 9, the parameters are passed in as keyword-value pairs, viz: totalmem=0x100000
Which is not bad. But the implementation is totally naive.
This is easy to generate, and both Linux and Plan 9 and other OSes have more than enough functions in the kernel already to parse these. This removes special needs for alignment, packing, and so on.
What of etherboot, what of mkelfImage, and other simple utilities?
If you want you can generate Forth tables, which are also simple, but in the end, I think we need to avoid the problems of binary tables.
My real preference is for S-expressions, as they are totally character-oriented tables that can still provide structure such as trees and tables, but I am not sure people will like S-expressions.
If we really want to avoid problems what we need is a table definition checker that looks at the definitions of table entries and checks to see if they are portable and safe. I don't care if they are binary or string based, you can get into problems either way.
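A full checker would be a separate tool, but part of the idea can be sketched with compile-time assertions placed next to each table entry definition. This assumes a C11 compiler for static_assert; the struct name entry_checked and the two macros are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Candidate table entry under test, with its tail padding made explicit. */
struct entry_checked {
        uint64_t start;
        uint64_t size;
        uint32_t type;
        uint32_t reserved;
};

/* A field passes if its offset is a multiple of its own size. */
#define FIELD_NATURALLY_ALIGNED(type, field) \
        (offsetof(type, field) % sizeof(((type *)0)->field) == 0)

/* Sum of the field sizes; any difference from sizeof is hidden padding. */
#define ENTRY_CHECKED_FIELD_BYTES (2 * sizeof(uint64_t) + 2 * sizeof(uint32_t))

static_assert(FIELD_NATURALLY_ALIGNED(struct entry_checked, start),
              "start is not naturally aligned");
static_assert(FIELD_NATURALLY_ALIGNED(struct entry_checked, size),
              "size is not naturally aligned");
static_assert(FIELD_NATURALLY_ALIGNED(struct entry_checked, type),
              "type is not naturally aligned");
static_assert(FIELD_NATURALLY_ALIGNED(struct entry_checked, reserved),
              "reserved is not naturally aligned");
static_assert(sizeof(struct entry_checked) == ENTRY_CHECKED_FIELD_BYTES,
              "compiler inserted hidden padding");
static_assert(sizeof(struct entry_checked) % sizeof(uint64_t) == 0,
              "array stride would misalign 64-bit fields");
```

A definition that passes these checks has every field at its natural offset and no padding, so a compiler has no alignment decision left to make. The build simply fails on a toolchain that lays the entry out differently.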
If someone wants to do a proof of concept of a string based implementation I am willing to consider ideas.
Be warned though that strings scare me, because everyone thinks they are safe and easy, when in fact they have the same essential complexity as binary data structures, with simply a different set of limitations and fewer tools provided in the compiler to make certain you don't mess up.
In addition there are cases that string based tables handle very poorly such as passing fonts.
With the current structure, if there is information that is best passed in string form, we can define one or more table entries that allow us to implement that.
The next big step is to export the device tree via the linuxbios table to the outside world. In most cases this decomposes nicely into a hierarchical set of devices. But with interrupts and a few of their kin it starts getting hard to describe hardware as a tree, making a graph necessary to handle the general case.
Fundamentally changing things is just something to think about for now.
The current problem, while very bad, is still mild enough that the current table definitions can be patched. It is unfortunate that this happened with the most widely used table entry.
Eric
On Mon, 29 Nov 2004, Eric W. Biederman wrote:
Performance is not an issue; code size can be. Check out ACPI for the nasty version of needing too much parsing to get the data you need.
no argument, we don't want to recreate ACPI
totalmem=0x100000
Which is not bad. But the implementation is totally naive.
??
What of etherboot, what of mkelfImage, and other simple utilities?
again, the x=y stuff is dead simple to parse. Keep in mind, there is the possibility as time goes on of matching old linuxbios to new payloads, or even, in the worst case, of testing old payloads against new linuxbios.
You're either going to need to:
1- fix the table for all time (not practical)
2- version number the table (which can detect version mismatch but can't fix it)
3- go to a string-based organization
now the fact is that etherboot parses all kinds of strings in packets, so I don't see parsing linuxbios strings a big deal; same with the others. They all deal with strings of one sort or another.
In addition there are cases that string based tables handle very poorly such as passing fonts.
You lost me on this one. Are we planning to pass fonts in the tables in whatever form they take?
With the current structure, if there is information that is best passed in string form, we can define one or more table entries that allow us to implement that.
That won't fix the fundamental problem with binary tables.
I think what Stepan's problem has shown, as my problems with Xen have shown, is that planning to pass structs through and assuming it will always work is going to get us in trouble.
We don't have to do keyword-value pairs, but the binary table approach is not the clear win it first appeared to be.
ron
Stefan Reinauer stepan@openbios.org writes:
Hi,
I got an interesting report today from a customer who has problems building LinuxBIOS and the payload with different compilers.
The problem is that different compilers handle structure alignment differently, i.e. 2.95.x and 3.x have fundamental differences here:
I agree that there is a problem with the definition of this structure. Everything in the table remains 32bit aligned so we are clear there. For 64bit data types the alignment is much less clear, and struct lb_memory_range is a bit of a problem in that it does not have a 32bit type as padding.
To see what the problem actually amounts to I walked through my compiler collection with a simple test to see what sizeof reported for lb_memory_range, (without adding the packed attribute).
#include <stdio.h>
#include "linuxbios_tables.h"

int main(int argc, char **argv)
{
        /* sizeof yields a size_t; cast it so the format matches on every compiler */
        printf("sizeof(lb_memory_range): %lu\n",
               (unsigned long)sizeof(struct lb_memory_range));
        return 0;
}
On 32bit x86 I tested with: gcc-2.7.2 gcc-2.95 gcc-3.0 gcc-3.2 gcc-3.3 gcc-3.4
And in each instance the result was 20.
On 64bit x86 I tested with gcc-3.2 gcc-3.3 gcc-3.4
And the result was 24.
I also managed to test with gcc-3.3 with -m32 and the size was still 20.
So from what I can see with 32bit x86 code we are consistent, and we do not have compiler version dependencies. So the bad definition is consistent.
Moving forward we need to remove this table entry and replace it with a table entry that is properly defined. Utilities like mkelfImage can continue to support the old definition, but it should be deprecated there.
Since this is a table that is passed in memory, we do want it to be exactly as it is defined, with no extra padding of any kind to make it reliable information. So I consider adding __attribute__ ((packed)) a good solution for the problem.
It is a good pragmatic solution, but actually needing __attribute__ ((packed)) is an issue. As Ron has pointed out, not all compilers support it. And having a definition that varies between 32bit and 64bit is a problem anyway.
If I get no good reasons against adding this, I will check it in later
What I want to do is add a replacement for struct lb_memory_range that looks like:
struct lb_memory_range2 {
        uint64_t start;
        uint64_t size;
        uint32_t type;
#define LB_MEM_RAM       1  /* Memory anyone can use */
#define LB_MEM_RESERVED  2  /* Don't use this memory region */
#define LB_MEM_TABLE     16 /* Ram configuration tables are kept in */
        uint32_t reserved;
};
At the same time I need to define an LB_TAB_ALIGN that I can insert when data is only 32bit aligned and I need 64bit or better alignment in the table. 64bit alignment is unlikely to cost much at this point but better safe than sorry :) And it preserves the property that the data in the table can just be used.
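The arithmetic behind such an alignment step is small. The sketch below is one possible reading of it, not the actual LB_TAB_ALIGN definition, which this thread does not spell out; TABLE_ALIGN, align_up and pad_bytes are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed alignment for 64-bit fields; a stand-in, not LB_TAB_ALIGN itself. */
#define TABLE_ALIGN 8u

/* Round `offset` up to the next multiple of `align` (a power of two). */
size_t align_up(size_t offset, size_t align)
{
        assert(align != 0 && (align & (align - 1)) == 0);
        return (offset + align - 1) & ~(align - 1);
}

/* Padding bytes to emit before a record that needs TABLE_ALIGN. */
size_t pad_bytes(size_t offset)
{
        return align_up(offset, TABLE_ALIGN) - offset;
}
```

For example, a record that would otherwise start at offset 12 needs pad_bytes(12), i.e. 4 bytes, in front of it, while one at offset 16 needs none. The explicit reserved field in lb_memory_range2 plays the same role inside the record: it rounds 20 bytes up to 24 so that consecutive map entries stay 64bit aligned.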
Eric
* Eric W. Biederman ebiederman@lnxi.com [041129 19:21]:
To see what the problem actually amounts to I walked through my compiler collection with a simple test to see what sizeof reported for lb_memory_range, (without adding the packed attribute).
On 32bit x86 I tested with: gcc-2.7.2 gcc-2.95 gcc-3.0 gcc-3.2 gcc-3.3 gcc-3.4
And in each instance the result was 20.
So from what I can see with 32bit x86 code we are consistent, and we do not have compiler version dependencies. So the bad definition is consistent.
x86 QNX/Neutrino 6.2.1 (with gcc 2.95.3):

# gcc -v
Reading specs from /usr/lib/gcc-lib/ntox86/2.95.3qnx-nto/specs
gcc version 2.95.3qnx-nto 20010315 (release)

sizeof(lb_memory_range): 24
sizeof(struct lb_memory): 8
It seems gcc does not always behave the same.
It is a good pragmatic solution, but actually needing __attribute__ ((packed)) is an issue. As Ron has pointed out, not all compilers support it. And having a definition that varies between 32bit and 64bit is a problem anyway.
So you are saying people out there are building LinuxBIOS with non-gnu compilers? I actually doubt that, assuming a lot of objcopy/objdump/ld magic is pretty much gnu specific as well..
one could go like:
#ifdef __GNUC__
#define STRICTSIZE __attribute__ ((packed))
#else
#define STRICTSIZE
#endif
This fixes the issue across all gcc versions while not breaking anything on others. It really sucks that gcc does magic here that makes writing portable code really ugly.
Stefan
On Tue, 30 Nov 2004, Stefan Reinauer wrote:
So you are saying people out there are building LinuxBIOS with non-gnu compilers? I actually doubt that, assuming a lot of objcopy/objdump/ld magic is pretty much gnu specific as well..
no, what i'm saying is that there are people (me, at least) building payloads with non-gnu compilers. If you use gcc magic to fix that struct, and said magic is not available in the other compilers, then the payload will not be able to read the table easily. I've had this very same problem passing structs between Xen and Plan 9; Xen relies on every gcc trick in the book, which is very hard to deal with when you have a non-gcc compiler.
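For what it's worth, a payload built without any packing attribute can still read a packed record by picking the fields out of the byte stream itself. The sketch below assumes the 20-byte packed layout discussed in this thread (start at offset 0, size at 8, type at 16) and little-endian byte order as on x86; the names mem_range, decode_range and the get_le helpers are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory form; its layout is private to the payload and may be padded. */
struct mem_range {
        uint64_t start;
        uint64_t size;
        uint32_t type;
};

/* Assumed size of one packed record in the table: 8 + 8 + 4 bytes. */
#define RANGE_WIRE_SIZE 20

/* Assemble a little-endian 32-bit value one byte at a time. */
static uint32_t get_le32(const unsigned char *p)
{
        return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Assemble a little-endian 64-bit value from two 32-bit halves. */
static uint64_t get_le64(const unsigned char *p)
{
        return (uint64_t)get_le32(p) | ((uint64_t)get_le32(p + 4) << 32);
}

/* Decode one record from the table bytes, with no struct overlay at all. */
void decode_range(const unsigned char *wire, struct mem_range *out)
{
        assert(wire != NULL && out != NULL);
        out->start = get_le64(wire + 0);
        out->size  = get_le64(wire + 8);
        out->type  = get_le32(wire + 16);
}
```

The caller steps through the map in units of RANGE_WIRE_SIZE rather than sizeof(struct mem_range). This sidesteps the compiler's layout rules but does nothing for the version problem.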
Plus, even if you get all the gcc flags done correctly, we still have the table version problem, as has been shown with the various versions of the SMP tables.
It's been shown in practice that you can avoid these problems by not using a binary table; plan 9 has used strings for 16 years now with no ill effects. I would NOT recommend that V2 use a binary table but there seems to be some resistance to a strings-based table. In any event, V3 will use a string-based interface.
This fixes the issue across all gcc versions while not breaking anything on others. It really sucks that gcc does magic here that makes writing portable code really ugly.
it's only an issue because we're trying to use binary tables in a way that is simply not appropriate for portability. The problem is easily fixed, but there seems to be some resistance to doing this in V2.
ron