Thanks to the tireless efforts of Uwe and with the assistance of Ward and Marc, libpayload and coreinfo are fully up to date and feature-filled, and now an option in buildrom. Libpayload is the backend for two working payloads (coreinfo and tint) and more on the way.
It is now time to start considering the next great payload challenge. Immediately after I demonstrated coreinfo to Ron, he said - "okay, now we need a chooser". And the thing about Ron is, when he is right, he is right. A master payload to choose and load other payloads is the next great step in our effort. Originally we had only discussed a menu based chooser, but recently many people have told me how they would like to see a loader that could chain multiple payloads together in order to cobble together a reasonable facsimile of a traditional BIOS setup screen.
The more I have thought about this, the more I think that this effort is key to proving (and improving) the stability and versatility of coreboot-v3. It will also force us to examine much of our current implementation, which is not a bad thing. So, short story long, I have written down the evolving ideas in my head and the resulting roadmap here:
"Bayou" is the working name for the project - visit the page to find out why.
This is also going to involve a new loader format that we discussed at the summit:
Please critically review both and post comments here or in the discussion pages.
Thanks! Jordan
PS: I very nearly called it 'corechooser' as a joke, but I think that would have caused Peter to track me down and hunt me for sport.
one simple suggestion. Your types are 32 bits. But the type numbers are things like 1, 2, 3 etc:
I think it's worth it to bite the bullet and use more than the 3 lowest bits of the 32 bits, for several reasons:
1. It's easier to detect a bogus type if you use more bits, and bogus type detection is good for bogus problems.
2. With a JTAG debugger, it's easier to verify that it really is a type and not just the number '1', etc.
3. Use those bits! They're there for a reason :-)
So, ignore me if you will, but how about 'CODE' 'DATA' 'BSS ' 'NAME' 'NOTE' 'ENTR'
but in all other ways it looks really neat. I can't wait to see it run. and feel free to ignore this suggestion.
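As a rough illustration of the suggestion above, the four-character tags can be packed into 32-bit constants so that a stray small integer no longer looks like a valid type. The names `SELF_TAG`, `SELF_TYPE_*` and `self_type_is_known` are invented for this sketch and are not taken from the SELF proposal; the byte order chosen here (first character in the lowest byte) is likewise an assumption.

```c
#include <stdint.h>

/* Hypothetical helper: pack four ASCII characters into a 32-bit value,
 * with the first character in the lowest byte. */
#define SELF_TAG(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* The six tags suggested in the message above. */
#define SELF_TYPE_CODE SELF_TAG('C', 'O', 'D', 'E')
#define SELF_TYPE_DATA SELF_TAG('D', 'A', 'T', 'A')
#define SELF_TYPE_BSS  SELF_TAG('B', 'S', 'S', ' ')
#define SELF_TYPE_NAME SELF_TAG('N', 'A', 'M', 'E')
#define SELF_TYPE_NOTE SELF_TAG('N', 'O', 'T', 'E')
#define SELF_TYPE_ENTR SELF_TAG('E', 'N', 'T', 'R')

/* Return 1 for a recognised tag, 0 for anything else, including the
 * small integers 1, 2, 3 that a stray value is likely to hold. */
int self_type_is_known(uint32_t type)
{
	switch (type) {
	case SELF_TYPE_CODE:
	case SELF_TYPE_DATA:
	case SELF_TYPE_BSS:
	case SELF_TYPE_NAME:
	case SELF_TYPE_NOTE:
	case SELF_TYPE_ENTR:
		return 1;
	default:
		return 0;
	}
}
```

With this packing, a hexdump of a header stored in little-endian order shows the readable ASCII text `CODE`, `DATA` and so on, which is the debugger-friendliness argument from point 2.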
Now if only I had this blasted geode figured out ;-)
ron
On Fri, Apr 11, 2008 at 05:28:01PM -0600, Jordan Crouse wrote:
Libpayload is the backend for two working payloads (coreinfo and tint) and more on the way.
I must get another alix.
The more I have thought about this, the more I think that this effort is key to proving (and improving) the stability and versatility of coreboot-v3.
Certainly usability as well.
I like it a lot. It's what we (I at least) envisioned when starting on LAR in Hamburg. I can feel it coming to reality. :)
SELF nicely eliminates all those ELF segments in LARs!
..but wouldn't it be good to explicitly use 64-bit types?
PS: I very nearly called it 'corechooser' as a joke, but I think that would have caused Peter to track me down and hunt me for sport.
Close call, on another day I might not have acted on the impulse to call for some new names. Bayou isn't very self-explanatory, but it's still massively better than corechooser IMO. :)
On Fri, Apr 11, 2008 at 04:43:13PM -0700, ron minnich wrote:
- use those bits! They're there for a reason :-)
'CODE' 'DATA' 'BSS ' 'NAME' 'NOTE' 'ENTR'
I can't help but think this is cute. There won't be enough types to run out of names, it can help, and it's sort-of an easter egg. :p
All thumbs up from me.
//Peter
On Fri, Apr 11, 2008 at 5:05 PM, Peter Stuge peter@stuge.se wrote:
I must get another alix.
yes you must. We need you to help us find this IRQ issue :-)
and somebody tell me how to get jtag going. Fer cheap.
ron
On Fri, Apr 11, 2008 at 05:21:14PM -0700, ron minnich wrote:
On Fri, Apr 11, 2008 at 5:05 PM, Peter Stuge peter@stuge.se wrote:
I must get another alix.
yes you must. We need you to help us find this IRQ issue :-)
:) I'm ordering a 1c and a 3c3 now.
and somebody tell me how to get jtag going. Fer cheap.
One of the USB->serial cables I posted links to on the list a month or so back claimed to do JTAG and had OSS to drive it that could read and play back several common bitstream formats.
http://openocd.berlios.de/web/ http://www.ixo.de/info/usb_jtag/
As the latter page says, start from the development software that you're using, and look at what cables/adapters it supports.
Is it for the Geode? What software would be used?
//Peter
On 11/04/08 17:21 -0700, ron minnich wrote:
On Fri, Apr 11, 2008 at 5:05 PM, Peter Stuge peter@stuge.se wrote:
I must get another alix.
yes you must. We need you to help us find this IRQ issue :-)
and somebody tell me how to get jtag going. Fer cheap.
I have started the process of asking for documentation - we'll see where the rabbit hole leads us.
Jordan
On 11/04/08 16:43 -0700, ron minnich wrote:
one simple suggestion. Your types are 32 bits. But the type numbers are things like 1, 2, 3 etc:
I think it's worth it to bite the bullet and use more than the 3 lowest bits of the 32 bits, for several reasons:
1. It's easier to detect a bogus type if you use more bits, and bogus type detection is good for bogus problems.
2. With a JTAG debugger, it's easier to verify that it really is a type and not just the number '1', etc.
3. Use those bits! They're there for a reason :-)
So, ignore me if you will, but how about 'CODE' 'DATA' 'BSS ' 'NAME' 'NOTE' 'ENTR'
I dig this. We will make it so. I think we should also force all the header members to be little endian to avoid issues later on with PowerPC.
Objections?
Jordan
On 13.04.2008 01:06, Jordan Crouse wrote:
On 11/04/08 16:43 -0700, ron minnich wrote:
one simple suggestion. Your types are 32 bits. But the type numbers are things like 1, 2, 3 etc:
I think it's worth it to bite the bullet and use more than the 3 lowest bits of the 32 bits, for several reasons:
1. It's easier to detect a bogus type if you use more bits, and bogus type detection is good for bogus problems.
2. With a JTAG debugger, it's easier to verify that it really is a type and not just the number '1', etc.
3. Use those bits! They're there for a reason :-)
So, ignore me if you will, but how about 'CODE' 'DATA' 'BSS ' 'NAME' 'NOTE' 'ENTR'
I dig this. We will make it so. I think we should also force all the header members to be little endian to avoid issues later on with PowerPC.
Ack.
Regards, Carl-Daniel
Hi Jordan,
please don't take my comments as criticism, I'm merely trying to learn something from this discussion.
2008/4/13, Jordan Crouse jordan.crouse@amd.com:
On 11/04/08 16:43 -0700, ron minnich wrote:
So, ignore me if you will, but how about 'CODE' 'DATA' 'BSS ' 'NAME' 'NOTE' 'ENTR'
I dig this. We will make it so. I think we should also force all the header members to be little endian to avoid issues later on with PowerPC.
I wonder what issues with PowerPC you're expecting? Shouldn't you have the header members stored in the endianness of the target architecture? I.e. if you construct a SELF file for i386, then of course the header fields should be little endian, even if you chose to construct the file on a PowerPC machine. But not if you want to use the SELF file on a big endian architecture - then the headers should be big endian.
In a later email in this thread, you say this:
2008/4/13, Jordan Crouse jordan.crouse@amd.com: [...]
We already know what endianism, word size and architecture we're going to run on - SELF is not intended to be portable. It is constructed when the LAR is, and is married to that LAR and that LAR only. And before somebody says something, yes - this is not fool-proof. Somebody will no doubt manage to screw it up and get the wrong SELF on the wrong architecture. But I don't like over-architecting to protect fools. Worrying about specifying the architecture here is the computer science equivalent of the "Caution, coffee is hot" warning.
I interpreted the above like this: When constructing a SELF file you will always know what architecture the SELF file will be used on and you therefore don't need an explicit endianness flag. You don't need it b/c you implicitly assume the byte order of the target architecture.
So I may have misunderstood your emails, but please don't force the SELF headers to a specific endianness. I don't think it's necessary and doing so will cause unnecessary work as soon as you're trying to use the format on an architecture w/ different endianness. See EFI and especially TianoCore.
Regards,
Phil
On 13/04/08 15:44 +0200, Philip Schulz wrote:
I wonder what issues with PowerPC you're expecting? Shouldn't you have the header members stored in the endianness of the target architecture? I.e. if you construct a SELF file for i386, then of course the header fields should be little endian, even if you chose to construct the file on a PowerPC machine. But not if you want to use the SELF file on a big endian architecture - then the headers should be big endian.
That's true, they will be. But there is a problem:
The problem is that the loader code will not change. When the loader reads the type value, it will differ depending on endianism. As an example, 'CODE' will be equivalent to 0x45444F43 on a little endian machine, and 0x434F4445 on a big endian machine. This causes obvious issues in the code. By dictating that all the header values should be in little endian we avoid this problem.
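The two values quoted above can be reproduced with a small sketch. It assumes the tag is stored in the file as the four bytes 'C', 'O', 'D', 'E' in that order; the helper names `read_le32` and `read_be32` are made up for illustration. Assembling the value byte by byte gives the same result on any host, which is one way a loader could stay independent of the CPU's native byte order.

```c
#include <stdint.h>

/* Interpret four bytes as a little-endian 32-bit value. The result does
 * not depend on the host's byte order because the bytes are never
 * reinterpreted as a native integer. */
uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* For contrast: the same four bytes interpreted as big-endian, which is
 * what a native 32-bit load would yield on a big-endian machine. */
uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

For the bytes `{'C', 'O', 'D', 'E'}`, `read_le32` yields 0x45444F43 and `read_be32` yields 0x434F4445, matching the numbers in the message.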
In a later email in this thread, you say this:
2008/4/13, Jordan Crouse jordan.crouse@amd.com: [...]
We already know what endianism, word size and architecture we're going to run on - SELF is not intended to be portable. It is constructed when the LAR is, and is married to that LAR and that LAR only. And before somebody says something, yes - this is not fool-proof. Somebody will no doubt manage to screw it up and get the wrong SELF on the wrong architecture. But I don't like over-architecting to protect fools. Worrying about specifying the architecture here is the computer science equivalent of the "Caution, coffee is hot" warning.
I interpreted the above like this: When constructing a SELF file you will always know what architecture the SELF file will be used on and you therefore don't need an explicit endianness flag. You don't need it b/c you implicitly assume the byte order of the target architecture.
That's true. But again, like I said, what concerns us is that endianism affects the order in which the bytes will be stored in the header.
This has nothing to do with how the actual data and code is organized - we have to assume that is correctly formatted for the system, otherwise the payload won't run. Specifying that makes no sense - is somebody really going to take up valuable ROM space with a payload for another architecture?
So I may have misunderstood your emails, but please don't force the SELF headers to a specific endianness. I don't think it's necessary and doing so will cause unnecessary work as soon as you're trying to use the format on an architecture w/ different endianness. See EFI and especially TianoCore.
The endianism is specified by the processor, not the software. An x86 is little endian, forever more.
And since v3 only works on an x86, we could ignore it for now. But we know that the problem exists, and we might as well account for it now. By specifying it as little endian, we do put other architectures at a disadvantage. But none of those other architectures are in play for a very long time. You have to play to your strengths.
Jordan
The problem is that the loader code will not change. When the loader reads the type value, it will differ depending on endianism. As an example, 'CODE' will be equivalent to 0x45444F43 on a little endian machine, and 0x434F4445 on a big endian machine. This causes obvious issues in the code. By dictating that all the header values should be in little endian we avoid this problem.
Either the code will use u32 everywhere, or it will use u8[4] (or char[4]) everywhere. Either way, it works fine; the second option is better of course, since the binary will be more readable (in a hexdump).
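The second option mentioned above (treating the type as four raw bytes) might look roughly like the following. The function name `self_tag_equals` is hypothetical. Because the comparison is done byte by byte, no integer byte order is involved at all.

```c
#include <string.h>

/* Compare a 4-byte type field against a 4-character tag name.
 * 'name' must point to at least four characters, e.g. "BSS ". */
int self_tag_equals(const unsigned char type[4], const char *name)
{
	return memcmp(type, name, 4) == 0;
}
```

A loader could then test `self_tag_equals(hdr_type, "CODE")` on any architecture, at the cost of a `memcmp` call instead of a single integer compare.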
You have to be careful in your tools, so that cross-builds work correctly.
That's true. But again, like I said, what concerns us is that endianism affects the order in which the bytes will be stored in the header.
That depends on how *exactly* you define the SELF format. Writing a *proper* specification is a lot of work, indeed!
The endianism is specified by the processor, not the software.
Not true. There are *many* processors that can run either little-endian or correct-endian: ARM, MIPS, PowerPC, ...
An x86 is little endian, forever more.
Yeah, poor x86.
And since v3 only works on an x86,
I hope to change that soon.
we could ignore it for now.
No you cannot, if you're defining this new binary format. Yet another reason why you really shouldn't.
But we know that the problem exists, and we might as well account for it now. By specifying it as little endian, we do put other architectures at a disadvantage. But none of those other architectures are in play for a very long time. You have to play to your strengths.
???
What are you saying here?
Segher
On 13/04/08 19:03 +0200, Segher Boessenkool wrote:
The endianism is specified by the processor, not the software.
Not true. There are *many* processors that can run either little-endian or correct-endian: ARM, MIPS, PowerPC, ...
That's true, but let's not confuse the issue. By the time that it ever gets to our little part of the world, the endianism has been established, either in the silicon, by a resistor strap, or by a bit.
And since v3 only works on an x86,
I hope to change that soon.
That's your decision.
we could ignore it for now.
No you cannot, if you're defining this new binary format. Yet another reason why you really shouldn't.
Indeed we cannot ignore the problem.
But we know that the problem exists, and we might as well account for it now. By specifying it as little endian, we do put other architectures at a disadvantage. But none of those other architectures are in play for a very long time. You have to play to your strengths.
???
What are you saying here?
I believe that the primary market for coreboot is x86, and will be for a very long time (possibly for ever). You are welcome to prove me wrong,
Jordan
I believe that the primary market for coreboot is x86, and will be for a very long time (possibly for ever).
I, on the other hand, believe the embedded market is a lot bigger. Pretty much every embedded board uses its own coreboot equivalent -- well, not equivalent, we really want coreboot to not be riddled with bugs, and it to be flexible and user-friendly, and packed with features and all that goodness.
You are welcome to prove me wrong,
Don't worry, I will. Ooh, a challenge! :-)
Segher
On 13.04.2008 19:55, Segher Boessenkool wrote:
I believe that the primary market for coreboot is x86, and will be for a very long time (possibly for ever).
I, on the other hand, believe the embedded market is a lot bigger. Pretty much every embedded board uses its own coreboot equivalent -- well, not equivalent, we really want coreboot to not be riddled with bugs, and it to be flexible and user-friendly, and packed with features and all that goodness.
Indeed. Conquer parts of the x86 market first, then expand to other embedded markets with MIPS/ARM/etc.
You are welcome to prove me wrong,
Don't worry, I will. Ooh, a challenge! :-)
A week ago three people wanted to run coreboot v3 on MIPS.
Regards, Carl-Daniel
On 13.04.2008, at 11:09, Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net> wrote:
A week ago three people wanted to run coreboot v3 on MIPS.
Yeah! Did anything result from that yet?
Still, beating 1-10 million coreboot systems on x86 will take a while ;-)
-- coreboot mailing list coreboot@coreboot.org http://www.coreboot.org/mailman/listinfo/coreboot
On 12.04.2008 01:28, Jordan Crouse wrote:
It is now time to start considering the next great payload challenge. Immediately after I demonstrated coreinfo to Ron, he said - "okay, now we need a chooser". And the thing about Ron is, when he is right, he is right.
Agreed.
A master payload to choose and load other payloads is the next great step in our effort.
And this has nothing to do with what you quoted above. AFAIK your quote of Ron didn't say we need a loader which loads a loader. In the pathological case of coreboot->chooser->filo->lab we would have a chain of 4 loaders. That's embarrassing. The normal case would become a 2-loader case (coreboot->chooser->...) instead of the 1-loader case (coreboot->...) it is now.
Originally we had only discussed a menu based chooser, but recently many people have told me how they would like to see a loader that could chain multiple payloads together in order to cobble together a reasonable facsimile of a traditional BIOS setup screen.
The more I have thought about this, the more I think that this effort is key to proving (and improving) the stability and versatility of coreboot-v3. It will also force us to examine much of our current implementation, which is not a bad thing. So, short story long, I have written down the evolving ideas in my head and the resulting roadmap here:
I'll comment on Bayou later.
"Bayou" is the working name for the project - visit the page to find out why.
This is also going to involve a new loader format that we discussed at the summit:
Please critically review both and post comments here or in the discussion pages.
NAK. This design is so unfinished it is not even funny. Hint: Certain fields in the LAR header are there for a reason (guess which). There's also an obvious speed penalty for SELF (guess why). The concept of PIC is missing completely. A LAR parser can't figure out if the archive is corrupt. Ron once stated the ability to figure out whether the ROM is corrupt before you flash it is one of the key features of LAR. SELF completely destroys that feature. We might as well kill LAR completely and move to SELF only (and then SELF slowly will become a bad reinvention of LAR). Oh, and using 32 bits for load address and entry point is a step back to the situation we had 8 months ago. Sorry.
PS: I very nearly called it 'corechooser' as a joke, but I think that would have caused Peter to track me down and hunt me for sport.
Well, having a common naming theme would certainly help brand recognition. So I partially disagree with Peter on that point. OTOH, if you really want to avoid names like corechooser, you should rename coreinfo as well.
Regards, Carl-Daniel
Carl-Daniel Hailfinger wrote:
NAK. This design is so unfinished it is not even funny. Hint: Certain fields in the LAR header are there for a reason (guess which).
Yes, and because that concept is so incredibly broken beyond belief, we fixed it up at the summit. Those completely archive-unrelated fields which just sneaked into lar because we did not do our reviews thoroughly enough will finally go away again. Way to go.
There's also an obvious speed penalty for SELF (guess why).
The opposite. There's a speed win because we do less work in lar walks.
The concept of PIC is missing completely.
Because it is not needed.
A LAR parser can't figure out if the archive is corrupt.
Plain wrong. There are checksums in the LAR, and that won't change. Even here you will see a reasonable speed-up because you don't have to load 3 segments to memory before you find the 4th being corrupted.
Ron once stated the ability to figure out whether the ROM is corrupt before you flash it is one of the key features of LAR. SELF completely destroys that feature.
Not at all - The opposite: It makes lar finally more solid again. The feature of checksumming (could have) worked nicely with the very first version of lar that I wrote. Just the actual checking was not in place as it is a minor issue when you can't even boot.
We might as well kill LAR completely and move to SELF only (and then SELF slowly will become a bad reinvention of LAR).
You are seriously misunderstanding the concepts of LAR and SELF. While LAR is an archiver of arbitrary(!) files, SELF is the container for executable files. Mixing these two up was the biggest mistake in the history of the young LAR, making things irreversible and fragile.
When writing software, one should think about the problem that is going to be solved. The reason for unpacking segments was to get rid of the ELF loader and being able to streamload code. This has nothing to do with anything I designed LAR for and the fact that the implementation was married into LAR was obviously a foul hack.
Stefan
On 12.04.2008 10:28, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
NAK. This design is so unfinished it is not even funny. Hint: Certain fields in the LAR header are there for a reason (guess which).
Yes, and because that concept is so incredibly broken beyond belief, we fixed it up at the summit. Those completely archive-unrelated fields which just sneaked into lar because we did not do our reviews thoroughly enough will finally go away again. Way to go.
SELF is missing a checksum for each uncompressed segment.
SELF is missing a per-section compression algorithm specifier and if you introduce one, you have a compression algorithm specifier both in LAR and in SELF. If you remove it from LAR, you have no way to store compressed data in a LAR directly. You can work around that by wrapping every file with a SELF header, but then SELF becomes a generic file container and violates your statement that "SELF is the container for executable files".
There's also an obvious speed penalty for SELF (guess why).
The opposite. There's a speed win because we do less work in lar walks.
Sorry, you misunderstood. The obvious speed penalty comes from unaligned accesses. Some architectures can't even handle unaligned accesses at all.
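For readers unfamiliar with the alignment concern, one common way to read a multi-byte field from an address that may not be aligned is to copy it out with `memcpy` rather than dereferencing a cast pointer. This is only a sketch of the general technique; the helper name is made up, and whether SELF chunks actually end up unaligned is exactly what is being disputed in this thread.

```c
#include <stdint.h>
#include <string.h>

/* Read a native-endian 32-bit value from a possibly unaligned address.
 * memcpy has no alignment requirement, so this is safe even on
 * architectures that fault on unaligned loads. */
uint32_t load_u32_unaligned(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

Compilers typically turn this into a single load where the target allows it and into byte loads where it does not, which is where the speed penalty mentioned above would come from.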
The concept of PIC is missing completely.
Because it is not needed.
So you propose to handle some executable code (bootblock, raminit) just with LAR and not with SELF? How are you going to explain that concept a few years down the road?
A LAR parser can't figure out if the archive is corrupt.
Plain wrong. There are checksums in the LAR, and that won't change. Even here you will see a reasonable speed-up because you don't have to load 3 segments to memory before you find the 4th being corrupted.
Wrong. Checksums say nothing about corrupt internal SELF structure. The LAR parser has the ability (well, at least in theory) to check for overlapping archive members right now. With SELF, a corrupt SELF header can reference arbitrary places in the archive and a pure LAR parser can't find that out because it does not parse SELF by design.
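The kind of check being asked for here could be sketched as below: given a segment's offset and length as read from a header, confirm that the whole range stays inside the enclosing archive member. The function and parameter names are hypothetical and the real SELF header layout is not assumed; the point is only that the comparison has to be written so that it cannot overflow.

```c
#include <stdint.h>

/* Return 1 if the byte range [offset, offset + len) lies entirely inside
 * a member of member_len bytes, 0 otherwise. The subtraction form avoids
 * computing offset + len, which could wrap around for corrupt input. */
int segment_in_member(uint64_t offset, uint64_t len, uint64_t member_len)
{
	if (offset > member_len)
		return 0;
	return len <= member_len - offset;
}
```

A verifier that walks every segment header and applies this test would catch a header that points outside its member, but it necessarily has to understand the segment headers to do so.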
Ron once stated the ability to figure out whether the ROM is corrupt before you flash it is one of the key features of LAR. SELF completely destroys that feature.
Not at all - The opposite: It makes lar finally more solid again. The feature of checksumming (could have) worked nicely with the very first version of lar that I wrote. Just the actual checking was not in place as it is a minor issue when you can't even boot.
And since a pure LAR parser does not understand SELF, you need a combined SELF+LAR parser to know whether the ROM has a chance to boot.
We might as well kill LAR completely and move to SELF only (and then SELF slowly will become a bad reinvention of LAR).
You are seriously misunderstanding the concepts of LAR and SELF. While LAR is an archiver of arbitrary(!) files, SELF is the container for executable files. Mixing these two up was the biggest mistake in the history of the young LAR, making things irreversible and fragile.
The bootblock and initram are executable files. They should be contained in SELF as well (and that means you should be able to state that the code is XIP). Tables which should be loaded to a given address in memory are by definition NOT code, yet you have to supply a load address (something that will only be available to SELF), so SELF is in no way "the" container for executable files, but a container for files which need at least part of them loaded to a given address.
When writing software, one should think about the problem that is going to be solved. The reason for unpacking segments was to get rid of the ELF loader and being able to streamload code. This has nothing to do with anything I designed LAR for and the fact that the implementation was married into LAR was obviously a foul hack.
I agree that the current LAR implementation and design is a one-size-fits-all compromise. Unfortunately, the current SELF+LAR proposal is not able to keep the design simple and still perform all tasks we currently use LAR for.
I'm all for revising an existing design as long as the new design is better or equal in all aspects. LAR+SELF probably can be beaten into shape, but it won't resemble the current proposal.
Regards, Carl-Daniel
On Saturday, 12.04.2008, 13:16 +0200, Carl-Daniel Hailfinger wrote:
SELF is missing a checksum for each uncompressed segment.
It has a checksum for the whole file (in LAR), isn't that enough?
SELF is missing a per-section compression algorithm specifier and if you introduce one, you have a compression algorithm specifier both in LAR and in SELF. If you remove it from LAR, you have no way to store
BSS is defined to be the implicit-RLE-0 compressor. Everything else probably defaults to a single one (LZMA).
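To make the "implicit RLE-0" remark concrete: a BSS segment carries no payload bytes at all, and the loader simply zero-fills the destination. The sketch below uses invented names (`seg_kind`, `load_segment`) and a plain `memcpy` as a stand-in for whatever decompressor would really be used; it is not the proposed SELF loader.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical segment kinds for this sketch. */
enum seg_kind { SEG_DATA, SEG_BSS };

/* Place one segment at 'dest'. BSS has no stored bytes and is expanded
 * to dest_len zero bytes. Other segments are copied here; a real loader
 * would run a decompressor (e.g. LZMA) in place of memcpy. */
void load_segment(enum seg_kind kind, void *dest, size_t dest_len,
		  const void *src, size_t src_len)
{
	size_t n;

	if (kind == SEG_BSS) {
		memset(dest, 0, dest_len);
		return;
	}
	n = src_len < dest_len ? src_len : dest_len;
	memcpy(dest, src, n);
}
```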
Sorry, you misunderstood. The obvious speed penalty comes from unaligned accesses. Some architectures can't even handle unaligned accesses at all.
Uhm, once the LAR header is aligned (it is), and its size is a multiple of the alignment size, the data is aligned, too - right?
So you propose to handle some executable code (bootblock, raminit) just with LAR and not with SELF? How are you going to explain that concept a few years down the road?
That code doesn't need a BSS section, in case of the bootblock it doesn't even need an entry point (as far as I can see). So we actually have two classes of code here:
- initialization
- upper level
So yes, just adding the pre-stage6 stuff as simple binary seems totally obvious to me, if stage6 is the first one to carry a SELF parser.
Wrong. Checksums say nothing about corrupt internal SELF structure. The LAR parser has the ability (well, at least in theory) to check for overlapping archive members right now. With SELF, a corrupt SELF header can reference arbitrary places in the archive and a pure LAR parser can't find that out because it does not parse SELF by design.
What kind of corruption do you refer to? The one that happens before the file is put in the LAR? (so the LAR checksum matches) There shouldn't be any, if the SELF creator works right. The one that happens later should be caught by LAR's checksum.
And just like the SELF creator, the LAR creator could have a bug and create broken structures.
And since a pure LAR parser does not understand SELF, you need a combined SELF+LAR parser to know whether the ROM has a chance to boot.
Does your parser also check if the bootblock starts with RET? Or solve the halting problem?
The bootblock and initram are executable files. They should be contained in SELF as well (and that means you should be able to state that the code is XIP). Tables which should be loaded to a given address in memory
The bootblock isn't loaded by a SELF parser - even if it gets a suitable header, it would implicitly be XIP.
Regards, Patrick
On 12.04.2008 15:55, Patrick Georgi wrote:
On Saturday, 12.04.2008, 13:16 +0200, Carl-Daniel Hailfinger wrote:
SELF is missing a checksum for each uncompressed segment.
It has a checksum for the whole file (in LAR), isn't that enough?
That depends on whether you want to catch corruption during decompression.
SELF is missing a per-section compression algorithm specifier and if you introduce one, you have a compression algorithm specifier both in LAR and in SELF. If you remove it from LAR, you have no way to store
BSS is defined to be the implicit-RLE-0 compressor. Everything else probably defaults to a single one (LZMA).
I don't know whether we really want to make building an uncompressed image impossible.
Sorry, you misunderstood. The obvious speed penalty comes from unaligned accesses. Some architectures can't even handle unaligned accesses at all.
Uhm, once the LAR header is aligned (it is), and its size is a multiple of the alignment size, the data is aligned, too - right?
The LAR header is aligned, the SELF headers are aligned as well, but the individual SELF chunks containing LZMA headers are not.
So you propose to handle some executable code (bootblock, raminit) just with LAR and not with SELF? How are you going to explain that concept a few years down the road?
That code doesn't need a BSS section, in case of the bootblock it doesn't even need an entry point (as far as I can see). So we actually have two classes of code here:
- initialization
- upper level
So yes, just adding the pre-stage6 stuff as simple binary seems totally obvious to me, if stage6 is the first one to carry a SELF parser.
Sorry, what is stage 6?
Wrong. Checksums say nothing about corrupt internal SELF structure. The LAR parser has the ability (well, at least in theory) to check for overlapping archive members right now. With SELF, a corrupt SELF header can reference arbitrary places in the archive and a pure LAR parser can't find that out because it does not parse SELF by design.
What kind of corruption do you refer to? The one that happens before the file is put in the LAR? (so the LAR checksum matches) There shouldn't be any, if the SELF creator works right.
If the SELF creator works right. Exactly. If it produces random garbage, LAR has no way to know.
The one that happens later should be caught by LAR's checksum.
And just like the SELF creator, the LAR creator could have a bug and create broken structures.
Indeed. However, right now a LAR verifier could find all broken structures. With the new design, we would need a LAR+SELF verifier.
And since a pure LAR parser does not understand SELF, you need a combined SELF+LAR parser to know whether the ROM has a chance to boot.
Does your parser also check if the bootblock starts with RET? Or solve the halting problem?
No, but it can make sure that accessing any LAR member will not try to reference memory outside the LAR member. Accessing a SELF chunk can easily reference memory outside the LAR member and even outside the archive.
The bootblock and initram are executable files. They should be contained in SELF as well (and that means you should be able to state that the code is XIP). Tables which should be loaded to a given address in memory
The bootblock isn't loaded by a SELF parser - even if it gets a suitable header, it would implicitly be XIP.
OK, agreed for the bootblock. But how do you mark initram XIP?
Regards, Carl-Daniel
On 12.04.2008 at 07:52, Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net> wrote:
On 12.04.2008 15:55, Patrick Georgi wrote:
On Saturday, 12.04.2008, 13:16 +0200, Carl-Daniel Hailfinger wrote:
SELF is missing a checksum for each uncompressed segment.
It has a checksum for the whole file (in LAR), isn't that enough?
That depends on whether you want to catch corruption during decompression.
This is bogus, sorry.
SELF is missing a per-section compression algorithm specifier and if you introduce one, you have a compression algorithm specifier both in LAR and in SELF. If you remove it from LAR, you have no way to store
BSS is defined to be the implicit-RLE-0 compressor. Everything else probably defaults to a single one (LZMA).
I don't know whether we really want to make building an uncompressed image impossible.
No, instead we make this stuff well defined.
Sorry, you misunderstood. The obvious speed penalty comes from unaligned accesses. Some architectures can't even handle unaligned accesses at all.
Uhm, once the LAR header is aligned (it is), and its size is a multiple of the alignment size, the data is aligned, too - right?
The LAR header is aligned, the SELF headers are aligned as well, but the individual SELF chunks containing LZMA headers are not.
Wrong. Besides, at least nrv2b is byte streaming.
So you propose to handle some executable code (bootblock, raminit) just with LAR and not with SELF? How are you going to explain that concept a few years down the road?
That code doesn't need a BSS section, in case of the bootblock it doesn't even need an entry point (as far as I can see). So we actually have two classes of code here:
- initialization
- upper level
So yes, just adding the pre-stage6 stuff as simple binary seems totally obvious to me, if stage6 is the first one to carry a SELF parser.
Sorry, what is stage 6?
Wrong. Checksums say nothing about corrupt internal SELF structure. The LAR parser has the ability (well, at least in theory) to check for overlapping archive members right now. With SELF, a corrupt SELF header can reference arbitrary places in the archive and a pure LAR parser can't find that out because it does not parse SELF by design.
What kind of corruption do you refer to? The one that happens before the file is put in the LAR? (so the LAR checksum matches) There shouldn't be any, if the SELF creator works right.
If the SELF creator works right. Exactly. If it produces random garbage, LAR has no way to know.
That's a bogus reason. The current code could just as easily produce garbage and checksum it right. Making per segment checksums does not improve the situation at all.
The one that happens later should be caught by LAR's checksum.
And just like the SELF creator, the LAR creator could have a bug and create broken structures.
Indeed. However, right now a LAR verifier could find all broken structures. With the new design, we would need a LAR+SELF verifier.
Not at all. LAR takes care of the integrity. That's what I designed it to do.
And since a pure LAR parser does not understand SELF, you need a combined SELF+LAR parser to know whether the ROM has a chance to boot.
Does your parser also check if the bootblock starts with RET? Or solve the halting problem?
No, but it can make sure that accessing any LAR member will not try to reference memory outside the LAR member. Accessing a SELF chunk can easily reference memory outside the LAR member and even outside the archive.
No, because that would mean the file is corrupt and the LAR checksum would catch that.
The bootblock and initram are executable files. They should be contained in SELF as well (and that means you should be able to state that the code is XIP). Tables which should be loaded to a given address in memory
The bootblock isn't loaded by a SELF parser; even if it gets a suitable header, it would implicitly be XIP.
OK, agreed for the bootblock. But how do you mark initram XIP?
Initram is not in segments now either. You know it's XIP. To LAR it's just a hunk of data.
Hi Stefan,
since you seem to disagree violently with my conclusions although most of my conclusions are supported by the SELF wiki page, the only explanation is that you disagree with the SELF wiki page as well.
On 12.04.2008 20:44, Stefan Reinauer wrote:
On 12.04.2008 at 07:52, Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net> wrote:
On 12.04.2008 15:55, Patrick Georgi wrote:
On Saturday, 12.04.2008, 13:16 +0200, Carl-Daniel Hailfinger wrote:
SELF is missing a per-section compression algorithm specifier and if you introduce one, you have a compression algorithm specifier both in LAR and in SELF. If you remove it from LAR, you have no way to store
BSS is defined to be the implicit-RLE-0 compressor. Everything else probably defaults to a single one (LZMA).
I don't know whether we really want to make building an uncompressed image impossible.
No, instead we make this stuff well defined.
Let me quote the current SELF wiki page: "CODE and DATA sections _may_ be compressed". Without a way to find out whether these sections are compressed, compression is totally undefined. Feel free to edit the wiki page to make this well-defined.
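To make the "implicit-RLE-0" idea for BSS concrete, here is a minimal sketch of a loader for uncompressed segments. The segment types, the `struct segment` layout and `load_segment()` are all hypothetical names chosen for illustration; they are not taken from the SELF wiki page, and compressed segments are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical segment types; the real SELF values may differ. */
enum seg_type { SEG_CODE = 1, SEG_DATA = 2, SEG_BSS = 3 };

/* Hypothetical segment descriptor, for illustration only. */
struct segment {
    enum seg_type type;
    const uint8_t *src;   /* stored bytes; unused for BSS */
    size_t stored_len;    /* bytes present in the image; 0 for BSS */
    size_t mem_len;       /* bytes occupied once loaded */
};

/* Load one uncompressed segment to dst. Returns 0 on success, -1 on error. */
static int load_segment(const struct segment *seg, uint8_t *dst)
{
    if (seg->type == SEG_BSS) {
        /* BSS stores no bytes at all: "decompressing" it is a zero fill. */
        memset(dst, 0, seg->mem_len);
        return 0;
    }
    if (seg->stored_len > seg->mem_len)
        return -1;        /* descriptor is inconsistent */
    memcpy(dst, seg->src, seg->stored_len);
    /* Zero any tail that is larger in memory than in the image. */
    memset(dst + seg->stored_len, 0, seg->mem_len - seg->stored_len);
    return 0;
}
```

Whether CODE and DATA segments additionally need a flag saying "this one is compressed" is exactly the open question in this thread; the sketch does not settle it.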
Sorry, you misunderstood. The obvious speed penalty comes from unaligned accesses. Some architectures can't even handle unaligned accesses at all.
Uhm, once the LAR header is aligned (it is), and its size is a multiple of the alignment size, the data is aligned, too - right?
The LAR header is aligned, the SELF headers are aligned as well, but the individual SELF chunks containing LZMA headers are not.
Wrong. Besides, at least nrv2b is byte streaming.
Let me quote the current wiki page again: "The data section immediately follows the final entry in the segment table. Each block of segment data is written sequentially in this section." There is no mention of any alignment of the LZMA header at all, and the "written sequentially" suggests there are no gaps. Please correct the wiki page if you disagree with it.
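To illustrate the alignment point: if segment data is packed back to back, a multi-byte header field can start at any byte offset. A minimal sketch of the two usual ways around that follows, assuming a little-endian 32-bit field; `read_le32()` and `align_up()` are hypothetical helpers, not functions from the coreboot tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value from any address, aligned or not.
 * Only byte loads are used, so this is safe on architectures that
 * fault on unaligned word accesses. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Alternative: pad when writing the image. Rounds off up to the next
 * multiple of align, which must be a power of two. */
static size_t align_up(size_t off, size_t align)
{
    return (off + align - 1) & ~(align - 1);
}
```

The first approach costs a few extra instructions per field at run time; the second costs padding bytes in the ROM and would need to be stated in the format description.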
Wrong. Checksums say nothing about corrupt internal SELF structure. The LAR parser has the ability (well, at least in theory) to check for overlapping archive members right now. With SELF, a corrupt SELF header can reference arbitrary places in the archive and a pure LAR parser can't find that out because it does not parse SELF by design.
What kind of corruption do you refer to? The one that happens before the file is put in the LAR? (so the LAR checksum matches) There shouldn't be any, if the SELF creator works right.
If the SELF creator works right. Exactly. If it produces random garbage, LAR has no way to know.
That's a bogus reason. The current code could just as easily produce garbage and checksum it right. Making per segment checksums does not improve the situation at all.
My point here was not about checksums. LAR currently treats ELF files as binary blobs if you don't preparse them. AFAICS Ron wanted the preparsing as sort of a security mechanism to determine whether the ELF looks like a real ELF file. If SELF is treated as a blob by LAR we lose that advantage.
The one that happens later should be caught by LAR's checksum.
And just like the SELF creator, the LAR creator could have a bug and create broken structures.
Indeed. However, right now a LAR verifier could find all broken structures. With the new design, we would need a LAR+SELF verifier.
Not at all. LAR takes care of the integrity. That's what I designed it to do.
Does it take care of the integrity of each SELF file as well?
And since a pure LAR parser does not understand SELF, you need a combined SELF+LAR parser to know whether the ROM has a chance to boot.
Does your parser also check if the bootblock starts with RET? Or solve the halting problem?
No, but it can make sure that accessing any LAR member will not try to reference memory outside the LAR member. Accessing a SELF chunk can easily reference memory outside the LAR member and even outside the archive.
No, because that would mean the file is corrupt and the LAR checksum would catch that.
Scenario: You unpack a LAR. The extracted SELF file becomes corrupted somehow. You pack it into another LAR. That LAR now has a corrupt SELF file and you won't notice this during LAR creation.
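The scenario can be sketched in a few lines. A checksum computed at pack time covers whatever bytes the packer was handed, so it detects changes made after packing but not damage done before. The byte-sum checksum and the `toy_*` names below are purely hypothetical and are not the real LAR algorithm or data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical checksum (plain byte sum), NOT the real LAR algorithm. */
static uint32_t toy_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

/* Hypothetical archive member: payload plus checksum taken at pack time. */
struct toy_member {
    const uint8_t *data;
    size_t len;
    uint32_t checksum;
};

/* Pack: the checksum is computed over the bytes as they are right now. */
static struct toy_member toy_pack(const uint8_t *data, size_t len)
{
    struct toy_member m = { data, len, toy_checksum(data, len) };
    return m;
}

/* Verify: returns 1 if the payload still matches the stored checksum. */
static int toy_verify(const struct toy_member *m)
{
    return toy_checksum(m->data, m->len) == m->checksum;
}
```

A payload that was already damaged when `toy_pack()` ran verifies fine afterwards; only a change made after packing is caught. Whether the archiver should go further and inspect the payload's internal structure is the design question being argued here.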
The bootblock and initram are executable files. They should be contained in SELF as well (and that means you should be able to state that the code is XIP). Tables which should be loaded to a given address in memory
The bootblock isn't loaded by a SELF parser; even if it gets a suitable header, it would implicitly be XIP.
OK, agreed for the bootblock. But how do you mark initram XIP?
Initram is not in segments now either. You know it's XIP. To LAR it's just a hunk of data.
OK.
Regards, Carl-Daniel
Carl-Daniel Hailfinger wrote:
Hi Stefan,
since you seem to disagree violently with my conclusions although most of my conclusions are supported by the SELF wiki page, the only explanation is that you disagree with the SELF wiki page as well.
Not at all.
Wrong. Besides, at least nrv2b is byte streaming.
Let me quote the current wiki page again: "The data section immediately follows the final entry in the segment table. Each block of segment data is written sequentially in this section." There is no mention of any alignment of the LZMA header at all, and the "written sequentially" suggests there are no gaps. Please correct the wiki page if you disagree with it.
The wiki page did not mention it was complete either.
I am sorry it is not yet fool-proof enough for you to understand the goals of what we were working on. _Please_ stop nagging out of principle in this thread. All our compression algorithms are actually byte based, so you are completely discussing non-issues here. I wonder why you do this, because as far as I know, you were the one making the original lzma patch for coreboot - which unfortunately increased stack requirements by more than 16k. Something that will likely cause corruption on quite a number of boards. grep for "default STACK_SIZE" in the source tree. So let's please focus on fixing the actual issues possibly causing corruption before we discuss things that do not affect us.
Not at all. LAR takes care of the integrity. That's what I designed it to do.
Does it take care of the integrity of each SELF file as well?
Yes. As well as it does for any other file.
No, because that would mean the file is corrupt and the LAR checksum would catch that.
Scenario: You unpack a LAR. The extracted SELF file becomes corrupted somehow. You pack it into another LAR. That LAR now has a corrupt SELF file and you won't notice this during LAR creation.
Then you want to make SELF an exception here, compared to any other files in the LAR? How is corruption on disk not affecting the bootblock or any other file in the LAR, if you unpack it? Please, in your further discussion, also keep in mind that the current incarnation of LAR, besides all its broken section handling, also does not cover what you are talking about.
We did not try to catch disk errors during development. Neither did we try to catch mis-compilations or programming errors in our own code during runtime. Those are indeed outside of the scope and trying to catch them at runtime does not make a whole lot of sense.
Stefan
On Sat, Apr 12, 2008 at 01:16:00PM +0200, Carl-Daniel Hailfinger wrote:
On 12.04.2008 10:28, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
NAK. This design is so unfinished it is not even funny. Hint: Certain fields in the LAR header are there for a reason (guess which).
Yes, and because that concept is so incredibly broken beyond belief, we fixed it up at the summit. Those completely archive-unrelated fields which just sneaked into lar because we did not do our reviews thoroughly enough will finally go away again. Way to go.
SELF is missing a checksum for each uncompressed segment.
Is that really needed? That means we expect the decompression to fail silently.
If LAR provides a reliable transport then why add another checksum?
SELF is missing a per-section compression algorithm specifier
Does it need one? Either the entire SELF is compressed, or not. Again, LAR does the work. No?
Sorry, you misunderstood. The obvious speed penalty comes from unaligned accesses. Some architectures can't even handle unaligned accesses at all.
Can't this be solved generically in code like in memcpy_helper() ?
The concept of PIC is missing completely.
Because it is not needed.
So you propose to handle some executable code (bootblock, raminit) just with LAR and not with SELF? How are you going to explain that concept a few years down the road?
KISS. Since the early LAR files are simple binaries they don't need to be SELF.
A LAR parser can't figure out if the archive is corrupt.
Plain wrong. There are checksums in the LAR, and that won't change.
Wrong. Checksums say nothing about corrupt internal SELF structure. The LAR parser has the ability (well, at least in theory) to check for overlapping archive members right now. With SELF, a corrupt SELF header
First of all, how would it end up corrupt? It must then be corrupt upon SELF creation, because if it changed in LAR, the LAR checksum would not match anymore.
can reference arbitrary places in the archive and a pure LAR parser can't find that out because it does not parse SELF by design.
Why does this matter, in practice? More than one SELF will never be up in the air. And so what if it references locations in the LAR?
Ron once stated the ability to figure out whether the ROM is corrupt before you flash it is one of the key features of LAR.
And since a pure LAR parser does not understand SELF, you need a combined SELF+LAR parser to know whether the ROM has a chance to boot.
What would be checked in the SELF? Isn't it good enough that there is a SELF with the right name?
I think it is impossible to reliably determine boot success ahead of time. Only trying to boot will show the actual outcome, especially since we want to reflash individual LAR files. I guess I disagree with Ron's statement. LAR is very nice, but it can't predict the future. :)
Instead the coreboot panic room will be the key feature in this domain. The fact that you don't just get beeps from a speaker but an actual way into your system regardless.
LAR is an archiver of arbitrary(!) files, SELF is the container for executable files. Mixing these two up was the biggest mistake in the history of the young LAR, making things irreversible and fragile.
Word.
The bootblock and initram are executable files. They should be contained in SELF as well
I don't think that is necessary.
Tables which should be loaded to a given address in memory are by definition NOT code, yet you have to supply a load address
Hm, what kind of tables? What would this be that is just copied from LAR using some generic code rather than code that knows about the necessary address already?
Unfortunately, the current SELF+LAR proposal is not able to keep the design simple and still perform all tasks we currently use LAR for.
This is where I'm lost. Please explain what would not work anymore?
//Peter
On 12.04.2008 16:14, Peter Stuge wrote:
On Sat, Apr 12, 2008 at 01:16:00PM +0200, Carl-Daniel Hailfinger wrote:
On 12.04.2008 10:28, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
NAK. This design is so unfinished it is not even funny. Hint: Certain fields in the LAR header are there for a reason (guess which).
Yes, and because that concept is so incredibly broken beyond belief, we fixed it up at the summit. Those completely archive-unrelated fields which just sneaked into lar because we did not do our reviews thoroughly enough will finally go away again. Way to go.
SELF is missing a checksum for each uncompressed segment.
Is that really needed? That means we expect the decompression to fail silently.
That could happen.
If LAR provides a reliable transport then why add another checksum?
Memory corruption during decompression and miscompilation of the decompression code come to mind.
SELF is missing a per-section compression algorithm specifier
Does it need one? Either the entire SELF is compressed, or not.
Uh, no. The SELF design says that only some SELF segments may be compressed, others must not be compressed.
Again, LAR does the work. No?
See above. I wouldn't have complained that much if you were right.
Sorry, you misunderstood. The obvious speed penalty comes from unaligned accesses. Some architectures can't even handle unaligned accesses at all.
Can't this be solved generically in code like in memcpy_helper() ?
Are you willing to copy the lzma header for every lzma-compressed SELF segment to some reserved place in RAM? Are you willing to rewrite the decompression code so that it can handle a detached lzma header?
The concept of PIC is missing completely.
Because it is not needed.
So you propose to handle some executable code (bootblock, raminit) just with LAR and not with SELF? How are you going to explain that concept a few years down the road?
KISS. Since the early LAR files are simple binaries they don't need to be SELF.
raminit needs an entry point, so it has to be SELF (unless you propose that LAR and SELF both specify an entry point).
A LAR parser can't figure out if the archive is corrupt.
Plain wrong. There are checksums in the LAR, and that won't change.
Wrong. Checksums say nothing about corrupt internal SELF structure. The LAR parser has the ability (well, at least in theory) to check for overlapping archive members right now. With SELF, a corrupt SELF header
First of all, how would it end up corrupt? It must then be corrupt upon SELF creation, because if it changed in LAR, the LAR checksum would not match anymore.
See my other mail in response to Patrick.
can reference arbitrary places in the archive and a pure LAR parser can't find that out because it does not parse SELF by design.
Why does this matter, in practice? More than one SELF will never be up in the air. And so what if it references locations in the LAR?
A LAR archive can have multiple SELFs and each of them can reference arbitrary memory and the LAR parser has no way to check that. A SELF parser could check that, but then it would have to know about the LAR structure which is a layering violation.
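As an illustration of the kind of check under discussion, the following sketch tests whether every entry in a segment table stays inside the archive member that contains it. The `struct seg_entry` layout and `segments_in_bounds()` are hypothetical; the real SELF segment table has more fields and may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment table entry: where the segment's bytes live,
 * relative to the start of the containing archive member. */
struct seg_entry {
    uint32_t offset;
    uint32_t length;
};

/* Return 1 if every segment lies entirely inside a member of
 * member_size bytes, 0 otherwise. Written so that offset + length
 * can never overflow. */
static int segments_in_bounds(const struct seg_entry *segs, size_t count,
                              uint32_t member_size)
{
    for (size_t i = 0; i < count; i++) {
        if (segs[i].offset > member_size)
            return 0;
        if (segs[i].length > member_size - segs[i].offset)
            return 0;
    }
    return 1;
}
```

In this sketch the only thing the check needs from the archive layer is the member size, passed in as a plain number. Which tool should run such a check (the archiver, the SELF creator, or a combined verifier) is the point the thread does not agree on.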
Ron once stated the ability to figure out whether the ROM is corrupt before you flash it is one of the key features of LAR.
And since a pure LAR parser does not understand SELF, you need a combined SELF+LAR parser to know whether the ROM has a chance to boot.
What would be checked in the SELF? Isn't it good enough that there is a SELF with the right name?
Currently, the ELF-to-LAR parser has a reasonable chance to find out whether the structure of the ELF is crap. If LAR simply handles a SELF as opaque object, it will not notice at all if the SELF is crap.
I think it is impossible to reliably determine boot success ahead of time. Only trying to boot will show the actual outcome, especially since we want to reflash individual LAR files. I guess I disagree with Ron's statement. LAR is very nice, but it can't predict the future. :)
What? And I trusted its baseball result predictions for 2009 ;-)
Instead the coreboot panic room will be the key feature in this domain. The fact that you don't just get beeps from a speaker but an actual way into your system regardless.
Are we guaranteed to reach the panic room if the SELF points to arbitrary memory?
LAR is an archiver of arbitrary(!) files, SELF is the container for executable files. Mixing these two up was the biggest mistake in the history of the young LAR, making things irreversible and fragile.
Word.
The bootblock and initram are executable files. They should be contained in SELF as well
I don't think that is necessary.
initram needs an entry point. Storing an entry point in LAR and in SELF is really bad design.
Tables which should be loaded to a given address in memory are by definition NOT code, yet you have to supply a load address
Hm, what kind of tables? What would this be that is just copied from LAR using some generic code rather than code that knows about the necessary address already?
Hm. I know of no such table yet. What about option ROMs?
Unfortunately, the current SELF+LAR proposal is not able to keep the design simple and still perform all tasks we currently use LAR for.
This is where I'm lost. Please explain what would not work anymore?
I hope I listed most of my concerns above.
Regards, Carl-Daniel
On 12.04.2008 at 08:06, Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net> wrote:
On 12.04.2008 16:14, Peter Stuge wrote:
On Sat, Apr 12, 2008 at 01:16:00PM +0200, Carl-Daniel Hailfinger wrote:
On 12.04.2008 10:28, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
NAK. This design is so unfinished it is not even funny. Hint: Certain fields in the LAR header are there for a reason (guess which).
Yes, and because that concept is so incredibly broken beyond belief, we fixed it up at the summit. Those completely archive-unrelated fields which just sneaked into lar because we did not do our reviews thoroughly enough will finally go away again. Way to go.
SELF is missing a checksum for each uncompressed segment.
Is that really needed? That means we expect the decompression to fail silently.
That could happen.
If LAR provides a reliable transport then why add another checksum?
Memory corruption during decompression and miscompilation of the decompression code come to mind.
You want to catch compiler bugs during runtime? Come on let's get serious again and deal with the real world.
SELF is missing a per-section compression algorithm specifier
Does it need one? Either the entire SELF is compressed, or not.
Uh, no. The SELF design says that only some SELF segments may be compressed, others must not be compressed.
Right. Using the LAR-specified compression algorithm. So?
Keep in mind that we handle segments much differently than xip and blobs already.
Again, LAR does the work. No?
See above. I wouldn't have complained that much if you were right.
Ehem
Sorry, you misunderstood. The obvious speed penalty comes from unaligned accesses. Some architectures can't even handle unaligned accesses at all.
Can't this be solved generically in code like in memcpy_helper() ?
Are you willing to copy the lzma header for every lzma-compressed SELF segment to some reserved place in RAM? Are you willing to rewrite the decompression code so that it can handle a detached lzma header?
The concept of PIC is missing completely.
Because it is not needed.
So you propose to handle some executable code (bootblock, raminit) just with LAR and not with SELF? How are you going to explain that concept a few years down the road?
KISS. Since the early LAR files are simple binaries they don't need to be SELF.
raminit needs an entry point, so it has to be SELF (unless you propose that LAR and SELF both specify an entry point).
At some point the entry point to a blob was at the beginning of the file. Adding the flexibility to begin with moved complexity from compile time to run time -- guess what, that was a bad idea.
A LAR parser can't figure out if the archive is corrupt.
Plain wrong. There are checksums in the LAR, and that won't change.
Wrong. Checksums say nothing about corrupt internal SELF structure. The LAR parser has the ability (well, at least in theory) to check for overlapping archive members right now. With SELF, a corrupt SELF header
First of all, how would it end up corrupt? It must then be corrupt upon SELF creation, because if it changed in LAR, the LAR checksum would not match anymore.
See my other mail in response to Patrick.
can reference arbitrary places in the archive and a pure LAR parser can't find that out because it does not parse SELF by design.
Why does this matter, in practice? More than one SELF will never be up in the air. And so what if it references locations in the LAR?
A LAR archive can have multiple SELFs and each of them can reference arbitrary memory and the LAR parser has no way to check that. A SELF parser could check that, but then it would have to know about the LAR structure which is a layering violation.
What problem are you trying to catch? Right now this is just artificial.
Ron once stated the ability to figure out whether the ROM is corrupt before you flash it is one of the key features of LAR.
And since a pure LAR parser does not understand SELF, you need a combined SELF+LAR parser to know whether the ROM has a chance to boot.
What would be checked in the SELF? Isn't it good enough that there is a SELF with the right name?
Currently, the ELF-to-LAR parser has a reasonable chance to find out whether the structure of the ELF is crap. If LAR simply handles a SELF as opaque object, it will not notice at all if the SELF is crap.
In the real world this is not true.
I think it is impossible to reliably determine boot success ahead of time. Only trying to boot will show the actual outcome, especially since we want to reflash individual LAR files. I guess I disagree with Ron's statement. LAR is very nice, but it can't predict the future. :)
What? And I trusted its baseball result predictions for 2009 ;-)
Instead the coreboot panic room will be the key feature in this domain. The fact that you don't just get beeps from a speaker but an actual way into your system regardless.
Are we guaranteed to reach the panic room if the SELF points to arbitrary memory?
Because of what? Compiler bugs? In that case the miscompiled panic room will not help us.
LAR is an archiver of arbitrary(!) files, SELF is the container for executable files. Mixing these two up was the biggest mistake in the history of the young LAR, making things irreversible and fragile.
Word.
The bootblock and initram are executable files. They should be contained in SELF as well
I don't think that is necessary.
initram needs an entry point. Storing an entry point in LAR and in SELF is really bad design.
Wrong and right.
Tables which should be loaded to a given address in memory are by definition NOT code, yet you have to supply a load address
Hm, what kind of tables? What would this be that is just copied from LAR using some generic code rather than code that knows about the necessary address already?
Hm. I know of no such table yet. What about option ROMs?
They are handled already
Unfortunately, the current SELF+LAR proposal is not able to keep the design simple and still perform all tasks we currently use LAR for.
This is where I'm lost. Please explain what would not work anymore?
I hope I listed most of my concerns above.
Regards, Carl-Daniel
@Peter: One apology for you near 2/3 of the mail. @Segher: One gcc question for you near the end of the mail.
On 12.04.2008 21:07, Stefan Reinauer wrote:
On 12.04.2008 at 08:06, Carl-Daniel Hailfinger wrote:
On 12.04.2008 16:14, Peter Stuge wrote:
On Sat, Apr 12, 2008 at 01:16:00PM +0200, Carl-Daniel Hailfinger wrote:
On 12.04.2008 10:28, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
NAK. This design is so unfinished it is not even funny. Hint: Certain fields in the LAR header are there for a reason (guess which).
Yes, and because that concept is so incredibly broken beyond belief, we fixed it up at the summit. Those completely archive-unrelated fields which just sneaked into lar because we did not do our reviews thoroughly enough will finally go away again. Way to go.
SELF is missing a checksum for each uncompressed segment.
Is that really needed? That means we expect the decompression to fail silently.
That could happen.
If LAR provides a reliable transport then why add another checksum?
Memory corruption during decompression and miscompilation of the decompression code come to mind
You want to catch compiler bugs during runtime? Come on let's get serious again and deal with the real world.
Some Linux kernel verification mechanisms have caught compiler bugs in the past. That's why I considered the possibility to do this as well.
SELF is missing a per-section compression algorithm specifier
Does it need one? Either the entire SELF is compressed, or not.
Uh, no. The SELF design says that only some SELF segments may be compressed, others must not be compressed.
Right. Using the LAR-specified compression algorithm. So?
Layering violation, but doable.
Keep in mind that we handle segments much differently than xip and blobs already.
OK.
Again, LAR does the work. No?
See above. I wouldn't have complained that much if you were right
Ehem
@Peter: Sorry, this came across in a way that was unintended. Let me try to rephrase: I would not have complained if the SELF wiki page had agreed with Peter (because I agree with many of his points). However, I understood the wiki page as reference spec and that meant Peter was conflicting ("wrong") with it. I prefer Peter's and Stefan's opinion over the "spec" in the wiki page any time and hope both of them will edit the wiki page to match their ideas about SELF.
The concept of PIC is missing completely.
Because it is not needed.
So you propose to handle some executable code (bootblock, raminit) just with LAR and not with SELF? How are you going to explain that concept a few years down the road?
KISS. Since the early LAR files are simple binaries they don't need to be SELF.
raminit needs an entry point, so it has to be SELF (unless you propose that LAR and SELF both specify an entry point).
At some point the entry point to a blob was at the beginning of the file. Adding the flexibility to begin with moved complexity from compile time to run time -- guess what, that was a bad idea.
Unless you fork gcc, having the entry point at the beginning of initram is not possible. It worked by accident in the past and I'm pretty sure Segher will confirm that we can't rely on that.
A LAR parser can't figure out if the archive is corrupt.
Plain wrong. There are checksums in the LAR, and that won't change.
Wrong. Checksums say nothing about corrupt internal SELF structure. The LAR parser has the ability (well, at least in theory) to check for overlapping archive members right now. With SELF, a corrupt SELF header
First of all, how would it end up corrupt? It must then be corrupt upon SELF creation, because if it changed in LAR, the LAR checksum would not match anymore.
See my other mail in response to Patrick.
can reference arbitrary places in the archive and a pure LAR parser can't find that out because it does not parse SELF by design.
Why does this matter, in practice? More than one SELF will never be up in the air. And so what if it references locations in the LAR?
A LAR archive can have multiple SELFs and each of them can reference arbitrary memory and the LAR parser has no way to check that. A SELF parser could check that, but then it would have to know about the LAR structure which is a layering violation.
What problem are you trying to catch? Right now this is just artificial.
The extract SELF from LAR, corrupt SELF, add SELF to another LAR scenario from my other mail (~10 minutes ago).
Ron once stated the ability to figure out whether the ROM is corrupt before you flash it is one of the key features of LAR.
And since a pure LAR parser does not understand SELF, you need a combined SELF+LAR parser to know whether the ROM has a chance to boot.
What would be checked in the SELF? Isn't it good enough that there is a SELF with the right name?
Currently, the ELF-to-LAR parser has a reasonable chance to find out whether the structure of the ELF is crap. If LAR simply handles a SELF as opaque object, it will not notice at all if the SELF is crap.
In the real world this is not true.
What's not true in the real world? Will LAR have knowledge of SELF and parse/check/verify/etc it?
I think it is impossible to reliably determine boot success ahead of time. Only trying to boot will show the actual outcome, especially since we want to reflash individual LAR files. I guess I disagree with Ron's statement. LAR is very nice, but it can't predict the future. :)
What? And I trusted its baseball result predictions for 2009 ;-)
Instead the coreboot panic room will be the key feature in this domain. The fact that you don't just get beeps from a speaker but an actual way into your system regardless.
Are we guaranteed to reach the panic room if the SELF points to arbitrary memory?
Because of what? Compiler bugs? In that case the miscompiled panic room will not help us.
Because of the scenario above.
LAR is an archiver of arbitrary(!) files, SELF is the container for executable files. Mixing these two up was the biggest mistake in the history of the young LAR, making things irreversible and fragile.
Word.
The bootblock and initram are executable files. They should be contained in SELF as well
I don't think that is necessary.
initram needs an entry point. Storing an entry point in LAR and in SELF is really bad design.
Wrong
For the initram entry point discussion see my response above.
and right.
That means we have to store the entry point for initram somewhere.
Tables which should be loaded to a given address in memory are by definition NOT code, yet you have to supply a load address
Hm, what kind of tables? What would this be that is just copied from LAR using some generic code rather than code that knows about the necessary address already?
Hm. I know of no such table yet. What about option ROMs?
They are handled already
As SELF or as simple LAR member?
Regards, Carl-Daniel
Some Linux kernel verification mechanisms have caught compiler bugs in the past. That's why I considered the possibility to do this as well.
The only ones that I am aware of are *generic* sanity checks in the kernel, that manage to trip on something that turned out to be a compiler bug. Adding specific sanity checks for every piece of code that only checks for compiler bugs is foolishness (hint: it's impossible to ever get even 1% cover; and that check code itself is run through that same compiler anyway!)
If there is a *specific* compiler bug that you are aware of, it is nicer to catch it at build time, if possible, or just refuse to build with the broken compiler.
At some point the entry point to a blob was at the beginning of the file. Adding the flexibility to begin with moved complexity from compile time to run time -- guess what, that was a bad idea.
Unless you fork gcc, having the entry point at the beginning of initram is not possible. It worked by accident in the past and I'm pretty sure Segher will confirm that we can't rely on that.
No, I confirm you *can* rely on that: just set up your linker script and your crt.o so this works. Something like
.start : { crt0.o(*) }
.text : { *(.text) }
.data : { *(.data) }
will guarantee that all of crt0 ends up before anything else (and there is no need to make it a separate section, FWIW).
Segher
On 13.04.2008 00:59, Segher Boessenkool wrote:
Some Linux kernel verification mechanisms have caught compiler bugs in the past. That's why I considered the possibility to do this as well.
The only ones that I am aware of are *generic* sanity checks in the kernel, that manage to trip on something that turned out to be a compiler bug. Adding specific sanity checks for every piece of code that only checks for compiler bugs is foolishness (hint: it's impossible to ever get even 1% cover; and that check code itself is run through that same compiler anyway!)
Well, possibly catching compiler bugs would have been a nice (yet improbable) side effect. The decompression of a large chunk of data is probably taxing memory a bit more than our current simple memtest implementation and checksumming the result may have uncovered bugs in RAM configuration. But we can postpone/skip that.
If there is a *specific* compiler bug that you are aware of, it is nicer to catch it at build time, if possible, or just refuse to build with the broken compiler.
At some point the entry point to a blob was at the beginning of the file. Adding the flexibility to begin with moved complexity from compile time to run time -- guess what, that was a bad idea.
Unless you fork gcc, having the entry point at the beginning of initram is not possible. It worked by accident in the past and I'm pretty sure Segher will confirm that we can't rely on that.
No, I confirm you *can* rely on that: just set up your linker script and your crt.o so this works. Something like
.start : { crt0.o(*) }
.text : { *(.text) }
.data : { *(.data) }
will guarantee that all of crt0 ends up before anything else (and there is no need to make it a separate section, FWIW).
Hm. Does that really apply to the initram binary we produce in v3? AFAIK we don't link any crt0 into that binary. Here are the commands used to generate coreboot.initram:
$(CC) $(INITCFLAGS) -D_SHARED -fPIE -c -combine $(INITRAM_OBJ) -o $(obj)/coreboot.initram_partiallylinked.o
$(LD) -Ttext 0 --entry main -N -R $(obj)/stage0-prefixed.o $(obj)/coreboot.initram_partiallylinked.o -o $(obj)/coreboot.initram
The entry point(main) has to end up at byte 0 of coreboot.initram if we want to avoid storing it outside the file. Don't let yourself be confused by $(INITRAM_OBJ) - the variable refers to a few source files.
Regards, Carl-Daniel
Some Linux kernel verification mechanisms have caught compiler bugs in the past. That's why I considered the possibility to do this as well.
The only ones that I am aware of are *generic* sanity checks in the kernel, that manage to trip on something that turned out to be a compiler bug. Adding specific sanity checks for every piece of code that only checks for compiler bugs is foolishness (hint: it's impossible to ever get even 1% cover; and that check code itself is run through that same compiler anyway!)
Well, possibly catching compiler bugs would have been a nice (yet improbable) side effect. The decompression of a large chunk of data is probably taxing memory a bit more than our current simple memtest implementation and checksumming the result may have uncovered bugs in RAM configuration. But we can postpone/skip that.
IMHO, if you want to test RAM, it is better to test RAM explicitly than to hope some simple checksum of <some random data> will catch this. It's also a lot cleaner design, of course.
Unless you fork gcc, having the entry point at the beginning of initram is not possible. It worked by accident in the past and I'm pretty sure Segher will confirm that we can't rely on that.
No, I confirm you *can* rely on that: just set up your linker script and your crt.o so this works. Something like
.start : { crt0.o(*) }
.text : { *(.text) }
.data : { *(.data) }
will guarantee that all of crt0 ends up before anything else (and there is no need to make it a separate section, FWIW).
Hm. Does that really apply to the initram binary we produce in v3?
I don't know what the current v3 code does. I'm just replying to your "cannot be done, it's GCC's fault", while GCC isn't even in the picture if your code is sane at all.
AFAIK we don't link any crt0 into that binary.
You have *some* equivalent code, for sure. Some startup code that initialises things to conform to the ABI environment expected by the compiler (initial stack frame, segments, GOT/TOC/PLT/TLS pointers, etc -- that kind of thing).
Here are the commands used to generate coreboot.initram:
$(CC) $(INITCFLAGS) -D_SHARED -fPIE -c -combine $(INITRAM_OBJ) -o $(obj)/coreboot.initram_partiallylinked.o
$(LD) -Ttext 0 --entry main -N -R $(obj)/stage0-prefixed.o $(obj)/coreboot.initram_partiallylinked.o -o $(obj)/coreboot.initram
The entry point(main) has to end up at byte 0 of coreboot.initram if we want to avoid storing it outside the file.
That doesn't seem to be guaranteed by these commands, indeed.
Segher
On 13.04.2008 02:11, Segher Boessenkool wrote:
Some Linux kernel verification mechanisms have caught compiler bugs in the past. That's why I considered the possibility to do this as well.
The only ones that I am aware of are *generic* sanity checks in the kernel, that manage to trip on something that turned out to be a compiler bug. Adding specific sanity checks for every piece of code that only checks for compiler bugs is foolishness (hint: it's impossible to ever get even 1% cover; and that check code itself is run through that same compiler anyway!)
Well, possibly catching compiler bugs would have been a nice (yet improbable) side effect. The decompression of a large chunk of data is probably taxing memory a bit more than our current simple memtest implementation and checksumming the result may have uncovered bugs in RAM configuration. But we can postpone/skip that.
IMHO, if you want to test RAM, it is better to test RAM explicitly than to hope some simple checksum of <some random data> will catch this. It's also a lot cleaner design, of course.
OK.
Unless you fork gcc, having the entry point at the beginning of initram is not possible. It worked by accident in the past and I'm pretty sure Segher will confirm that we can't rely on that.
No, I confirm you *can* rely on that: just set up your linker script and your crt.o so this works. Something like
.start : { crt0.o(*) }
.text : { *(.text) }
.data : { *(.data) }
will guarantee that all of crt0 ends up before anything else (and there is no need to make it a separate section, FWIW).
Hm. Does that really apply to the initram binary we produce in v3?
I don't know what the current v3 code does. I'm just replying to your "cannot be done, it's GCC's fault", while GCC isn't even in the picture if your code is sane at all.
In doubt, the code does not qualify for your definition of "sane".
AFAIK we don't link any crt0 into that binary.
You have *some* equivalent code, for sure. Some startup code that initialises things to conform to the ABI environment expected by the compiler (initial stack frame, segments, GOT/TOC/PLT/TLS pointers, etc -- that kind of thing).
Ah, that. That startup code ends up in a separate binary.
Here are the commands used to generate coreboot.initram:
$(CC) $(INITCFLAGS) -D_SHARED -fPIE -c -combine $(INITRAM_OBJ) -o $(obj)/coreboot.initram_partiallylinked.o
$(LD) -Ttext 0 --entry main -N -R $(obj)/stage0-prefixed.o $(obj)/coreboot.initram_partiallylinked.o -o $(obj)/coreboot.initram
The entry point(main) has to end up at byte 0 of coreboot.initram if we want to avoid storing it outside the file.
That doesn't seem to be guaranteed by these commands, indeed.
Any way to force the first instruction of main() to be at byte 0 of the final binary when the startup (crt0) code is in a separate binary? That's what I meant with "having the entry point at the beginning of initram". Unless the linker reorders functions (without breaking all the PIC code), this means the code section generated by gcc has to begin with the compiled body of main() and I see no way to ensure that, especially when using the -combine parameter. If that is possible with current (and older) gcc, I'd appreciate a hint how to achieve this.
Regards, Carl-Daniel
The entry point(main) has to end up at byte 0 of coreboot.initram if we want to avoid storing it outside the file.
That doesn't seem to be guaranteed by these commands, indeed.
Any way to force the first instruction of main() to be at byte 0 of the final binary when the startup (crt0) code is in a separate binary?
Even if you do some of the startup administrativa elsewhere, you still have some startup code in your binary -- that is what "main" _is_, if you look at it from a certain angle. Or you can simply do a crt that does nothing more than declare the entry point (symbol _start, usually), and branch to main.
The compiler doesn't care about these things, and it shouldn't. This is linker land, and it's not a hard problem.
That's what I meant with "having the entry point at the beginning of initram". Unless the linker reorders functions (without breaking all the PIC code),
The linker will reorder stuff if you tell it to; and it will put stuff in the order you want, if you tell it to do _that_. Just tell it what you want, don't rely on defaults, and certainly don't rely on things that you aren't guaranteed in the linker documentation.
this means the code section generated by gcc has to begin with the compiled body of main() and I see no way to ensure that, especially when using the -combine parameter. If that is possible with current (and older) gcc, I'd appreciate a hint how to achieve this.
You could put main() in a separate section, and tell the linker to put that section first in the output binary. But there's no need to play games like that.
Segher
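For readers who want to see the "separate section" idea spelled out, here is a minimal sketch in C. It assumes GCC on an ELF target; the section name `.text.entry` and the functions `initram_entry()` and `do_raminit()` are made up for this illustration and are not taken from the v3 tree.

```c
/*
 * Sketch only: put the entry function into its own input section so a
 * linker script can place that section first in the output binary.
 * ".text.entry", initram_entry() and do_raminit() are hypothetical names.
 */
#define ENTRY_SECTION __attribute__((section(".text.entry"), used))

/* An ordinary helper; it lands in the default .text input section. */
static int do_raminit(void)
{
    return 0; /* pretend RAM init succeeded */
}

/* The entry point; the linker script decides where its section goes. */
ENTRY_SECTION int initram_entry(void)
{
    return do_raminit();
}

/*
 * Matching linker script fragment (an assumption, not from the v3 tree):
 *
 *   .text : { *(.text.entry) *(.text) }
 *
 * With that rule, initram_entry() starts at offset 0 of the output .text
 * section, whatever order the compiler emitted the functions in.
 */
```

The point of the sketch is that the ordering is stated explicitly in the linker script, so nothing depends on the order in which gcc happens to emit functions.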
On 12/04/08 17:06 +0200, Carl-Daniel Hailfinger wrote:
SELF is missing a per-section compression algorithm specifier
Does it need one? Either the entire SELF is compressed, or not.
Uh, no. The SELF design says that only some SELF segments may be compressed, others must not be compressed.
Again, LAR does the work. No?
See above. I wouldn't have complained that much if you were right.
I'm going to address this, since it seems to be Carl-Daniel's main sticking point.
One of the minor details of a "chooser" menu is that we need to populate the menu with sane and descriptive names. We could just use the LAR "filename", but that is not very customer friendly. That's where the NAME segment comes from. I think that's pretty clear to everybody.
I expect that when Bayou first comes up in chooser mode, the first thing it is going to do is walk the LAR and locate all the payloads for its menu. While it's doing that, it's also going to read the names for each of the payloads. Forcing the entire SELF to decompress for this simple act costs memory and is needlessly slow. Compressing just the segment is silly (we're talking at most 15 to 20 bytes here), and it forces any payloads that want to examine the SELF files in the LAR to carry a decompression algorithm. I also extended the same behavior to the .notes section, because I deemed that the payload may wish to store configuration information there, which would be silly to copy to memory if we didn't need to.
The data and code segments will be compressed with the LAR compression mechanism, if it happens to be specified (I will go back and correct the wiki page to make that clearer). Yes, that does mean lzma headers per segment. I'm okay with that, I think it's worth the cost.
Jordan
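The menu scan described above could look roughly like the sketch below. It is only an illustration: the descriptor layout, the type numbers and the names `self_segment`, `self_find()` and `example_table` are assumptions made for this example, not the format from the wiki page. It also assumes that the segment table is terminated by an ENTRY segment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type numbers, for illustration only. */
enum self_type {
    SELF_CODE = 1, SELF_DATA, SELF_BSS, SELF_NAME, SELF_NOTES, SELF_ENTRY
};

/* Hypothetical segment descriptor; the real layout may differ. */
struct self_segment {
    uint32_t type;   /* one of enum self_type */
    uint32_t offset; /* offset of the segment data inside the SELF */
    uint32_t len;    /* bytes stored in ROM */
};

/*
 * Walk the segment table until the wanted type or the terminating ENTRY
 * segment shows up. Only the small table is read; no segment data is
 * touched or decompressed.
 */
const struct self_segment *self_find(const struct self_segment *table,
                                     size_t max_segments, uint32_t type)
{
    for (size_t i = 0; i < max_segments; i++) {
        if (table[i].type == type)
            return &table[i];
        if (table[i].type == SELF_ENTRY)
            break; /* end of table */
    }
    return NULL;
}

/* Example table: compressed code, an uncompressed NAME, then ENTRY. */
const struct self_segment example_table[] = {
    { SELF_CODE, 64, 4096 },
    { SELF_NAME, 48, 4 },
    { SELF_ENTRY, 0, 0 },
};
```

A chooser built along these lines would call `self_find(table, n, SELF_NAME)` for each payload and then read `len` bytes at `offset` straight from ROM to get the menu label, without carrying a decompressor just for that.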
On 13.04.2008 00:48, Jordan Crouse wrote:
On 12/04/08 17:06 +0200, Carl-Daniel Hailfinger wrote:
SELF is missing a per-section compression algorithm specifier
Does it need one? Either the entire SELF is compressed, or not.
Uh, no. The SELF design says that only some SELF segments may be compressed, others must not be compressed.
Again, LAR does the work. No?
See above. I wouldn't have complained that much if you were right.
I'm going to address this, since it seems to be Carl-Daniel's main sticking point.
Thanks for changing the wiki page. The compression situation is now a lot more well-defined. Perhaps add something like "this segment has zero length in ROM so compression does not apply" to the segment description for BSS and ENTRY.
One of the minor details of a "chooser" menu is that we need to populate the menu with sane and descriptive names. We could just use the LAR "filename", but that is not very customer friendly. That's where the NAME segment comes from. I think that's pretty clear to everybody.
I expect that when Bayou first comes up in chooser mode, the first thing it is going to do is walk the LAR and locate all the payloads for its menu. While it's doing that, it's also going to read the names for each of the payloads. Forcing the entire SELF to decompress for this simple act costs memory and is needlessly slow. Compressing just the segment is silly (we're talking at most 15 to 20 bytes here), and it forces any payloads that want to examine the SELF files in the LAR to carry a decompression algorithm. I also extended the same behavior to the .notes section, because I deemed that the payload may wish to store configuration information there, which would be silly to copy to memory if we didn't need to.
OK.
The data and code segments will be compressed with the LAR compression mechanism, if it happens to be specified (I will go back and correct the wiki page to make that clearer). Yes, that does mean lzma headers per segment. I'm okay with that, I think it's worth the cost.
Thanks. Can you add another format constraint? Each SELF segment should be aligned to an 8-byte boundary. That will resolve the unaligned access problem (and eliminate another one of my worries).
Regards, Carl-Daniel
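To make the alignment request concrete, here is a small sketch in C of the two usual ways to deal with the problem: rounding offsets up to an 8-byte boundary when the image is built, and reading fields through `memcpy()` when alignment cannot be assumed. The helper names `align8()` and `read_u32()` are invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Round an offset up to the next multiple of 8 (8 is a power of two). */
uint32_t align8(uint32_t offset)
{
    return (offset + 7u) & ~7u;
}

/*
 * Read a 32-bit field in host byte order from an address that may be
 * unaligned. memcpy() lets the compiler pick a safe access sequence, so
 * this also works on CPUs that fault on unaligned loads.
 */
uint32_t read_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

If every segment starts at `align8(previous_end)`, a loader can read the headers with plain loads; `read_u32()` is the fallback for data whose placement is not under the format's control.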
On 12.04.2008 at 07:14, Peter Stuge peter@stuge.se wrote:
On Sat, Apr 12, 2008 at 01:16:00PM +0200, Carl-Daniel Hailfinger wrote:
On 12.04.2008 10:28, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
NAK. This design is so unfinished it is not even funny. Hint: Certain fields in the LAR header are there for a reason (guess which).
Yes, and because that concept is so incredibly broken beyond belief, we fixed it up at the summit. Those completely archive-unrelated fields which just sneaked into lar because we did not do our reviews thoroughly enough will finally go away again. Way to go.
SELF is missing a checksum for each uncompressed segment.
Is that really needed? That means we expect the decompression to fail silently.
If our algorithms are broken, we should fix them and not try to work around that by adding in-memory checksums. A correct input file leads to correct output. If we stop assuming that, the uncompressed copy is the least of our problems.
If LAR provides a reliable transport then why add another checksum?
Absolutely.
SELF is missing a per-section compression algorithm specifier
Does it need one? Either the entire SELF is compressed, or not. Again, LAR does the work. No?
Sorry, you misunderstood. The obvious speed penalty comes from unaligned accesses. Some architectures can't even handle unaligned accesses at all.
Can't this be solved generically in code like in memcpy_helper() ?
Or even in the decompression code
The concept of PIC is missing completely.
Because it is not needed.
So you propose to handle some executable code (bootblock, raminit) just with LAR and not with SELF? How are you going to explain that concept a few years down the road?
KISS. Since the early LAR files are simple binaries they don't need to be SELF.
A LAR parser can't figure out if the archive is corrupt.
Plain wrong. There are checksums in the lar, and that won't change.
Wrong. Checksums say nothing about corrupt internal SELF structure. The LAR parser has the ability (well, at least in theory) to check for overlapping archive members right now. With SELF, a corrupt SELF header
First of all, how would it end up corrupt? It must then be corrupt upon SELF creation, because if it changed in LAR, the LAR checksum would not match anymore.
can reference arbitrary places in the archive and a pure LAR parser can't find that out because it does not parse SELF by design.
Why does this matter, in practice? More than one SELF will never be up in the air. And so what if it references locations in the LAR?
Ron once stated the ability to figure out whether the ROM is corrupt before you flash it is one of the key features of LAR.
And since a pure LAR parser does not understand SELF, you need a combined SELF+LAR parser to know whether the ROM has a chance to boot.
What would be checked in the SELF? Isn't it good enough that there is a SELF with the right name?
I think it is impossible to reliably determine boot success ahead of time. Only trying to boot will show the actual outcome, especially since we want to reflash individual LAR files. I guess I disagree with Ron's statement. LAR is very nice, but it can't predict the future. :)
Instead the coreboot panic room will be the key feature in this domain. The fact that you don't just get beeps from a speaker but an actual way into your system regardless.
LAR is an archiver of arbitrary(!) files, SELF is the container for executable files. Mixing these two up was the biggest mistake in the history of the young LAR, making things irreversible and fragile.
Word.
The bootblock and initram are executable files. They should be contained in SELF as well
I don't think that is necessary.
Tables which should be loaded to a given address in memory are by definition NOT code, yet you have to supply a load address
Hm, what kind of tables? What would this be that is just copied from LAR using some generic code rather than code that knows about the necessary address already?
Unfortunately, the current SELF+LAR proposal is not able to keep the design simple and still perform all tasks we currently use LAR for.
This is where I'm lost. Please explain what would not work anymore?
//Peter
On 12.04.2008 at 04:16, Carl-Daniel Hailfinger <c-d.hailfinger.devel.2006@gmx.net> wrote:
On 12.04.2008 10:28, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
NAK. This design is so unfinished it is not even funny. Hint: Certain fields in the LAR header are there for a reason (guess which).
Yes, and because that concept is so incredibly broken beyond belief, we fixed it up at the summit. Those completely archive-unrelated fields which just sneaked into lar because we did not do our reviews thoroughly enough will finally go away again. Way to go.
SELF is missing a checksum for each uncompressed segment.
Yes, on purpose. Because LAR takes care of that for the whole file. We don't want duplicate information in SELF.
SELF is missing a per-section compression algorithm specifier and if you introduce one...
Short answer: Not needed. The LAR compression type is good enough. Using different compression algorithms within a SELF is a bad idea anyways.
you have a compression algorithm specifier both in LAR and in SELF. If you remove it from LAR, you have no way to store compressed data in a LAR directly. You can work around that by wrapping every file with a SELF header, but then SELF becomes a generic file container and violates your statement that "SELF is the container for executable files".
There's also an obvious speed penalty for SELF (guess why).
The opposite. There's a speed win because we do less work in lar walks.
Sorry, you misunderstood. The obvious speed penalty comes from unaligned accesses. Some architectures can't even handle unaligned accesses at all.
And they will not happen more often than now.
The concept of PIC is missing completely.
Because it is not needed.
So you propose to handle some executable code (bootblock, raminit) just with LAR and not with SELF? How are you going to explain that concept a few years down the road?
Those are blobs. In a few years everyone will be happy we got rid of the current misconception, so no need to explain anything.
A LAR parser can't figure out if the archive is corrupt.
Plain wrong. There are checksums in the lar, and that won't change. Even here you will see a reasonable speed-up because you don't have to load 3 segments to memory before you find the 4th being corrupted.
Wrong. Checksums say nothing about corrupt internal SELF structure. The LAR parser has the ability (well, at least in theory) to check for overlapping archive members right now. With SELF, a corrupt SELF header can reference arbitrary places in the archive and a pure LAR parser can't find that out because it does not parse SELF by design.
Wrong. The SELF header is checksummed, too. If the SELF is not consistent don't load any of it.
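The exchange above is about two different failure modes: a checksum catches bits that changed after the file was built, while a structural check catches a header that was wrong from the start. A check of the second kind might look like the following sketch in C. The names `seg_extent`, `extents_are_sane()`, `good_layout` and `bad_layout` are invented here, and the sketch makes no claim about what the LAR or SELF tools actually do.

```c
#include <stddef.h>
#include <stdint.h>

/* Where one segment's data lives inside the file (hypothetical layout). */
struct seg_extent {
    uint32_t offset;
    uint32_t len;
};

/*
 * Return 1 if every segment lies inside a file of file_len bytes and no
 * two non-empty segments overlap, 0 otherwise.
 */
int extents_are_sane(const struct seg_extent *seg, size_t count,
                     uint32_t file_len)
{
    for (size_t i = 0; i < count; i++) {
        /* Written this way so that offset + len cannot overflow. */
        if (seg[i].offset > file_len ||
            seg[i].len > file_len - seg[i].offset)
            return 0;
        if (seg[i].len == 0)
            continue; /* zero length in ROM, e.g. BSS or ENTRY */
        for (size_t j = 0; j < i; j++) {
            uint32_t a0 = seg[i].offset, a1 = a0 + seg[i].len;
            uint32_t b0 = seg[j].offset, b1 = b0 + seg[j].len;
            if (seg[j].len != 0 && a0 < b1 && b0 < a1)
                return 0; /* the two ranges share at least one byte */
        }
    }
    return 1;
}

/* Example layouts: the first is consistent, the second overlaps. */
const struct seg_extent good_layout[] = { { 0, 16 }, { 16, 8 }, { 24, 0 } };
const struct seg_extent bad_layout[]  = { { 0, 16 }, { 8, 16 } };
```

Whether such a check belongs in the LAR tool, in a combined SELF+LAR tool, or nowhere at all is exactly what the thread is arguing about; the code only shows that the check itself is small.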
Ron once stated the ability to figure out whether the ROM is corrupt before you flash it is one of the key features of LAR. SELF completely destroys that feature.
Not at all - The opposite: It makes lar finally more solid again. The feature of checksumming (could have) worked nicely with the very first version of lar that I wrote. Just the actual checking was not in place as it is a minor issue when you can't even boot.
And since a pure LAR parser does not understand SELF, you need a combined SELF+LAR parser to know whether the ROM has a chance to boot.
Yes and by this simplify the code in all cases. It removes all the special cases that we handle now and instead replaces them with a sane solution.
We might as well kill LAR completely and move to SELF only (and then SELF slowly will become a bad reinvention of LAR).
You are seriously misunderstanding the concepts of LAR and SELF. While LAR is an archiver of arbitrary(!) files, SELF is the container for executable files. Mixing these two up was the biggest mistake in the history of the young LAR, making things irreversible and fragile.
The bootblock and initram are executable files. They should be contained in SELF as well (and that means you should be able to state that the code is XIP). Tables which should be loaded to a given address in memory are by definition NOT code, yet you have to supply a load address (something that will only be available to SELF), so SELF is in no way "the" container for executable files, but a container for files which need at least part of them loaded to a given address.
When writing software, one should think about the problem that is going to be solved. The reason for unpacking segments was to get rid of the ELF loader and being able to streamload code. This has nothing to do with anything I designed LAR for and the fact that the implementation was married into LAR was obviously a foul hack.
I agree that the current LAR implementation and design is a one-size-fits-all compromise.
Just that it doesn't fit, because the concept is broken.
Unfortunately, the current SELF+LAR proposal is not able to keep the design simple and still perform all tasks we currently use LAR for.
Wrong.
I'm all for revising an existing design as long as the new design is better or equal in all aspects. LAR+SELF probably can be beaten into shape, but it won't resemble the current proposal.
Regards, Carl-Daniel
Hi,
sorry for interrupting, but I have a short question, actually two.
2008/4/12, Stefan Reinauer stepan@coresystems.de: [...]
You are seriously misunderstanding the concepts of LAR and SELF. While LAR is an archiver of arbitrary(!) files, SELF is the container for executable files. Mixing these two up was the biggest mistake in the history of the young LAR, making things irreversible and fragile.
For the inexperienced like me: Why would anybody need either LAR or SELF? I kind of understand the reasoning behind LAR, but why would SELF be necessary? Can't you just extract a regular ELF file from a LAR archive and then load it?
The reason for unpacking segments was to get rid of the ELF loader and being able to streamload code.
What do you mean by "streamload code" and why would you want to get rid of the ELF loader?
I'm just interested, I'm not trying to make a point.
Regards,
Phil.
On Saturday, 12.04.2008, at 15:36 +0200, Philip Schulz wrote:
For the inexperienced like me: Why would anybody need either LAR or SELF? I kind of understand the reasoning behind LAR, but why would SELF be necessary? Can't you just extract a regular ELF file from a LAR archive and then load it?
You have to be prepared for the possibility that the most interesting header comes last (after all the data). If that ELF image is compressed, that means that you'll get to decompress everything (to somewhere), then read that header, then copy everything around once more (and deal with overlapping regions between your uncompressed copy and the final ELF layout).
Right now, the ELF parser simply requires that this header is at the beginning of the file, and fails otherwise - not nice, but "sane".
SELF promises proper handling of that, with a parser that is smaller, more simple and more obviously correct than the current ELF parser.
What do you mean by "streamload code" and why would you want to get rid of the ELF loader?
Streamloading code means the capability of reading the image (ELF, SELF, whatever) byte-by-byte, and being able to make sense of that without needing knowledge from information that comes later in the file.
ELF allows for flexibility we don't need, and that we don't want to cope with.
Regards, Patrick Georgi
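A toy example may help to show what "streamloading" buys. The sketch below, in C, consumes an image one byte at a time, as a serial line would deliver it. Because the (invented) 4-byte little-endian length header comes first, every payload byte can be written to its final place as soon as it arrives. The header format and the names `stream_loader`, `stream_feed()` and `stream_load()` are assumptions for this illustration and have nothing to do with the actual SELF layout.

```c
#include <stddef.h>
#include <stdint.h>

enum { HDR_BYTES = 4 }; /* hypothetical header: payload length, little-endian */

struct stream_loader {
    uint8_t *dest;      /* final location of the payload */
    size_t dest_size;   /* room available at dest */
    uint32_t payload;   /* payload length, known once the header is in */
    size_t seen;        /* bytes consumed so far */
};

/* Feed one byte. Returns 1 when complete, 0 if more is needed, -1 on error. */
int stream_feed(struct stream_loader *s, uint8_t byte)
{
    if (s->seen < HDR_BYTES) {
        s->payload |= (uint32_t)byte << (8 * s->seen);
        s->seen++;
        if (s->seen < HDR_BYTES)
            return 0;
        if (s->payload > s->dest_size)
            return -1; /* would not fit */
        return s->payload == 0;
    }
    size_t pos = s->seen - HDR_BYTES;
    if (pos >= s->payload || pos >= s->dest_size)
        return -1; /* more data than announced */
    s->dest[pos] = byte; /* goes straight to its final place */
    s->seen++;
    return pos + 1 == s->payload;
}

/* Push a buffer through the loader byte by byte; returns length or -1. */
int stream_load(const uint8_t *src, size_t n, uint8_t *dest, size_t dest_size)
{
    struct stream_loader s = { dest, dest_size, 0, 0 };
    for (size_t i = 0; i < n; i++) {
        int r = stream_feed(&s, src[i]);
        if (r != 0)
            return r < 0 ? -1 : (int)s.payload;
    }
    return -1; /* stream ended before the payload was complete */
}
```

With a format whose headers may sit at the end of the file, the same loop would have to buffer the whole image somewhere before it could place a single byte, which is the extra copy described above.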
Patrick,
thank you for your quick answer.
Regards,
Phil
On Sat, Apr 12, 2008 at 6:36 AM, Philip Schulz philip.s.schulz@googlemail.com wrote:
Hi,
For the inexperienced like me: Why would anybody need either LAR or SELF? I kind of understand the reasoning behind LAR, but why would SELF be necessary? Can't you just extract a regular ELF file from a LAR archive and then load it?
ELF fails on several fronts. ELF is for execution and linking. It mixes two kinds of things in one file. You would think, with all the junk they put in there, that it would get it right. As we can see, they didn't even get the 64/32 thing right, which is amazing.
But let's suppose you have the ELF for /bin/cat:
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        08048114 000114 000013 00   A  0   0  1
  [ 2] .note.ABI-tag     NOTE            08048128 000128 000020 00   A  0   0  4
  [ 3] .hash             HASH            08048148 000148 00014c 04   A  5   0  4
  [ 4] .gnu.hash         GNU_HASH        08048294 000294 000030 04   A  5   0  4
  [ 5] .dynsym           DYNSYM          080482c4 0002c4 0002c0 10   A  6   1  4
  [ 6] .dynstr           STRTAB          08048584 000584 0001b3 00   A  0   0  1
  [ 7] .gnu.version      VERSYM          08048738 000738 000058 02   A  5   0  2
  [ 8] .gnu.version_r    VERNEED         08048790 000790 000060 00   A  6   1  4
  [ 9] .rel.dyn          REL             080487f0 0007f0 000020 08   A  5   0  4
  [10] .rel.plt          REL             08048810 000810 000138 08   A  5  12  4
  [11] .init             PROGBITS        08048948 000948 000030 00  AX  0   0  4
  [12] .plt              PROGBITS        08048978 000978 000280 04  AX  0   0  4
  [13] .text             PROGBITS        08048c00 000c00 0024c8 00  AX  0   0 16
  [14] .fini             PROGBITS        0804b0c8 0030c8 00001c 00  AX  0   0  4
  [15] .rodata           PROGBITS        0804b100 003100 000baf 00   A  0   0 32
  [16] .eh_frame         PROGBITS        0804bcb0 003cb0 000004 00   A  0   0  4
  [17] .ctors            PROGBITS        0804c000 004000 000008 00  WA  0   0  4
  [18] .dtors            PROGBITS        0804c008 004008 000008 00  WA  0   0  4
  [19] .jcr              PROGBITS        0804c010 004010 000004 00  WA  0   0  4
  [20] .dynamic          DYNAMIC         0804c014 004014 0000d0 08  WA  6   0  4
  [21] .got              PROGBITS        0804c0e4 0040e4 000008 04  WA  0   0  4
  [22] .got.plt          PROGBITS        0804c0ec 0040ec 0000a8 04  WA  0   0  4
  [23] .data             PROGBITS        0804c194 004194 00003c 00  WA  0   0  4
  [24] .bss              NOBITS          0804c1e0 0041d0 000168 00  WA  0   0 32
  [25] .gnu_debuglink    PROGBITS        00000000 0041d0 000008 00      0   0  1
  [26] .shstrtab         STRTAB          00000000 0041d8 0000d1 00      0   0  1
Whew! What's all that? It's stuff you use for linking. Why is it there? Because cat gets linked to things at runtime.
Note the .bss ... it's in the L part, the "Link" part.
What else is in there?
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x08048034 0x08048034 0x000e0 0x000e0 R E 0x4
  INTERP         0x000114 0x08048114 0x08048114 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD           0x000000 0x08048000 0x08048000 0x03cb4 0x03cb4 R E 0x1000
  LOAD           0x004000 0x0804c000 0x0804c000 0x001d0 0x00348 RW  0x1000
  DYNAMIC        0x004014 0x0804c014 0x0804c014 0x000d0 0x000d0 RW  0x4
  NOTE           0x000128 0x08048128 0x08048128 0x00020 0x00020 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
Program headers, which are stuff for when you run. These are images to load to memory.
Note that bss is not in the program headers ... and knowing about bss is pretty important at load time on an embedded system. Note that a.out actually got this right, but who needs a working 40-year-old design when we've got a broken 20-year-old design :-)
Other issues. We had compressed ELF. So I got to watch coreboot uncompress the ELF *somewhere* and then deal with having to memcpy the uncompressed ELF sections *somewhere else*. That's gross. I got a nice speedup when I got rid of ELF in v3 -- it was actually noticeable.
Finally ... people (me included) keep messing up ELF parsers. The GNU BFD tools don't quite get it right, in fact objdump used to coredump on some valid ELF files, which is why we have readelf. I know one large company that totally blew it and implemented a simulator which used section headers and not program headers. We had to modify the Plan 9 linker to generate bogus ELF files for that system. The LinuxBIOS ELF parser was always broken, and we didn't know it: it always assumed the program header section of ELF came first, which it does *almost* all the time. But not always. It can come at the back. You have to read the entire ELF file to find out what's in the file. That's really bad if you have a streaming source of data, e.g. a serial line.
There is just so much wrong with ELF. I was happy to get it out of the ROM BIOS :-)
Good question, thanks for asking, we now return you to the Great Debate :-)
ron
ELF fails on several fronts. Elf is for execution and linking. It mixes two kinds of things in one file. You would think, with all the junk they put in there, that it would get it right. As we can see, they didn't even get the 64/32 thing right, which is amazing.
You make it sound like it is hard to load an ELF file. It's not; it's about ten lines of code (half of that architecture-specific).
Also, you can strip the "linking" "junk" from an ELF file (get rid of all the section headers), and it will still load and run fine.
But let's suppose you have the ELF for /bin/cat:
Section Headers:
  [Nr] Name  Type    Addr     Off    Size   ES Flg Lk Inf Al
  [24] .bss  NOBITS  0804c1e0 0041d0 000168 00  WA  0   0 32
whew! what's all that? It's stuff you use for linking. Why is it there? because cat gets linked to things at runtime.
Note the .bss ... it's in the L part, the "Link" part.
"L"? I don't see an "L"?
Program Headers:
  Type       Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR       0x000034 0x08048034 0x08048034 0x000e0 0x000e0 R E 0x4
  INTERP     0x000114 0x08048114 0x08048114 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD       0x000000 0x08048000 0x08048000 0x03cb4 0x03cb4 R E 0x1000
  LOAD       0x004000 0x0804c000 0x0804c000 0x001d0 0x00348 RW  0x1000
  DYNAMIC    0x004014 0x0804c014 0x0804c014 0x000d0 0x000d0 RW  0x4
  NOTE       0x000128 0x08048128 0x08048128 0x00020 0x00020 R   0x4
  GNU_STACK  0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
program headers, which is stuff for when you run. These are images to load to memory.
Some of those are, yes.
Note that bss is not in the program headers ...
Look more closely, at the second PT_LOAD segment: it has a mem_size bigger than load_size, and that is your bss (and some other zero-initialised data).
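To put numbers on that: in the /bin/cat listing the second LOAD entry has MemSiz 0x348 and FileSiz 0x1d0, so a loader copies 0x1d0 bytes from the file and zero-fills the remaining 0x178 bytes, which covers the 0x168-byte .bss at 0x0804c1e0. The following is only a minimal sketch of such a loader, assuming a 32-bit little-endian ELF; load_segments, mem and mem_base are illustrative names, not coreboot code, and mem merely stands in for physical memory.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments (ELF specification)

def load_segments(elf: bytes, mem: bytearray, mem_base: int) -> int:
    """Copy every PT_LOAD segment of a 32-bit little-endian ELF into mem.

    mem stands in for physical memory starting at address mem_base.
    Returns the entry point address.
    """
    # ELF32 header: e_entry at offset 24, e_phoff at 28,
    # e_phentsize at 42, e_phnum at 44.
    e_entry, e_phoff = struct.unpack_from("<II", elf, 24)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 42)
    for i in range(e_phnum):
        (p_type, p_offset, _vaddr, p_paddr,
         p_filesz, p_memsz, _flags, _align) = struct.unpack_from(
            "<8I", elf, e_phoff + i * e_phentsize)
        if p_type != PT_LOAD:
            continue
        dst = p_paddr - mem_base
        # Bytes present in the file are copied as-is ...
        mem[dst:dst + p_filesz] = elf[p_offset:p_offset + p_filesz]
        # ... and the tail (MemSiz - FileSiz) is the bss: zero it.
        mem[dst + p_filesz:dst + p_memsz] = bytes(p_memsz - p_filesz)
    return e_entry
```

The sketch does no bounds checking and ignores every segment type other than PT_LOAD; a real boot-time loader would have to validate the offsets before trusting them.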
and knowing about bss is pretty important at load time on an embedded system.
For what?
Note that a.out actually got this right, but who needs a working 40-year-old design when we've got a broken 20-year-old design :-)
a.out isn't 2% as flexible as ELF is. It's fine when you don't need that flexibility, but you won't be happy when you do need it and you chose a.out ;-P
Other issues. We had compressed ELF. So I got to watch coreboot uncompress the elf *somewhere* and then deal with having to memcpy the uncompressed elf sections *somewhere else*. That's gross. I got a nice speedup when I got rid of ELF in v3 -- it was actually noticeable.
Yeah. This could be solved some other way, of course.
Finally ... people (me included) keep messing up ELF parsers.
Really :-)
The GNU BFD tools don't quite get it right, in fact objdump used to coredump on some valid ELF files, which is why we have readelf.
That's not the history behind readelf, but sure, objdump is way more fragile than readelf -- but that's not because it does ELF, quite the opposite: it tries to be everything for every binary format!
I know one large company that totally blew it and implemented a simulator which used section headers and not program headers.
I've recently seen something quite like that. Almost made my head explode.
We had to modify the plan 9 linker to generate bogus ELF files for that system. The linuxbios ELF parser was always broken, and we didn't know it: it always assumed the program header section of elf came first, which it does *almost* all the time. But not always. It can come at the back.
You can easily force it to be at the start. Either do it in the linker script, or post-process the ELF.
As far as I can see, SELF saves about 100 bytes over ELF, at the cost of a lot of flexibility (only one code/data/bss/notes section, to start with), and a bunch of important header data is missing, too (endianness, word size, architecture).
Segher
On 13/04/08 01:18 +0200, Segher Boessenkool wrote:
I'm not going to get into the ELF or not debate. Ron feels passionately about it, and we developed the SELF idea with that in mind. I personally don't care if we use ELF or not, but there are certain issues that we need to make sure are covered:
1) We cannot depend on the payload writer to do the right thing. All we can ask for is an ELF file. If there is post-processing work that needs to be done, we need to do it ourselves.
2) The ELF loader needs to be simple, fast, small, and licensed with BSD so it can be included in libpayload.
3) We need an equivalent solution for the NAME segment in SELF - the payload chooser must be able to get a human-friendly name for each payload without decompressing the entire payload into memory.
Right now, SELF satisfies all these requirements.
As far as I can see, SELF saves about 100 bytes over ELF, at the cost of a lot of flexibility (only one code/data/bss/notes section, to start with), and a bunch of important header data is missing, too (endianness, word size, architecture).
You can have as many code/data/bss/notes sections as you wish. The only requirement is that there be only one ENTRY segment, which is fine, because that's exactly how many entry points we have.
We already know what endianism, word size and architecture we're going to run on - SELF is not intended to be portable. It is constructed when the LAR is, and is married to that LAR and that LAR only. And before somebody says something, yes - this is not fool-proof. Somebody will no doubt manage to screw it up and get the wrong SELF on the wrong architecture. But I don't like over-architecting to protect fools. Worrying about specifying the architecture here is the computer science equivalent of the "Caution, coffee is hot" warning.
Jordan
On Sat, Apr 12, 2008 at 07:15:52PM -0600, Jordan Crouse wrote:
We already know what endianism, word size and architecture we're going to run on - SELF is not intended to be portable.
Fair enough. It's not a multi-format binary. (Maybe someone will support the fat files in Darwin at some point? :)
It is constructed when the LAR is,
Ok.
and is married to that LAR and that LAR only.
Mh, I don't know about this. LAR is an archiver; I will want to reuse the SELF.
Worrying about specifying the architecture here is the computer science equivalent of the "Caution, coffee is hot" warning.
I think of it more like a "coffee" label on a container with dark liquid. The liquid could also be a much less tasty toxin that I do not want to drink by accident. Especially if found in a chemistry lab.
I kind of like being able to reject a SELF before execution, if the contents are known not to be what I am looking for.
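For comparison, this is roughly what such a pre-execution check looks like when the header does carry word size, endianness and architecture, as an ELF header does. It is a sketch only: is_acceptable is a made-up name, the defaults assume a 32-bit little-endian x86 target, and the constants are the values given in the ELF specification.

```python
import struct

ELFCLASS32, ELFDATA2LSB, EM_386 = 1, 1, 3  # values from the ELF specification

def is_acceptable(image: bytes, ei_class=ELFCLASS32,
                  ei_data=ELFDATA2LSB, machine=EM_386) -> bool:
    """Return True only if the ELF header matches the expected target."""
    if len(image) < 20 or image[:4] != b"\x7fELF":
        return False
    # e_ident[4] is the word size class, e_ident[5] the byte order.
    if image[4] != ei_class or image[5] != ei_data:
        return False
    # e_machine sits at offset 18; its byte order follows e_ident[5].
    fmt = "<H" if image[5] == ELFDATA2LSB else ">H"
    (e_machine,) = struct.unpack_from(fmt, image, 18)
    return e_machine == machine
```

The whole check reads 20 bytes, so it can run before anything is decompressed or copied.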
//Peter
On 13/04/08 03:34 +0200, Peter Stuge wrote:
On Sat, Apr 12, 2008 at 07:15:52PM -0600, Jordan Crouse wrote:
We already know what endianism, word size and architecture we're going to run on - SELF is not intended to be portable.
Fair enough. It's not a multi-format binary. (Maybe someone will support the fat files in Darwin at some point? :)
It is constructed when the LAR is,
Ok.
and is married to that LAR and that LAR only.
Mh, I don't know about this. LAR is an archiver; I will want to reuse the SELF.
LAR _used_ to be an archiver. That model fell apart some time ago. Now it's more of an archiver-inspired file format.
If extracting and reusing the bits is your goal, then neither SELF nor the current status quo are going to work out, and we need to go back to ELF post haste.
Jordan
On Sat, Apr 12, 2008 at 09:54:39PM -0600, Jordan Crouse wrote:
and is married to that LAR and that LAR only.
Mh, I don't know about this. LAR is an archiver;
I just realized I contradicted this in my last post, by wanting lar to also clean up binaries. :)
I will want to reuse the SELF.
If extracting and reusing the bits is your goal, then neither SELF nor the current status quo are going to work out, and we need to go back to ELF post haste.
I want to reuse within another LAR file. Sorry, should have been more clear. I don't want to go all the way back to original file.
//Peter
I have started and deleted several emails on this ELF issue.
I will leave it at this: we've had one problem after another with ldscripts, ELF creation, ELF parsing, and so on over the years. It is hard to explain to users that we have to work out a problem and find out that THIS version of ubuntu has THAT version of gcc which uses THIS version of gld which, when used with a particular option in an ldscript, interacts with a bug in our boot-time ELF parsing code and causes someone to be not able to boot. "but it's worked for five years ..."
What's different about our world vs. the linux world: if this stuff happens in the linux world recovery is pretty easy. In our world? Well, it can be very hard. We require a higher level of certainty, and that in turn (in my view) means a higher level of simplicity, and fewer options. And, while ELF seems simple, in practice, it doesn't seem to work out to be simple. Why? I don't know.
It gets really old to have to deal with this stuff every year, when my average time to forget all the previous issues is 9 weeks :-)
And it's even more fun when you do trivial things with gld, watch it dump core, go to the mailing list, and find out the problem has existed for four years and no one knows how to fix it, even though they've tried.
So I want our usage of ld to be dead simple, about what the usage is for /bin/date. I don't want to count on ld to place things at certain offsets in files. It may work this week, and it will probably fail next year. The past 8 years of experience says so.
My bss problem on v3 with filo came about when the bss was not properly placed into a loadable segment. I have no remembrance of what combination of gld, rubber bands, paper clips, and baling wire caused that problem. I don't even care any more. I just don't want it to happen again :-)
So, what can I say? I think everything Segher says is probably right, he's smarter than I am :-) That said, it's never worked out that well for me :-)
off to dinner. I am going to watch this discussion from afar, and stay away from the keyboard from now on :-) I think you guys will pick out the best options.
Have a good rest of weekend, everyone!
ron
I will leave it at this: we've had one problem after another with ldscripts, ELF creation, ELF parsing, and so on over the years.
SELF will only attack the "ELF parsing" part. My point is that you can solve this *within* the ELF framework as well. And my fears are that a new, not yet stable format will not help anyone.
It is hard to explain to users that we have to work out a problem and find out that THIS version of ubuntu has THAT version of gcc which uses THIS version of gld which, when used with a particular option in an ldscript, interacts with a bug in our boot-time ELF parsing code and causes someone to be not able to boot. "but it's worked for five years ..."
Yeah, you've had your fair share (or more than that!) of such problems, and so have I. The GNU binutils are used almost exclusively for hosted applications, which are a bit of a monoculture. This results in it being a bit under-tested for "more interesting" applications (like coreboot).
The four steps I use for making this problem manageable are:
0) Use a linker script, _always_;
1) Never rely on other defaults, either;
2) Keep it simple. It is better to do a few simple build steps than to do one big huge complicated ld invocation, esp. since those are hard to debug (or understand, even!).
3) Attack the variance in build system versions in your build scripts, not in the code, wherever possible.
What's different about our world vs. the linux world: if this stuff happens in the linux world recovery is pretty easy. In our world? Well, it can be very hard. We require a higher level of certainty, and that in turn (in my view) means a higher level of simplicity, and fewer options.
Yes.
And, while ELF seems simple, in practice, it doesn't seem to work out to be simple. Why? I don't know.
I wouldn't blame this on ELF, but on the compiler/binutils used. It already becomes a *lot* better when you build using a cross toolchain instead of a native hosted toolchain, but asking users to install one of those is a large inconvenience. Besides, it's easy to mess that up, as well.
And it's even more fun when you do trivial things with gld, watch it dump core, go to the mailing list, and find out the problem has existed for four years and no one knows how to fix it, even though they've tried.
Yeah.
So I want our usage of ld to be dead simple, about what the usage is for /bin/date.
Bad example, date (and esp. GNU date) is very complicated :-)
I don't want to count on ld to place things at certain offsets in files. It may work this week, and it will probably fail next year. The past 8 years of experience says so.
Yes. "Never rely on defaults, esp. if those aren't documented defaults".
My bss problem on v3 with filo came about when the bss was not properly placed into a loadable segment.
Bad linker script. I know, easier to say than to fix :-)
I have no remembrance of what combination of gld, rubber bands, paper clips, and baling wire caused that problem. I don't even care any more. I just don't want it to happen again :-)
I second that sentiment. It's not realistic, of course, but hey.
off to dinner. I am going to watch this discussion from afar, and stay away from the keyboard from now on :-) I think you guys will pick out the best options.
My current thinking is that since you will have an intermediate ELF file anyway, that you will transform into a SELF file (which has some nice properties), that it would be easier and way more future-proof to transform that intermediate ELF file into a final ELF file (FELF?) with those same nice properties (junk removed, PHDRs and notes at the start -- did I forget any?)
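A rough sketch of what such an ELF-to-ELF post-processor could look like, assuming a 32-bit little-endian, statically linked payload. The name repack and the choice to keep only LOAD and NOTE segments are illustrative assumptions, not an agreed design.

```python
import struct

EHDR = struct.Struct("<16sHHIIIIIHHHHHH")  # ELF32 file header, 52 bytes
PHDR = struct.Struct("<8I")                # ELF32 program header, 32 bytes
PT_LOAD, PT_NOTE = 1, 4

def repack(elf: bytes) -> bytes:
    """Rewrite a 32-bit little-endian ELF so it can be read front to back."""
    eh = list(EHDR.unpack_from(elf, 0))
    e_phoff, e_phentsize, e_phnum = eh[5], eh[9], eh[10]
    segs = []
    for i in range(e_phnum):
        ph = list(PHDR.unpack_from(elf, e_phoff + i * e_phentsize))
        if ph[0] in (PT_LOAD, PT_NOTE):        # drop INTERP, DYNAMIC, ...
            segs.append((ph, elf[ph[1]:ph[1] + ph[4]]))
    # Notes first, so a chooser can read them before any code or data.
    segs.sort(key=lambda s: s[0][0] != PT_NOTE)
    offset = EHDR.size + PHDR.size * len(segs)
    headers, bodies = b"", b""
    for ph, data in segs:
        ph[1] = offset                          # new p_offset
        headers += PHDR.pack(*ph)
        bodies += data
        offset += len(data)
    eh[5] = EHDR.size                           # e_phoff: right after the header
    eh[6] = 0                                   # e_shoff: no section table
    eh[8], eh[9], eh[10] = EHDR.size, PHDR.size, len(segs)
    eh[11] = eh[12] = eh[13] = 0                # e_shentsize, e_shnum, e_shstrndx
    return EHDR.pack(*eh) + headers + bodies
```

The output is still an ELF file, so readelf and friends keep working on it. The sketch packs segment data back to back, which is fine for a loader that copies bytes but gives up the offset/alignment relationship an mmap-based loader expects.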
Have a good rest of weekend, everyone!
You too!
Segher
On Sun, Apr 13, 2008 at 04:39:24AM +0200, Segher Boessenkool wrote:
My current thinking is that since you will have an intermediate ELF file anyway, that you will transform into a SELF file (which has some nice properties), that it would be easier and way more future-proof to transform that intermediate ELF file into a final ELF file (FELF?) with those same nice properties (junk removed, PHDRs and notes at the start -- did I forget any?)
I had the same thought.
There is one rather big disadvantage however, that comes from *almost* supporting a given format. There is some value in having a new, explicit, file format even if it is just a subset of another, existing, file format.
The benefit is in usability and principle of least surprise. If coreboot supports loading certain ELF files, it will be surprising to (some) users if it does not support loading all ELF files.
For the same reason we may also want another name less similar to ELF, to clearly show that it is not an ELF file.
//Peter
My current thinking is that since you will have an intermediate ELF file anyway, that you will transform into a SELF file (which has some nice properties), that it would be easier and way more future-proof to transform that intermediate ELF file into a final ELF file (FELF?) with those same nice properties (junk removed, PHDRs and notes at the start -- did I forget any?)
I had the same thought.
There is one rather big disadvantage however, that comes from *almost* supporting a given format. There is some value in having a new, explicit, file format even if it is just a subset of another, existing, file format.
OTOH, the disadvantages of a new format are many. You'll have to write new tools for dealing with them. Those tools will have bugs. The format *itself* will have bugs/shortcomings for quite a while, too.
The benefit is in usability and principle of least surprise. If coreboot supports loading certain ELF files, it will be surprising to (some) users if it does not support loading all ELF files.
This can be dealt with in the coreboot build system easily enough: just make sure to run the post-processor on all payload files. That's what build scripts are for: to make building easy, to make it easy to build the right way, and to make it hard to build the wrong way.
The ELF loader in coreboot can easily detect this, too (it can check for a coreboot name segment, for example).
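As an illustration of that check, a chooser or loader could walk the PT_NOTE segment and look for a note carrying the payload name. The sketch below assumes a 32-bit little-endian ELF; the owner string "coreboot" and note type 1 are hypothetical placeholders, since no such note has been defined.

```python
import struct

PT_NOTE = 4
NOTE_OWNER = b"coreboot\0"   # hypothetical owner string, not an agreed value
NOTE_TYPE_NAME = 1           # hypothetical note type carrying the payload name

def payload_name(elf: bytes):
    """Return the payload name from a 32-bit little-endian ELF, or None."""
    e_phoff, = struct.unpack_from("<I", elf, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 42)
    for i in range(e_phnum):
        p_type, p_offset, _, _, p_filesz = struct.unpack_from(
            "<5I", elf, e_phoff + i * e_phentsize)
        if p_type != PT_NOTE:
            continue
        pos, end = p_offset, p_offset + p_filesz
        while pos + 12 <= end:
            # Each note: namesz, descsz, type, then name and desc.
            namesz, descsz, ntype = struct.unpack_from("<3I", elf, pos)
            name_at = pos + 12
            desc_at = name_at + (namesz + 3) // 4 * 4   # fields are 4-byte padded
            owner = elf[name_at:name_at + namesz]
            if owner == NOTE_OWNER and ntype == NOTE_TYPE_NAME:
                return elf[desc_at:desc_at + descsz].rstrip(b"\0").decode("ascii")
            pos = desc_at + (descsz + 3) // 4 * 4
    return None
```

A file without the note returns None, which a loader could treat as "not post-processed for coreboot". Reading the name this way only avoids decompressing the payload if the headers and notes are stored uncompressed at the front of the file.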
Segher
On Sun, 2008-04-13 at 05:05 +0200, Segher Boessenkool wrote:
This can be dealt with in the coreboot build system easily enough: just make sure to run the post-processor on all payload files. That's what build scripts are for: to make building easy, to make it easy to build the right way, and to make it hard to build the wrong way.
The ELF loader in coreboot can easily detect this, too (it can check for a coreboot name segment, for example).
I'm also reminded how many flashable appliances (dlink routers, olevia TVs) have a flash tool that's able to recognize their own images, verify them before flashing for integrity (md5/crc/length), verify them for compatibility "sir this binary is for your router, I'm not letting you wreck your TV".
Is ELF up to that?
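Whatever the container format, the kind of check described here can be sketched as a small trailer appended to the image. Everything about the layout below (16-byte board id, length, CRC32) is a hypothetical illustration, not something flashrom, lar or any appliance vendor defines.

```python
import struct
import zlib

TRAILER = struct.Struct("<16sII")   # hypothetical layout: board id, length, CRC32

def seal(image: bytes, board: str) -> bytes:
    """Append a trailer describing the image."""
    return image + TRAILER.pack(board.encode("ascii"), len(image),
                                zlib.crc32(image))

def verify(blob: bytes, board: str) -> bytes:
    """Return the bare image, or raise ValueError if it must not be flashed."""
    if len(blob) < TRAILER.size:
        raise ValueError("too short to carry a trailer")
    image, trailer = blob[:-TRAILER.size], blob[-TRAILER.size:]
    ident, length, crc = TRAILER.unpack(trailer)
    if ident.rstrip(b"\0").decode("ascii", "replace") != board:
        raise ValueError("image is for a different board")
    if length != len(image) or crc != zlib.crc32(image):
        raise ValueError("image is truncated or corrupt")
    return image
```

A CRC only catches accidental corruption; it says nothing about who produced the image.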
Hello! I agree! Now a question about this phrase: "I'm also reminded how many flashable appliances (dlink routers, olevia TVs) have a flash tool that's able to recognize their own images, verify them before flashing for integrity (md5/crc/length), verify them for compatibility "sir this binary is for your router, I'm not letting you wreck your TV"." Whose TV device is that one?
Now I am aware of one device that can both display live TV and play stored streaming video, and it is Linux based, but I don't think that's the one you mean. -- Gregg C Levine hansolofalcon@worldnet.att.net "The Force will be with you always." Obi-Wan Kenobi
-----Original Message----- From: coreboot-bounces@coreboot.org [mailto:coreboot-bounces@coreboot.org] On Behalf Of Jeremy Jackson Sent: Saturday, April 12, 2008 11:15 PM To: Coreboot Subject: Re: [coreboot] SELF/ELF/LAR - add some flashrom?
On Sun, 2008-04-13 at 05:05 +0200, Segher Boessenkool wrote:
This can be dealt with in the coreboot build system easily enough: just make sure to run the post-processor on all payload files. That's what build scripts are for: to make building easy, to make it easy to build the right way, and to make it hard to build the wrong way.
The ELF loader in coreboot can easily detect this, too (it can check for a coreboot name segment, for example).
I'm also reminded how many flashable appliances (dlink routers, olevia TVs) have a flash tool that's able to recognize their own images, verify them before flashing for integrity (md5/crc/length), verify them for compatibility "sir this binary is for your router, I'm not letting you wreck your TV".
Is ELF up to that?
-- Jeremy Jackson Coplanar Networks (519)489-4903 http://www.coplanar.net jerj@coplanar.net
On Sun, Apr 13, 2008 at 05:05:29AM +0200, Segher Boessenkool wrote:
OTOH, the disadvantages of a new format are many. You'll have to write new tools for dealing with them. Those tools will have bugs.
Hopefully not so many if the new format is simple. In theory. :p
The format *itself* will have bugs/shortcomings for quite a while, too.
Well, it will stabilize eventually.
If coreboot supports loading certain ELF files, it will be surprising to (some) users if it does not support loading all ELF files.
This can be dealt with in the coreboot build system easily enough: just make sure to run the post-processor on all payload files.
Unfortunately that isn't the only way payloads come to coreboot.
lar will eventually be used to replace only the payload in a flash chip.
lar can already be used to do so in a .rom file.
The ELF loader in coreboot can easily detect this, too (it can check for a coreboot name segment, for example).
Yes.
I actually don't care what internal representation we use for complex binaries in LARs, as long as the lar utility can deal with common ELF files, and maybe a few other binary formats too!
I don't want a separate preprocessor. We are reinventing mkelfImage.
//Peter
OTOH, the disadvantages of a new format are many. You'll have to write new tools for dealing with them. Those tools will have bugs.
Hopefully not so many if the new format is simple. In theory. :p
Sure, but you know that there *will* be bugs, even if not all that many.
The format *itself* will have bugs/shortcomings for quite a while, too.
Well, it will stabilize eventually.
Eventually.
If coreboot supports loading certain ELF files, it will be surprising to (some) users if it does not support loading all ELF files.
This can be dealt with in the coreboot build system easily enough: just make sure to run the post-processor on all payload files.
Unfortunately that isn't the only way payloads come to coreboot.
lar will eventually be used to replace only the payload in a flash chip.
lar can already be used to do so in a .rom file.
I consider LAR to be part of the build system. We can fight over terminology all day ;-P
Either way, you have the *exact same* problem here with SELF as you would have with an ELF-post-processed-to-ELF. The only difference between SELF and ELF here is that by using an appropriate linker script, the ELF case doesn't even need post-processing. Maybe libpayload can provide such a linker script, dunno.
I actually don't care what internal representation we use for complex binaries in LARs, as long as the lar utility can deal with common ELF files, and maybe a few other binary formats too!
LAR shouldn't care *at all*, it should simply be a file container, like a cpio or an ar or a tar archive. Trying to make it do more doesn't really provide any benefits, but it does create huge headaches. It's a maintenance nightmare. In short, it's a layering violation.
Btw, the reason LAR was invented, instead of using one of the standard archive formats, is that none of those standard formats provided the needed feature set. SELF on the other hand seems to have no real benefits over ELF (or pretty much any other existing binary format).
I don't want a separate preprocessor. We are reinventing mkelfImage.
Not at all: we are defining a standard linking script / standard startup objects / standard libraries (libpayload!) here, like any other execution environment has. This makes things *simpler*, not harder. Or do you want every payload to redo all this on its own, introducing the same bugs over and over?
I haven't looked at what libpayload provides today, so sorry if this suggestion is old hat: one way to tackle these "we want to build payloads outside of coreboot" issues is to provide a "corepayload" script that will be called like
corepayload compile xxx.c -o xxx.o
corepayload link xxx.o yyy.o -o xyz
or similar. This script would automatically get all ELF (or SELF) trickery right, and all other build snafus for that matter.
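A minimal sketch of how such a wrapper might map the two sub-commands onto explicit tool invocations. The install prefix and linker script path are hypothetical, and the flag set is only an assumption about what a freestanding 32-bit x86 payload build would need; the flags themselves are ordinary gcc and GNU ld options.

```python
# Hypothetical locations; a real libpayload install would define its own.
LIBPAYLOAD_DIR = "/opt/libpayload"
LINKER_SCRIPT = LIBPAYLOAD_DIR + "/lib/payload.ld"

def corepayload(argv):
    """Translate 'compile'/'link' requests into explicit tool command lines."""
    mode, rest = argv[0], list(argv[1:])
    if mode == "compile":
        # Freestanding 32-bit build against the libpayload headers only.
        return ["gcc", "-m32", "-ffreestanding", "-nostdinc",
                "-I", LIBPAYLOAD_DIR + "/include", "-c"] + rest
    if mode == "link":
        # Always pass an explicit linker script; never rely on ld defaults.
        return ["ld", "-m", "elf_i386", "-nostdlib", "-T", LINKER_SCRIPT] + rest
    raise ValueError("unknown mode: " + mode)
```

Here corepayload(["link", "xxx.o", "yyy.o", "-o", "xyz"]) returns the argument list for ld, which a real script would then execute. Keeping the linker script in one place means every payload gets the same segment layout, instead of each payload author rediscovering it.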
Comments?
Segher
you know, I am realizing: we can debate this for the next 10 years. How about a Code Contest? People develop code and take it all the way into qemu, we beat on each other's ideas, and the best code wins. We can argue forever. Why don't we just put the code out there and see which one works best?
ron
On 13/04/08 20:54 -0700, ron minnich wrote:
you know, I am realizing: we can debate this for the next 10 years. How about a Code Contest? People develop code and take it all the way into qemu, we beat on each other's ideas, and the best code wins. We can argue forever. Why don't we just put the code out there and see which one works best?
Unfortunately, that would mean that half the work would be wasted. Code is never great on the first try, and we could spend months tweaking our respective implementations just to see them fly out the window.
We are all professionals - we can come to a consensus. Right now, we're trying to figure out if we can make ELF perform the way we need it to perform. If we can be satisfied that it can, then we move on. If we cannot be satisfied, then the core team will have to make a decision to move on to another alternative, and then we can debate the details of SELF or something similar. But I think we can afford a week or two to let this all go back and forth - we don't need to get v3 out the door next week.
These are critical ideas, and broad consensus is important. And I think we are on the right track.
Jordan
you know, I am realizing: we can debate this for the next 10 years. How about a Code Contest? People develop code and take it all the way into qemu, we beat on each other's ideas, and the best code wins. We can argue forever. Why don't we just put the code out there and see which one works best?
Unfortunately, that would mean that half the work would be wasted. Code is never great on the first try, and we could spend months tweaking our respective implementations just to see them fly out the window.
Code isn't interesting anyway; design / architecture is the problem here.
We are all professionals - we can come to a consensus. Right now, we're trying to figure out if we can make ELF perform the way we need it to perform. If we can be satisfied that it can, then we move on. If we cannot be satisfied, then the core team will have to make a decision to move on to another alternative, and then we can debate the details of SELF or something similar.
I agree.
So, what exactly *do* we need here, and *why*?
But I think we can afford a week or two to let this all go back and forth - we don't need to get v3 out the door next week.
These are critical ideas, and broad consensus is important. And I think we are on the right track.
Nice to hear :-)
Segher
Code isn't interesting anyway; design / architecture is the problem here.
Agreed.
We are all professionals - we can come to a consensus. Right now, we're trying to figure out if we can make ELF perform the way we need it to perform. If we can be satisfied that it can, then we move on. If we cannot be satisfied, then the core team will have to make a decision to move on to another alternative, and then we can debate the details of SELF or something similar.
I agree.
So, what exactly *do* we need here, and *why*?
I think a discussion of where we are and where we want to go would be helpful. Feel free to correct my status summary.
Right now:
1. LAR format supports inclusion of:
   a. bootblock
      1. Can be copied out, but only re-used in same size ROM file
   b. ELF segments (multiple segments with one entry point, from an ELF)
      1. Can't be extracted without losing entry point
   c. arbitrary blobs
2. Coreboot build process uses the ELF segment idea for
   a. initram, stage2
I understand that there is resistance against breaking up ELF files for payloads. Does this extend to initram and stage2? Is there another way for the build process?
I'm personally against defining a new file format. I think if we're already going to be parsing ELF in LAR, another format is just another place to introduce bugs.
On extraction: It seems to me that very few end users will want to extract payloads from a ROM. That seems like a task better suited for developers. End users would more likely insert an ELF.
Where we want to go:
I'm unclear on this, because I don't understand the resistance to ELF segments in the LAR.
Thanks, Myles
On Tue, Apr 15, 2008 at 10:14:47AM -0600, Myles Watson wrote:
On extraction: It seems to me that very few end users will want to extract payloads from a ROM. That seems like a task better suited for developers.
I think this will be much more popular than one would first believe.
End users would more likely insert an ELF.
Yes, definitely, which is why I think it is important to not need an extra tool/step.
I'm unclear on this, because I don't understand the resistance to ELF segments in the LAR.
I guess because we've produced ELF files of such inconsistent quality.
//Peter
-----Original Message----- From: coreboot-bounces@coreboot.org [mailto:coreboot-bounces@coreboot.org] On Behalf Of Peter Stuge Sent: Tuesday, April 15, 2008 2:50 PM To: coreboot@coreboot.org Subject: Re: [coreboot] SELF/ELF/LAR
On Tue, Apr 15, 2008 at 10:14:47AM -0600, Myles Watson wrote:
On extraction: It seems to me that very few end users will want to extract payloads from a ROM. That seems like a task better suited for developers.
I think this will be much more popular than one would first believe.
End users would more likely insert an ELF.
Yes, definitely, which is why I think it is important to not need an extra tool/step.
I'm unclear on this, because I don't understand the resistance to ELF segments in the LAR.
I guess because we've produced ELF files of such inconsistent quality.
That's where my confusion comes in. Bad ELF files which are parsed and added to the LAR cause failures at build time. Bad ELF files inserted without being parsed cause failures at boot time.
I prefer Build-time failures to Boot-time failures.
Myles
On 15/04/08 15:20 -0600, Myles Watson wrote:
-----Original Message----- From: coreboot-bounces@coreboot.org [mailto:coreboot-bounces@coreboot.org] On Behalf Of Peter Stuge Sent: Tuesday, April 15, 2008 2:50 PM To: coreboot@coreboot.org Subject: Re: [coreboot] SELF/ELF/LAR
On Tue, Apr 15, 2008 at 10:14:47AM -0600, Myles Watson wrote:
On extraction: It seems to me that very few end users will want to extract payloads from a ROM. That seems like a task better suited for developers.
I think this will be much more popular than one would first believe.
End users would more likely insert an ELF.
Yes, definitely, which is why I think it is important to not need an extra tool/step.
I'm unclear on this, because I don't understand the resistance to ELF segments in the LAR.
I guess because we've produced ELF files of such inconsistent quality.
That's where my confusion comes in. Bad ELF files which are parsed and added to the LAR cause failures at build time. Bad ELF files inserted without being parsed cause failures at boot time.
I prefer Build-time failures to Boot-time failures.
There is no doubt that some parsing has to happen. It is highly unlikely that any arbitrary ELF will be optimized for coreboot. The question isn't that it will be parsed, but rather what will the end result of said parse be? The main problem I have with the current scheme is that when we move to three or four payloads, I think that managing the segments and walking the lar will become very costly.
I would rather see us go back to a single LAR file per payload. That's not to say that LAR file wouldn't be pre-processed.
Jordan
That's where my confusion comes in. Bad ELF files which are parsed and added to the LAR cause failures at build time. Bad ELF files inserted without being parsed cause failures at boot time.
I prefer Build-time failures to Boot-time failures.
There is no doubt that some parsing has to happen. It is highly unlikely that any arbitrary ELF will be optimized for coreboot.
I'm not sure everyone agrees with that statement yet.
The question isn't that it will be parsed, but rather what will the end result of said parse be? The main problem I have with the current scheme is that when we move to three or four payloads, I think that managing the segments and walking the lar will become very costly.
I would rather see us go back to a single LAR file per payload. That's not to say that LAR file wouldn't be pre-processed.
It's usually 3 segments per ELF, right? How many payloads are we expecting each ROM file to have? How much time are we talking about?
Myles
On 16/04/08 07:12 -0600, Myles Watson wrote:
That's where my confusion comes in. Bad ELF files which are parsed and added to the LAR cause failures at build time. Bad ELF files inserted without being parsed cause failures at boot time.
I prefer Build-time failures to Boot-time failures.
There is no doubt that some parsing has to happen. It is highly unlikely that any arbitrary ELF will be optimized for coreboot.
I'm not sure everyone agrees with that statement yet.
Okay, let me put it another way. It's not a guarantee. We might recommend that you use something like libpayload that has a clue, but it is by no means mandatory. So we have to account for the worst case scenario.
The question isn't that it will be parsed, but rather what will the end result of said parse be? The main problem I have with the current scheme is that when we move to three or four payloads, I think that managing the segments and walking the lar will become very costly.
I would rather see us go back to a single LAR file per payload. That's not to say that LAR file wouldn't be pre-processed.
It's usually 3 segments per ELF, right? How many payloads are we expecting each ROM file to have? How much time are we talking about?
Again, there is no guarantee. If we embed the name into the .notes section, we have at least 4 segments - possibly even more depending on the complexity of the payload (honestly, most of our payloads today are pretty simple). I would expect that in a typical multiple payload scenario, we would have at least three payloads - the chooser (bayou), and two payloads to choose from.
Jordan
-----Original Message----- From: Jordan Crouse [mailto:jordan.crouse@amd.com] Sent: Wednesday, April 16, 2008 9:55 AM To: Myles Watson Cc: coreboot@coreboot.org Subject: Re: SELF/ELF/LAR
On 16/04/08 07:12 -0600, Myles Watson wrote:
That's where my confusion comes in. Bad ELF files which are parsed and added to the LAR cause failures at build time. Bad ELF files inserted without being parsed cause failures at boot time.
I prefer Build-time failures to Boot-time failures.
There is no doubt that some parsing has to happen. It is highly unlikely that any arbitrary ELF will be optimized for coreboot.
I'm not sure everyone agrees with that statement yet.
Okay, let me put it another way. It's not a guarantee. We might recommend that you use something like libpayload that has a clue, but it is by no means mandatory. So we have to account for the worst case scenario.
The question isn't that it will be parsed, but rather what will the end result of said parse be? The main problem I have with the current scheme is that when we move to three or four payloads, I think that managing the segments and walking the lar will become very costly.
I would rather see us go back to a single LAR file per payload. That's not to say that LAR file wouldn't be pre-processed.
It's usually 3 segments per ELF, right? How many payloads are we expecting each ROM file to have? How much time are we talking about?
Again, there is no guarantee. If we embed the name into the .notes section, we have at least 4 segments - possibly even more depending on the complexity of the payload (honestly, most of our payloads today are pretty simple). I would expect that in a typical multiple payload scenario, we would have at least three payloads - the chooser (bayou), and two payloads to choose from.
Would an extra field in LAR that points to the end of that payload's segments, and a name field suffice? I guess that becomes a small wrapper quickly. What if we kept what we have and added a wrapper (containing a name and a size) at every path separator?
I like the current ability to compress payload segments based on the best compression for that segment. It seems like we lose that if we put ELF files (or a simple alternative) into the LAR without parsing them.
Myles
Am Mittwoch, den 16.04.2008, 10:02 -0600 schrieb Myles Watson:
I like the current ability to compress payload segments based on the best compression for that segment. It seems like we lose that if we put ELF files (or a simple alternative) into the LAR without parsing them.
The only reason for that so far was a replacement for the BSS section, which exists in ELF.
How about teaching LAR to have chunks, like:
LAR header (file is 40000 bytes compressed, 160000 uncomp., 3 chunks)
chunk header 1 (200 bytes, compression type 1)
data (eg. ELF headers)
chunk header 2 (39000 bytes, compression type 2)
more data (most sections in ELF)
chunk header 3 (800 bytes, compression type 3)
some more data
chunk headers could be written only if there's more than one chunk, making this backward compatible.
It would be the duty of lar (the tool) to make sense of an ELF file to define a sensible chunk layout for it.
With our hypothetical manifest file for LAR file handling, that could even be saved (and created by third party tools, for other formats), but the important property of this is that extracting and readding the file doesn't kill important data (eg. load and entry points), while there's still a 1:1 relation between payload and files in lar, but more flexibility.
Regards, Patrick
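The chunk layout proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the field widths, the names `FILE_HDR`, `CHUNK_HDR`, `pack_chunks` and `iter_chunks`, and the numbering of compression types are all made up, and zlib merely stands in for whatever compressors LAR would actually number. For simplicity the sketch always writes chunk headers, whereas the proposal would omit them for single-chunk files.

```python
import struct
import zlib

FILE_HDR = struct.Struct("<III")   # stored total, uncompressed total, chunk count
CHUNK_HDR = struct.Struct("<II")   # stored length, compression type

# Assumed numbering: 0 = stored as-is, 1 = zlib (a stand-in compressor).
COMPRESSORS = {
    0: (lambda b: b, lambda b: b),
    1: (zlib.compress, zlib.decompress),
}


def pack_chunks(chunks):
    """chunks is a list of (compression_type, raw_bytes); returns one entry."""
    body = b""
    stored_total = raw_total = 0
    for ctype, raw in chunks:
        stored = COMPRESSORS[ctype][0](raw)
        body += CHUNK_HDR.pack(len(stored), ctype) + stored
        stored_total += len(stored)
        raw_total += len(raw)
    return FILE_HDR.pack(stored_total, raw_total, len(chunks)) + body


def iter_chunks(blob):
    """Yield (compression_type, stored_bytes) without decompressing anything."""
    _, _, count = FILE_HDR.unpack_from(blob)
    pos = FILE_HDR.size
    for _ in range(count):
        length, ctype = CHUNK_HDR.unpack_from(blob, pos)
        pos += CHUNK_HDR.size
        yield ctype, blob[pos:pos + length]
        pos += length


def unpack_chunk(ctype, stored):
    """Decompress one chunk according to its own compression type."""
    return COMPRESSORS[ctype][1](stored)
```

Because each chunk carries its own compression type, a reader can pull out an uncompressed first chunk (say, the ELF headers) and leave the large compressed chunk untouched until it is actually needed.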
On 16/04/08 18:16 +0200, Patrick Georgi wrote:
Am Mittwoch, den 16.04.2008, 10:02 -0600 schrieb Myles Watson:
I like the current ability to compress payload segments based on the best compression for that segment. It seems like we lose that if we put ELF files (or a simple alternative) into the LAR without parsing them.
The only reason for that so far was a replacement for the BSS section, which exists in ELF.
How about teaching LAR to have chunks, like:
LAR header (file is 40000 bytes compressed, 160000 uncomp., 3 chunks)
chunk header 1 (200 bytes, compression type 1)
data (eg. ELF headers)
chunk header 2 (39000 bytes, compression type 2)
more data (most sections in ELF)
chunk header 3 (800 bytes, compression type 3)
some more data
chunk headers could be written only if there's more than one chunk, making this backward compatible.
It would be the duty of lar (the tool) to make sense of an ELF file to define a sensible chunk layout for it.
With our hypothetical manifest file for LAR file handling, that could even be saved (and created by third party tools, for other formats), but the important property of this is that extracting and readding the file doesn't kill important data (eg. load and entry points), while there's still a 1:1 relation between payload and files in lar, but more flexibility.
Is this that much different than SELF?
Jordan
Am Mittwoch, den 16.04.2008, 10:26 -0600 schrieb Jordan Crouse:
How about teaching LAR to have chunks, like:
LAR header (file is 40000 bytes compressed, 160000 uncomp., 3 chunks)
chunk header 1 (200 bytes, compression type 1)
data (eg. ELF headers)
chunk header 2 (39000 bytes, compression type 2)
more data (most sections in ELF)
chunk header 3 (800 bytes, compression type 3)
some more data
chunk headers could be written only if there's more than one chunk, making this backward compatible.
Is this that much different than SELF?
The chunk headers are part of the LAR structure, not SELF (or ELF, if you use some of the spare bits for compression information)
Regards, Patrick
On 16/04/08 10:02 -0600, Myles Watson wrote:
-----Original Message----- From: Jordan Crouse [mailto:jordan.crouse@amd.com] Sent: Wednesday, April 16, 2008 9:55 AM To: Myles Watson Cc: coreboot@coreboot.org Subject: Re: SELF/ELF/LAR
On 16/04/08 07:12 -0600, Myles Watson wrote:
That's where my confusion comes in. Bad ELF files which are parsed and added to the LAR cause failures at build time. Bad ELF files inserted without being parsed cause failures at boot time.
I prefer Build-time failures to Boot-time failures.
There is no doubt that some parsing has to happen. It is highly unlikely that any arbitrary ELF will be optimized for coreboot.
I'm not sure everyone agrees with that statement yet.
Okay, let me put it another way. It's not a guarantee. We might recommend that you use something like libpayload that has a clue, but it is by no means mandatory. So we have to account for the worst case scenario.
The question isn't that it will be parsed, but rather what will the end result of said parse be? The main problem I have with the current scheme is that when we move to three or four payloads, I think that managing the segments and walking the lar will become very costly.
I would rather see us go back to a single LAR file per payload. That's not to say that LAR file wouldn't be pre-processed.
It's usually 3 segments per ELF, right? How many payloads are we expecting each ROM file to have? How much time are we talking about?
Again, there is no guarantee. If we embed the name into the .notes section, we have at least 4 segments - possibly even more depending on the complexity of the payload (honestly, most of our payloads today are pretty simple). I would expect that in a typical multiple payload scenario, we would have at least three payloads - the chooser (bayou), and two payloads to choose from.
Would an extra field in LAR that points to the end of that payload's segments, and a name field suffice? I guess that becomes a small wrapper quickly. What if we kept what we have and added a wrapper (containing a name and a size) at every path separator?
I like the current ability to compress payload segments based on the best compression for that segment. It seems like we lose that if we put ELF files (or a simple alternative) into the LAR without parsing them.
I totally agree - and SELF reflects that behavior without the additional LAR segments. I'm not even sure if Segher is promoting just tossing it into the LAR without parsing, but I don't want to put words in his mouth.
Jordan
-----Original Message----- From: Jordan Crouse [mailto:jordan.crouse@amd.com] Sent: Wednesday, April 16, 2008 10:26 AM To: Myles Watson Cc: coreboot@coreboot.org Subject: Re: SELF/ELF/LAR
On 16/04/08 10:02 -0600, Myles Watson wrote:
-----Original Message----- From: Jordan Crouse [mailto:jordan.crouse@amd.com] Sent: Wednesday, April 16, 2008 9:55 AM To: Myles Watson Cc: coreboot@coreboot.org Subject: Re: SELF/ELF/LAR
On 16/04/08 07:12 -0600, Myles Watson wrote:
That's where my confusion comes in. Bad ELF files which are parsed and added to the LAR cause failures at build time. Bad ELF files inserted without being parsed cause failures at boot time.
I prefer Build-time failures to Boot-time failures.
There is no doubt that some parsing has to happen. It is highly unlikely that any arbitrary ELF will be optimized for coreboot.
I'm not sure everyone agrees with that statement yet.
Okay, let me put it another way. It's not a guarantee. We might recommend that you use something like libpayload that has a clue, but it is by no means mandatory. So we have to account for the worst case scenario.
The question isn't that it will be parsed, but rather what will the end result of said parse be? The main problem I have with the current scheme is that when we move to three or four payloads, I think that managing the segments and walking the lar will become very costly.
I would rather see us go back to a single LAR file per payload. That's not to say that LAR file wouldn't be pre-processed.
It's usually 3 segments per ELF, right? How many payloads are we expecting each ROM file to have? How much time are we talking about?
Again, there is no guarantee. If we embed the name into the .notes section, we have at least 4 segments - possibly even more depending on the complexity of the payload (honestly, most of our payloads today are pretty simple). I would expect that in a typical multiple payload scenario, we would have at least three payloads - the chooser (bayou), and two payloads to choose from.
Would an extra field in LAR that points to the end of that payload's segments, and a name field suffice? I guess that becomes a small wrapper quickly. What if we kept what we have and added a wrapper (containing a name and a size) at every path separator?
I like the current ability to compress payload segments based on the best compression for that segment. It seems like we lose that if we put ELF files (or a simple alternative) into the LAR without parsing them.
I totally agree - and SELF reflects that behavior without the additional LAR segments. I'm not even sure if Segher is promoting just tossing it into the LAR without parsing, but I don't want to put words in his mouth.
Here's my opinion based on the discussion so far:
I like the idea of SELF except as a file format. I think SELF should stay internal to LAR. With the information in SELF we can create a valid ELF file at extraction time. That way we only have one way to get a Payload into LAR (parsing an ELF) and one way out. Coreboot code only needs to know about the SELF format, and we have many fewer code paths to test and maintain.
My fear was that LAR was going to need to parse ELF and SELF, and so was Coreboot. I think at least one of those pathways would bit rot quickly.
Myles
On 16/04/08 11:36 -0600, Myles Watson wrote:
I totally agree - and SELF reflects that behavior without the additional LAR segments. I'm not even sure if Segher is promoting just tossing it into the LAR without parsing, but I don't want to put words in his mouth.
Here's my opinion based on the discussion so far:
I like the idea of SELF except as a file format. I think SELF should stay internal to LAR. With the information in SELF we can create a valid ELF file at extraction time. That way we only have one way to get a Payload into LAR (parsing an ELF) and one way out. Coreboot code only needs to know about the SELF format, and we have many fewer code paths to test and maintain.
Yes, SELF was always intended to be internal to LAR. To be completely honest with you, I don't really see much value in extracting a LAR. LAR may have started as an archiver format, but it has long since lost nearly all those characteristics. I realize why extraction may exist - if I am handed an ambiguous blob from an unknown source, I may want to be able to pull out the individual parts, but my only concern is being able to do things correctly during boot.
So we'll make that rule for SELF - it will always be an internal format. It will never exist outside of the LAR. If we extract the file, then we'll construct an ELF from the internal format.
My fear was that LAR was going to need to parse ELF and SELF, and so was Coreboot. I think at least one of those pathways would bit rot quickly.
Yes - and that would also require us to have some SELF tools available in the toolchain - not an interesting prospect for us.
Jordan
-----Original Message----- From: Jordan Crouse [mailto:jordan.crouse@amd.com] Sent: Wednesday, April 16, 2008 11:54 AM To: Myles Watson Cc: coreboot@coreboot.org Subject: Re: SELF/ELF/LAR
On 16/04/08 11:36 -0600, Myles Watson wrote:
I totally agree - and SELF reflects that behavior without the additional LAR segments. I'm not even sure if Segher is promoting just tossing it into the LAR without parsing, but I don't want to put words in his mouth.
Here's my opinion based on the discussion so far:
I like the idea of SELF except as a file format. I think SELF should stay internal to LAR. With the information in SELF we can create a valid ELF file at extraction time. That way we only have one way to get a Payload into LAR (parsing an ELF) and one way out. Coreboot code only needs to know about the SELF format, and we have many fewer code paths to test and maintain.
Yes, SELF was always intended to be internal to LAR. To be completely honest with you, I don't really see much value in extracting a LAR. LAR may have started as an archiver format, but it has long since lost nearly all those characteristics. I realize why extraction may exist - if I am handed an ambiguous blob from an unknown source, I may want to be able to pull out the individual parts, but my only concern is being able to do things correctly during boot.
I think another interesting case is when something broke and you want to isolate it. It would be interesting to pull just the payload from a working ROM to try in a broken one or vice versa.
So we'll make that rule for SELF - it will always be an internal format. It will never exist outside of the LAR. If we extract the file, then we'll construct an ELF from the internal format.
My fear was that LAR was going to need to parse ELF and SELF, and so was Coreboot. I think at least one of those pathways would bit rot quickly.
Yes - and that would also require us to have some SELF tools available in the toolchain - not an interesting prospect for us.
All right! At least two people agree on something!
Thanks for your patience in explaining and iterating with me.
Myles
I like the current ability to compress payload segments based on the best compression for that segment. It seems like we lose that if we put ELF files (or a simple alternative) into the LAR without parsing them.
I totally agree - and SELF reflects that behavior without the additional LAR segments. I'm not even sure if Segher is promoting just tossing it into the LAR without parsing, but I don't want to put words in his mouth.
I'm promoting putting a payload file into the LAR as one file, yes.
If you want the compression stuff to know about ELF (or whatever) segments, make the compression stuff know about it -- not the LAR.
Segher
-----Original Message----- From: Segher Boessenkool [mailto:segher@kernel.crashing.org] Sent: Wednesday, April 16, 2008 12:56 PM To: Jordan Crouse Cc: Myles Watson; coreboot@coreboot.org Subject: Re: [coreboot] SELF/ELF/LAR
I like the current ability to compress payload segments based on the best compression for that segment. It seems like we lose that if we put ELF files (or a simple alternative) into the LAR without parsing them.
I totally agree - and SELF reflects that behavior without the additional LAR segments. I'm not even sure if Segher is promoting just tossing it into the LAR without parsing, but I don't want to put words in his mouth.
I'm promoting putting a payload file into the LAR as one file, yes.
If you want the compression stuff to know about ELF (or whatever) segments, make the compression stuff know about it -- not the LAR.
It seems natural to have the tool that populates the ROM image understand the same things that Coreboot understands. That means the format and compression. Maybe we should call it something else if it's not just an archiver.
What's the disadvantage to having the Coreboot ROM Image Program (CRIMP) understand the format and compression?
Myles
There is no doubt that some parsing has to happen. It is highly unlikely that any arbitrary ELF will be optimized for coreboot.
I'm not sure everyone agrees with that statement yet.
Not every ELF file *can* work at all, even -- an ELF executable states the address it wants to run at, let's hope there is some free memory there!
It's usually 3 segments per ELF, right?
The usual case is one or two segments: either one RWX, or one RX and one RW.
Every payload executable in LAR should be *one* file in LAR. Mixing details of the binary format into the archiver is ludicrous. Whether to use ELF or not is a separate issue.
Segher
On Tue, Apr 15, 2008 at 1:49 PM, Peter Stuge peter@stuge.se wrote:
I guess because we've produced so inconsistent quality ELF files.
In my case it boiled down to several things.
1. Watching the time lost as we decompressed ELF to memory, relocated things, then did a final memcpy of the segments to their real destination. That was a big one for me.
2. The times that people experienced a dead stop from coreboot with the useful message "bad ELF file" or whatever it was. That's a complete show-stopper IMHO. I realize we can fix this by parsing ELF at build time. But if we're going to parse ELF at build time, why not do a little extra work to put it into a loader-friendly format?
3. My weird problems with BSS (and segher is right, it *should* have worked. It just didn't. BSS is contained in the segments. I think some part of the "ELF relocation" stuff was losing information about the segments -- I was not able to see how).
4. The fact that ELF file formats can really inhibit streaming uncompress -- since in some cases you have program headers at the *end* of the file, so you have to decompress the whole file to find out about the file (and, yes, you can fix this with ld, and it probably will work right on many ld's. But what of the ones it fails on? I'm tired of debugging the GNU toolchain). And streaming decompress is really important. We want to be cache friendly, and that means we should try for one pass over the data. Like it or not, the original designers of ELF did not seem to anticipate compressed segments. (yes, we can make our own incompatible flags. How will those new flags break bfd tools?)
well there are others but let it suffice to say, from my point of view of almost 9 years on this project, that ELF has been a mixed blessing from the start.
ron
On Tue, Apr 15, 2008 at 12:14 PM, Myles Watson mylesgw@gmail.com wrote:
Code isn't interesting anyway; design / architecture is the problem here.
Agreed.
We are all professionals - we can come to a consensus. Right now, we're trying to figure out if we can make ELF perform the way we need it to perform. If we can be satisfied that it can, then we move on. If we cannot be satisfied, then the core team will have to make a decision to move on to another alternative, and then we can debate the details of SELF or something similar.
I agree.
So, what exactly *do* we need here, and *why*?
I think a discussion of where we are and where we want to go would be helpful. Feel free to correct my status summary.
Right now:
- LAR format supports inclusion of:
  a. bootblock
     1. Can be copied out, but only re-used in same size ROM file
  b. ELF segments (multiple segments with one entry point, from an ELF)
     1. Can't be extracted without losing entry point
This is something I don't understand, and probably would need a deeper understanding of elf to understand, but maybe someone can explain it in a nutshell. LAR parses the elf, extracts and manipulates the segments, and in the process moves the entry point, right? But LAR has to have some sort of entry point. So why can't we re-extract those segments, in LAR's new order, and generate an elf header with LAR's entry point? So the elf file isn't a binary equal to the original, but it runs just like it?
The other option, in my simple mind, would be to have LAR work like a one-way archiver. If you want the LARchive's payload, LAR doesn't actually extract the payload, it hands you a LARchive of just the payload, which can then be added to another LARchive. Could that work?
Thanks, Corey
On Wed, Apr 16, 2008 at 4:41 AM, Corey Osgood corey.osgood@gmail.com wrote:
On Tue, Apr 15, 2008 at 12:14 PM, Myles Watson mylesgw@gmail.com wrote:
Code isn't interesting anyway; design / architecture is the problem here.
Agreed.
We are all professionals - we can come to a consensus. Right now, we're trying to figure out if we can make ELF perform the way we need it to perform. If we can be satisfied that it can, then we move on. If we cannot be satisfied, then the core team will have to make a decision to move on to another alternative, and then we can debate the details of SELF or something similar.
I agree.
So, what exactly *do* we need here, and *why*?
I think a discussion of where we are and where we want to go would be helpful. Feel free to correct my status summary.
Right now:
- LAR format supports inclusion of:
  a. bootblock
     1. Can be copied out, but only re-used in same size ROM file
  b. ELF segments (multiple segments with one entry point, from an ELF)
     1. Can't be extracted without losing entry point
This is something I don't understand, and probably would need a deeper understanding of elf to understand, but maybe someone can explain it in a nutshell. LAR parses the elf, extracts and manipulates the segments, and in the process moves the entry point, right? But LAR has to have some sort of entry point. So why can't we re-extract those segments, in LAR's new order, and generate an elf header with LAR's entry point? So the elf file isn't a binary equal to the original, but it runs just like it?
The other option, in my simple mind, would be to have LAR work like a one-way archiver. If you want the LARchive's payload, LAR doesn't actually extract the payload, it hands you a LARchive of just the payload, which can then be added to another LARchive. Could that work?
This is only part of the problem. It would be easy to do, but there's no consensus about what format to use for the extracted file.
Myles
On Sun, 2008-04-13 at 04:54 +0200, Peter Stuge wrote:
On Sun, Apr 13, 2008 at 04:39:24AM +0200, Segher Boessenkool wrote:
My current thinking is that since you will have an intermediate ELF file anyway, that you will transform into a SELF file (which has some nice properties), that it would be easier and way more future-proof to transform that intermediate ELF file into a final ELF file (FELF?) with those same nice properties (junk removed, PHDRs and notes at the start -- did I forget any?)
I had the same thought.
There is one rather big disadvantage however, that comes from *almost* supporting a given format. There is some value in having a new, explicit, file format even if it is just a subset of another, existing, file format.
The benefit is in usability and principle of least surprise. If coreboot supports loading certain ELF files, it will be surprising to (some) users if it does not support loading all ELF files.
I'm not sure if this is the right thread but I need to get this out of my head before I forget (I've got about 30 seconds)
I was thinking about legacy removal, and how some payloads need things tweaked for 1980's PC, (filo legacy IDE ports?) and I was thinking that the payload could have a flag saying "hey i need old IDE ports" or something. Coreboot can either flip that feature on, or emit a message.
Other flags are possible, like I need amd64 or FPU or something else.
Doesn't ELF have some fields already that could be used for that?
For the same reason we may also want another name less similar to ELF, to clearly show that it is not an ELF file.
That marking could also say "Linuxbios i386 executable", addressing that issue.
I'm not going to get into the ELF or not debate.
FWIW, neither am I. I have no stake in this ELF vs. SELF "battle"; the only thing I'm trying to do here is to make sure the final decision is based on correct information.
- We cannot depend on the payload writer to do the right thing. All we can ask for is an ELF file. If there is post processing work that needs to be done, we need to do it ourselves.
And the "post-processor" can either write SELF, or an ELF with restrictions (PHDRs have to come early, for example). It's about the same thing: you have to use the linker in the same way either way, to arrive at the original ELF file. And then you post-process it in such a way that it will be easier/nicer to load in the coreboot environment.
The main difference between the formats AFAICS is that ELF is an old, well-defined, well-understood, matured format, while SELF is a hacky new thing that looks really simple now. My fear is that it will turn out it needs some of the features of ELF later on, and it might not be feasible to put those in in a nice way anymore by then.
- The ELF loader needs to be simple, fast, small, and licensed with BSD so it can be included in libpayload
Sure, it's just a tiny piece of code either way, so I don't see this issue as an issue.
- We need an equivalent solution for the NAME segment in SELF - the payload chooser must be able to get a human friendly name for each payload without decompressing the entire payload into memory.
If you want this name to be part of the binary (and we can discuss endlessly whether that is a good idea or not -- let's just accept the answer is "yes" :-) ), for ELF, you would do this using a PT_NOTE normally. It's easy to make sure that note sits at the start of the file, too.
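As a rough illustration of the PT_NOTE approach, the sketch below walks the ELF32 program headers and returns the descriptor of the first note it finds. Only the ELF32 header, program header and note layouts are standard; treating the note descriptor as the payload's display name, the owner string "coreboot", and the helper names `payload_name` and `build_elf_with_note` are assumptions made for the example.

```python
import struct

EHDR32 = struct.Struct("<16sHHIIIIIHHHHHH")  # ELF32 file header (52 bytes)
PHDR32 = struct.Struct("<8I")                # ELF32 program header (32 bytes)
NOTE_HDR = struct.Struct("<III")             # namesz, descsz, type
PT_NOTE = 4


def _pad4(n):
    """Round n up to the next multiple of 4, as note fields are aligned."""
    return (n + 3) & ~3


def payload_name(image):
    """Return the descriptor of the first note as text, or None if absent."""
    fields = EHDR32.unpack_from(image)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    for i in range(e_phnum):
        p_type, p_offset, _, _, p_filesz, _, _, _ = PHDR32.unpack_from(
            image, e_phoff + i * e_phentsize)
        if p_type != PT_NOTE or p_filesz < NOTE_HDR.size:
            continue
        namesz, descsz, _ = NOTE_HDR.unpack_from(image, p_offset)
        desc_at = p_offset + NOTE_HDR.size + _pad4(namesz)
        return image[desc_at:desc_at + descsz].rstrip(b"\0").decode("ascii")
    return None


def build_elf_with_note(owner, name):
    """Hypothetical test helper: an ELF32 whose only segment is a PT_NOTE."""
    owner_b = owner.encode() + b"\0"
    desc_b = name.encode() + b"\0"
    note = (NOTE_HDR.pack(len(owner_b), len(desc_b), 1)
            + owner_b.ljust(_pad4(len(owner_b)), b"\0")
            + desc_b.ljust(_pad4(len(desc_b)), b"\0"))
    offset = EHDR32.size + PHDR32.size
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    ehdr = EHDR32.pack(ident, 2, 3, 1, 0, EHDR32.size, 0, 0,
                       EHDR32.size, PHDR32.size, 1, 0, 0, 0)
    phdr = PHDR32.pack(PT_NOTE, offset, 0, 0, len(note), len(note), 4, 4)
    return ehdr + phdr + note
```

When the note sits directly after the program headers, as in the helper, the lookup touches only the first hundred or so bytes of the file, no matter how much payload follows.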
As far as I can see, SELF saves about 100 bytes over ELF, at the cost of a lot of flexibility (only one code/data/bss/notes section, to start with), and a bunch of important header data is missing, too (endianness, word size, architecture).
You can have as many code/data/bss/notes sections as you wish.
Oh, good.
The only requirement is that there only be one ENTRY segment, which is fine, because that's exactly how many entry points we have.
Would it not make sense to put the entry point info in the file header, not in a segment header, then?
We already know what endianism, word size and architecture we're going to run on - SELF is not intended to be portable.
It is nice to have a file header that describes the basic format of the file. You don't *have* to check it if you do not want to.
One important case where you actually want to *use* this info is when you have an architecture that can execute both 32-bit and 64-bit programs.
Segher
On 13/04/08 04:15 +0200, Segher Boessenkool wrote:
- We need an equivalent solution for the NAME segment in SELF - the payload chooser must be able to get a human friendly name for each payload without decompressing the entire payload into memory.
If you want this name to be part of the binary (and we can discuss endlessly whether that is a good idea or not -- let's just accept the answer is "yes" :-) ), for ELF, you would do this using a PT_NOTE normally. It's easy to make sure that note sits at the start of the file, too.
But if it was part of the binary, then we would have to decompress at least part of the binary to find it, no?
The only requirement is that there only be one ENTRY segment, which is fine, because that's exactly how many entry points we have.
Would it not make sense to put the entry point info in the file header, not in a segment header, then?
Well - first of all, I think you are confused. There is no SELF file - you won't see a .self file by itself. It is an internal format, suitable only for inclusion in a LAR. This is part of the simplicity. So no file means no file header. But even if there was, I would argue that having the load address in the segment list isn't really hurting us.
It pulls double duty - it identifies the end of the segment list, and it gives us the load address, right when we need it, after we have finished loading the data and code.
We already know what endianism, word size and architecture we're going to run on - SELF is not intended to be portable.
It is nice to have a file header that describes the basic format of the file. You don't *have* to check it if you do not want to.
One important case where you actually want to *use* this info is when you have an architecture that can execute both 32-bit and 64-bit programs.
Again - not a file format, so no file header. And I would argue that if we ever reach the point where we legitimately think that we want to start a payload in 64 bit mode, that the architecture flag should probably be in the LAR header, and not in the SELF.
- We need an equivalent solution for the NAME segment in SELF - the payload chooser must be able to get a human friendly name for each payload without decompressing the entire payload into memory.
If you want this name to be part of the binary (and we can discuss endlessly whether that is a good idea or not -- let's just accept the answer is "yes" :-) ), for ELF, you would do this using a PT_NOTE normally. It's easy to make sure that note sits at the start of the file, too.
But if it was part of the binary, then we would have to decompress at least part of the binary to find it, no?
That depends on if and how you compress it. If you don't compress the headers (like in the proposed SELF format, iirc), you obviously do not have to decompress those headers. The same can be done for any binary format.
Decompression is *cheap*, btw -- the slowest thing *by far* is accessing the ROM to get the compressed bytes. Obviously, it is stupid to decompress the whole file if you only want to see the header.
The only requirement is that there only be one ENTRY segment, which is fine, because that's exactly how many entry points we have.
Would it not make sense to put the entry point info in the file header, not in a segment header, then?
Well - first of all, I think you are confused. There is no SELF file - you won't see a .self file by itself. It is an internal format, suitable only for inclusion in a LAR.
Same difference.
This is part of the simplicity. So no file means no file header.
But you know very well what I mean, I hope.
But even if there was, I would argue that having the load address in the segment list isn't really hurting us.
Sure. I'm sneakily trying to show that SELF is really just the same as ELF in every way, only different.
It pulls double duty - it identifies the end of the segment list, and it gives us the load address, right when we need it, after we have finished loading the data and code.
While in general it is a Really Bad Idea(tm) to do these cute "double duty" things, I can't see how this would come and bite us later. Then again, I've never been great at predicting the future. "Just Don't Do It".
It is nice to have a file header that describes the basic format of the file. You don't *have* to check it if you do not want to.
One important case where you actually want to *use* this info is when you have an architecture that can execute both 32-bit and 64-bit programs.
Again - not a file format, so no file header.
Same difference.
And I would argue that if we ever reach the point where we legitimately think that we want to start a payload in 64-bit mode,
For AMD64, ideally coreboot would run in 64-bit mode a few hundred insns after cold boot, IMHO. Hey, I can dream :-)
On some other architectures you don't have any choice: on PowerPC for example, you *have* to run any supervisor/hypervisor mode code in 64-bit mode (on 64-bit capable processors). You might still want 32-bit payloads though (for (data-)cache footprint savings, for example, or to have a common payload binary between all systems).
Aaaaaanyway... It's useful information, the current format provides it, it would be a shame to lose it (and it would be even worse to have to add it back in, later).
that the architecture flag should probably be in the LAR header, and not in the SELF.
Now I am confused. Are you arguing that SELF is an integral part of LAR, or not?
Segher
Sure. I'm sneakily trying to show that SELF is really just the same as ELF in every way, only different.
Of course it is. All formats follow the general concept of "copy this here, copy that there, zero this and jump there". There is nothing new under the sun. We would be silly to pretend otherwise.
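To make the "copy this here, zero this and jump there" idea concrete, here is a small sketch of a loader walking a segment list in which the ENTRY segment both terminates the list and supplies the address to jump to. The segment type names and the tuple layout are invented for the example; they are not the actual SELF encoding.

```python
# Hypothetical segment types; the real SELF encoding is not reproduced here.
CODE, DATA, BSS, ENTRY = "code", "data", "bss", "entry"

def load_segments(memory: bytearray, segments):
    """Apply each segment to memory and return the entry address.

    Each segment is (type, address, payload). For CODE and DATA the payload
    is the bytes to copy; for BSS it is the number of bytes to zero; for
    ENTRY it is ignored.
    """
    for seg_type, addr, payload in segments:
        if seg_type in (CODE, DATA):
            memory[addr:addr + len(payload)] = payload   # copy this here
        elif seg_type == BSS:
            memory[addr:addr + payload] = bytes(payload)  # zero this
        elif seg_type == ENTRY:
            return addr                                   # jump there
        else:
            raise ValueError(f"unknown segment type: {seg_type!r}")
    raise ValueError("segment list has no ENTRY segment")

memory = bytearray(b"\xff" * 32)
entry = load_segments(memory, [
    (CODE, 0, b"\x90\x90"),
    (DATA, 8, b"hi"),
    (BSS, 16, 4),
    (ENTRY, 0, None),
])
```

The early return on ENTRY is the "double duty" under discussion: anything listed after it would never be loaded.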
The question is not that ELF cannot be made to work - coreboot has been using it for much longer than I've been darkening this e-mail list. No, the task is finding the best solution to the problem at hand.
Yes, SELF is very much similar to ELF - and it was specifically designed to solve the main problems that were identified with ELF in this situation. It may or may not have done so, only time will tell. And certainly an ELF can be made to work like we need it to - but is it worth the effort?
SELF was not designed to kill ELF, or COFF or anything else. It was not designed to be the next great standard in the world of computing. It was designed to solve the specific problems that coreboot and other interested payload applications have.
Does it have flaws? Yes. Will it have bugs? You bet. The former we can help solve right here in our discussions, and the latter will happen regardless of what we choose to do.
So, much as I like the debate, we need to wind ourselves down. I don't think anybody disagrees that the status-quo isn't the right solution. There are those that think ELF is the best alternative, and there are those who think that it is not. That is the first question that must be resolved - if then the community thinks that ELF is not the best solution, then we can continue with that line of thought and consider SELF or an alternative.
How best can we make a decision then? A vote, perhaps? I had wondered if we could do a discussion and "vote" on the wiki in grand Wikipedian style, but not everybody has an account, and I don't want to exclude anybody. I don't know if Stefan can turn on IP editing for individual pages or not. If that doesn't work, I guess we'll try to do it via email.
Either way, I think we should reset the decks and allow representatives from both sides to make their arguments. I nominate Segher for the "ELF is best" and Ron for the "ELF is not best" arguments. Remember, this is not over the merits of SELF, but rather over whether we should abandon ELF at all.
Thoughts? Jordan
Jordan Crouse wrote:
How best can we make a decision then? A vote, perhaps? I had wondered if we could do a discussion and "vote" on the wiki in grand Wikipedian style, but not everybody has an account, and I don't want to exclude anybody. I don't know if Stefan can turn on IP editing for individual pages or not. If that doesn't work, I guess we'll try to do it via email.
Either way, I think we should reset the decks and allow representatives from both sides to make their arguments. I nominate Segher for the "ELF is best" and Ron for the "ELF is not best" arguments. Remember, this is not over the merits of SELF, but rather over whether we should abandon ELF at all.
Thoughts?
Hi,
Sorry to "butt in" on the discussion here...
Maybe I'm misunderstanding something here but why not put up a comparison chart (strengths/weaknesses) of ELF/SELF on the wiki (or somewhere)? I don't know if it can be easily done, since I'm not fully grasping the details, but if it is then it should give you an overview of their individual technical merits to vote on instead of voting based on "belief" (that's how I perceive the situation).
I'll crawl back under my rock now...
Best regards
Peter K
Sure. I'm sneakily trying to show that SELF is really just the same as ELF in every way, only different.
Of course it is. All formats follow the general concept of "copy this here, copy that there, zero this and jump there".
Yeah, pretty much so.
The question is not that ELF cannot be made to work - coreboot has been using it for much longer than I've been darkening this e-mail list. No, the task is finding the best solution to the problem at hand.
Yes. And there are many dimensions to "what is best".
Yes, SELF is very much similar to ELF - and it was specifically designed to solve the main problems that were identified with ELF in this situation. It may or may not have done so, only time will tell. And certainly an ELF can be made to work like we need it to - but is it worth the effort?
IMHO, you're turning things upside down here. ELF is _known_ to work here, it is used by pretty much every other firmware loader out there!
SELF was not designed to kill ELF, or COFF or anything else. It was not designed to be the next great standard in the world of computing. It was designed to solve the specific problems that coreboot and other interested payload applications have.
Yes, I know this. But good to see it stated explicitly.
Does it have flaws? Yes. Will it have bugs? You bet. The former we can help solve right here in our discussions, and the latter will happen regardless of what we choose to do.
Yeppers.
So, much as I like the debate, we need to wind ourselves down. I don't think anybody disagrees that the status-quo isn't the right solution. There are those that think ELF is the best alternative, and there are those who think that it is not. That is the first question that must be resolved - if then the community thinks that ELF is not the best solution, then we can continue with that line of thought and consider SELF or an alternative.
How best can we make a decision then? A vote, perhaps?
No. Let's keep this technical, please; and good engineering demands looking at the problem from all relevant angles, and then we reach consensus. If a deadlock happens, Ron gets to decide, that's his job as the daddy of this project :-)
Voting is for people who like arguing better than they like having an actual solution for their problems. I hope we're above that <g>.
Either way, I think we should reset the decks and allow representatives from both sides to make their arguments. I nominate Segher for the "ELF is best" and Ron for the "ELF is not best" arguments. Remember, this is not over the merits of SELF, but rather over whether we should abandon ELF at all.
Right. So my base is *not* "ELF is best", but it is "ELF is just fine for this, so why would we need another format, with all infrastructure overhead that requires?". So please start by explaining what is bad, or even just sub-optimal, in using ELF here.
Segher
On Mon, Apr 14, 2008 at 12:09:07AM +0200, Segher Boessenkool wrote:
So my base is *not* "ELF is best", but it is "ELF is just fine for this, so why would we need another format, with all infrastructure overhead that requires?". So please start by explaining what is bad, or even just sub-optimal, in using ELF here.
This is my take;
The problem in the past has been that ELFs have gone into flash (.rom) with structures that failed at run time in part because of the coreboot ELF loader and in part because of the somewhat constrained environment.
All files going into a .rom will go through lar, so I want lar to do any and all processing that is necessary. It does not have to be internal code, it can be a library, or even an external helper, but my major concern is not needing to run any extra commands.
I expect to be able to put vmlinux straight into flash (.rom file) using lar. I specifically do not want a separate ELF chewer. That was mkelfImage. I don't hate mkelfImage in any way but I want to avoid that extra step in lar.
I also expect to be able to extract said vmlinux from one flash (.rom file), store it on disk (but not use it) and later write it into another flash (.rom file).
ELF pros: Much code that can be reused
Cons: Complex, code we've used so far has not been robust enough (Maybe we were just shooting our SELF in the foot by not being explicit enough about how lar member ELFs should be formatted?)
Regardless of what internal format we decide on for lar members, I do wish lar to automatically reformat vmlinux and other ELFs as they are inserted into lars. I envision lar plugins for all sorts of binaries.
//Peter
-----Original Message----- From: coreboot-bounces+mylesgw=gmail.com@coreboot.org [mailto:coreboot- bounces+mylesgw=gmail.com@coreboot.org] On Behalf Of Peter Stuge Sent: Sunday, April 13, 2008 4:46 PM To: coreboot@coreboot.org Subject: Re: [coreboot] ELF
On Mon, Apr 14, 2008 at 12:09:07AM +0200, Segher Boessenkool wrote:
So my base is *not* "ELF is best", but it is "ELF is just fine for this, so why would we need another format, with all infrastructure overhead that requires?". So please start by explaining what is bad, or even just sub-optimal, in using ELF here.
This is my take;
The problem in the past has been that ELFs have gone into flash (.rom) with structures that failed at run time in part because of the coreboot ELF loader and in part because of the somewhat constrained environment.
All files going into a .rom will go through lar, so I want lar to do any and all processing that is necessary. It does not have to be internal code, it can be a library, or even an external helper, but my major concern is not needing to run any extra commands.
I expect to be able to put vmlinux straight into flash (.rom file) using lar. I specifically do not want a separate ELF chewer. That was mkelfImage. I don't hate mkelfImage in any way but I want to avoid that extra step in lar.
I also expect to be able to extract said vmlinux from one flash (.rom file), store it on disk (but not use it) and later write it into another flash (.rom file).
Do you need the intermediate storage step? Is there an advantage over copying from one .rom file to another?
Myles
On Mon, Apr 14, 2008 at 09:21:35AM -0600, Myles Watson wrote:
I also expect to be able to extract said vmlinux from one flash (.rom file), store it on disk (but not use it) and later write it into another flash (.rom file).
Do you need the intermediate storage step?
It would be nice.
Is there an advantage over copying from one .rom file to another?
It can be transferred between systems.
Yes, lar needs to be present whenever it is injected, but it would be nice to not have to copy the entire rom if I just want a payload.
//Peter
On Mon, 14 Apr 2008, Myles Watson wrote:
On Mon, Apr 14, 2008 at 12:09:07AM +0200, Segher Boessenkool wrote:
So my base is *not* "ELF is best", but it is "ELF is just fine for this, so why would we need another format, with all infrastructure overhead that requires?". So please start by explaining what is bad, or even just sub-optimal, in using ELF here.
This is my take;
The problem in the past has been that ELFs have gone into flash (.rom) with structures that failed at run time in part because of the coreboot ELF loader and in part because of the somewhat constrained environment.
All files going into a .rom will go through lar, so I want lar to do any and all processing that is necessary. It does not have to be internal code, it can be a library, or even an external helper, but my major concern is not needing to run any extra commands.
I expect to be able to put vmlinux straight into flash (.rom file) using lar. I specifically do not want a separate ELF chewer. That was mkelfImage. I don't hate mkelfImage in any way but I want to avoid that extra step in lar.
I also expect to be able to extract said vmlinux from one flash (.rom file), store it on disk (but not use it) and later write it into another flash (.rom file).
Do you need the intermediate storage step? Is there an advantage over copying from one .rom file to another?
Myles
If there's going to be a chooser, I can easily imagine end users may want to add or replace modules on a machine. They may know nothing of C, makefiles, or development at all, just click here to install.
Steven James, CTO, Linux Labs International, Inc.
On 14/04/08 00:09 +0200, Segher Boessenkool wrote:
Right. So my base is *not* "ELF is best", but it is "ELF is just fine for this, so why would we need another format, with all infrastructure overhead that requires?". So please start by explaining what is bad, or even just sub-optimal, in using ELF here.
Most of the arguments for why we don't use ELF today are not mine, and I wouldn't want to mis-state them.
I worry about how the payload chooser will enumerate and load the other payloads. As far as the name / notes issue is concerned, I'm not clear on how we can keep the headers uncompressed and be able to access individual segments in the ELF without eventually having to uncompress the whole thing.
I am also concerned about the extra step when we decompress to memory and then copy into place. How can we get around these issues and still call the result a true ELF file?
Jordan
Am Sonntag, den 13.04.2008, 17:06 -0600 schrieb Jordan Crouse:
I am also concerned about the extra step when we decompress to memory and then copy into place. How can we get around these issues and still call the result a true ELF file?
By enforcing the order of headers to be sensible by whatever tool puts the image into the rom. This could be lar, or probably even a wrapper around lar and other tools such as an "ELF sanitizer", so we don't have to stuff all that into lar.
That sanitizer would make sure that:
1. only necessary sections exist (those that are loaded + notes)
2. they're in the right order for streaming to succeed
3. the notes section is as close to the beginning as possible
The streaming ELF parser would need the capability to stop after finding a given section (e.g. .notes/name).
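The three checks could be sketched roughly as follows. This is only an illustration: it works on already-parsed program headers (segments) rather than sections, it takes them as plain (p_type, p_offset) pairs, and the function name is made up. PT_LOAD and PT_NOTE carry their standard ELF values.

```python
PT_LOAD, PT_NOTE = 1, 4  # standard ELF program header types

def check_layout(phdrs):
    """Return a list of problems for (p_type, p_offset) pairs in header order."""
    problems = []

    # 1. Only loadable segments and notes should be present.
    for p_type, _ in phdrs:
        if p_type not in (PT_LOAD, PT_NOTE):
            problems.append(f"unexpected segment type {p_type}")

    # 2. File offsets must ascend, so a streaming reader never seeks backwards.
    offsets = [offset for _, offset in phdrs]
    if offsets != sorted(offsets):
        problems.append("segments are not in ascending file order")

    # 3. The note should come before everything else in the file.
    notes = [offset for p_type, offset in phdrs if p_type == PT_NOTE]
    if notes and min(notes) != min(offsets):
        problems.append("note segment is not first in the file")

    return problems

good = check_layout([(PT_NOTE, 0x100), (PT_LOAD, 0x200), (PT_LOAD, 0x1000)])
bad = check_layout([(PT_LOAD, 0x100), (2, 0x200), (PT_NOTE, 0x300)])
```

A tool wrapping lar could refuse to add a file whenever the returned list is non-empty.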
This is very likely more expensive than having a non-compressed section in the file somewhere, but probably not very much. (this could be benchmarked)
If that wrapper-around-lar (or lar with all that built-in) is the default interface to build firmware images, it's hard to push an invalid (as in "too complex ELF") file into rom.
Alternatively, move the name into the lar filesystem:
  payload/0/chooser
  payload/1/tint - the stand-alone tetris clone
  payload/2/GRUB Invaders
  ...
the number is the order in which they are displayed or loaded, depending on the behaviour of payload/0.
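A chooser could recover the order and the display name from such paths with very little code. The sketch below assumes the file names are already available as plain strings (how they are read out of the LAR is left out), and the function names are invented for the example.

```python
def parse_entry(path: str):
    """Split 'payload/<order>/<name>' into (order, name); None for other files."""
    parts = path.split("/", 2)
    if len(parts) != 3 or parts[0] != "payload" or not parts[1].isdecimal():
        return None
    return int(parts[1]), parts[2]

def menu(paths):
    """Return the display names of all payload entries, ordered by their number."""
    entries = sorted(e for e in map(parse_entry, paths) if e is not None)
    return [name for _, name in entries]

names = menu([
    "payload/2/GRUB Invaders",
    "normal/stage2",               # not a payload, so it is skipped
    "payload/0/chooser",
    "payload/1/tint - the stand-alone tetris clone",
])
```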
Regards, Patrick Georgi
On Mon, Apr 14, 2008 at 07:51:15AM +0200, Patrick Georgi wrote:
Alternatively, move the name into the lar filesystem:
  payload/0/chooser
  payload/1/tint - the stand-alone tetris clone
  payload/2/GRUB Invaders
the number is the order in which they are displayed or loaded, depending on the behaviour of payload/0.
Neat! chooser must know lar anyway.
I'd like to avoid the sorting subdir however. Sorting can be accomplished just as easily by implementations in chooser:
* larball file order (use lar to reorder files)
* filename (with option to strip leading [0-9_] characters)
* file size
* execution frequency (needs counter in flash)
...
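Three of these orderings are simple enough to sketch. The example assumes the chooser already has the LAR contents as (name, size) pairs in larball order; that shape and the function names are assumptions made for illustration. Execution frequency is left out because it needs a persistent counter.

```python
import re

def by_lar_order(files):
    """Keep the order in which the files appear in the larball."""
    return list(files)

def by_name(files, strip_prefix=True):
    """Sort by filename, optionally ignoring leading [0-9_] characters."""
    def key(entry):
        name = entry[0]
        return re.sub(r"^[0-9_]+", "", name) if strip_prefix else name
    return sorted(files, key=key)

def by_size(files):
    """Sort by file size, smallest first."""
    return sorted(files, key=lambda entry: entry[1])

files = [("20_tint", 30000), ("10_coreinfo", 45000), ("05_grub", 90000)]
```

With the prefix stripped, `by_name(files)` orders the entries as coreinfo, grub, tint; with `strip_prefix=False` the numeric prefixes decide instead.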
//Peter
On 14/04/08 23:42 +0200, Peter Stuge wrote:
On Mon, Apr 14, 2008 at 07:51:15AM +0200, Patrick Georgi wrote:
Alternatively, move the name into the lar filesystem:
  payload/0/chooser
  payload/1/tint - the stand-alone tetris clone
  payload/2/GRUB Invaders
the number is the order in which they are displayed or loaded, depending on the behaviour of payload/0.
Neat! chooser must know lar anyway.
The only problem is that makes extracting a LAR pretty complex - you'll have some pretty funky names hanging around. It will also make a typical LAR command line look very ugly.
There is a good reason to have small "filenames" in the LAR and a descriptive name somewhere else.
I'd like to avoid the sorting subdir however. Sorting can be accomplished just as easily by implementations in chooser:
Oh, yeah - the chooser will have to figure out sorting on its own. For the chooser menu, that's easy, we have numerous things (alphabetical, lar order, etc, etc). For chaining it is somewhat more complex. I haven't thought of a good way to do that yet, but it's probably going to involve a configuration "file" of some sort in the LAR.
Jordan
I worry about how the payload chooser will enumerate and load the other payloads.
Enumerate? It can just ask LAR what files are there. Put all payloads in some subdir or whatever, low-tech is good ;-P
As far as the name / notes issue is concerned, I'm not clear on how we can keep the headers uncompressed
You can keep the headers uncompressed by, well, keeping the headers uncompressed! This has nothing to do with the binary format used, but everything to do with how you use your compressors.
Why do you need the headers uncompressed, anyway? Any sane compression library allows you to ask "give me the first nnn bytes".
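As an illustration of "give me the first nnn bytes", here is a minimal sketch using Python's zlib as a stand-in for whatever compressor the ROM image actually uses. The header contents and the helper name are invented for the example.

```python
import zlib

def first_bytes(compressed: bytes, n: int) -> bytes:
    """Decompress at most n bytes from the start of a zlib stream."""
    decompressor = zlib.decompressobj()
    # The second argument is max_length: decompression stops once n bytes
    # of output exist, and the rest of the stream is left untouched.
    return decompressor.decompress(compressed, n)

# Hypothetical payload: a short name header followed by a large body.
header = b"NAME:tint\0"
blob = zlib.compress(header + bytes(100000))
peek = first_bytes(blob, len(header))
```

Only the requested prefix is ever produced, so the cost is a small part of what decompressing the whole payload would cost.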
and be able to access individual segments in the ELF without eventually having to uncompress the whole thing.
Why do you want to load individual segments? When would this be useful?
I am also concerned about the extra step when we decompress to memory and then copy into place. How can we get around these issues and still call the result a true ELF file?
Again, that is more a feature of your compression than a feature of the binary format.
Why is an extra copying step harmful, btw? Not because of the few milliseconds that copying costs, even on the lowest-end hardware.
Segher
On Sun, Apr 13, 2008 at 3:09 PM, Segher Boessenkool segher@kernel.crashing.org wrote:
Right. So my base is *not* "ELF is best", but it is "ELF is just fine for this, so why would we need another format, with all infrastructure overhead that requires?". So please start by explaining what is bad, or even just sub-optimal, in using ELF here.
Let's take it one thing at a time.
First question.
What is the means by which I can create an ELF file such that the segment data is compressed, but the segment headers are not, and I can know which compression algorithm was used, so that I can do a streaming decompress of a segment into memory, a.k.a zero-copy?
ron
Right. So my base is *not* "ELF is best", but it is "ELF is just fine for this, so why would we need another format, with all infrastructure overhead that requires?". So please start by explaining what is bad, or even just sub-optimal, in using ELF here.
Let's take it one thing at a time.
Ok, that will help :-)
First question.
What is the means by which I can create an ELF file such that the segment data is compressed, but the segment headers are not, and I can know which compression algorithm was used, so that I can do a streaming decompress of a segment into memory, a.k.a zero-copy?
Let me start by firing back another question: why do you want this?
If it turns out you really *do* want this, there are always the PF_MASKOS bits in the p_flags field per segment (an 8-bit field that any OS can use any way it wants -- and coreboot is an OS as far as ELF is concerned).
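A sketch of what using those bits might look like: the OS-specific part of p_flags is selected by the mask PF_MASKOS (0x0ff00000 in the ELF specification), so a compression ID can sit there without disturbing the read/write/execute bits. The ID values below are purely hypothetical; nothing in ELF or coreboot assigns them.

```python
# Standard ELF segment permission flags and the OS-specific mask.
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4
PF_MASKOS = 0x0FF00000
OS_SHIFT = 20  # position of the lowest bit of PF_MASKOS

# Hypothetical compression IDs, made up for this example.
COMP_NONE, COMP_LZMA, COMP_NRV2B = 0, 1, 2

def set_compression(p_flags: int, algo: int) -> int:
    """Store an 8-bit compression ID in the OS-specific bits of p_flags."""
    if not 0 <= algo <= 0xFF:
        raise ValueError("compression ID must fit in 8 bits")
    return (p_flags & ~PF_MASKOS) | (algo << OS_SHIFT)

def get_compression(p_flags: int) -> int:
    """Read the compression ID back out of p_flags."""
    return (p_flags & PF_MASKOS) >> OS_SHIFT

flags = set_compression(PF_R | PF_X, COMP_LZMA)
```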
Segher
On Mon, Apr 14, 2008 at 5:48 PM, Segher Boessenkool segher@kernel.crashing.org wrote:
What is the means by which I can create an ELF file such that the segment data is compressed, but the segment headers are not, and I can know which compression algorithm was used, so that I can do a streaming decompress of a segment into memory, a.k.a zero-copy?
Let me start by firing back another question: why do you want this?
Why would I not? I want to have descriptors, uncompressed, for data, compressed, so that I can uncompress the data in a streaming mode to the right place in memory.
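The arrangement described here, an uncompressed descriptor in front of compressed data that is streamed to its load address, might look roughly like this. The descriptor fields are invented for the example and zlib stands in for the real compressor. Python still produces a small chunk-sized buffer per step, whereas a C loader would point the decompressor's output at the destination directly.

```python
import zlib
from dataclasses import dataclass

@dataclass
class Descriptor:
    """Hypothetical uncompressed descriptor stored ahead of the segment data."""
    load_addr: int   # where the decompressed bytes belong in memory
    algorithm: str   # which compressor was used
    comp_len: int    # length of the compressed data in the ROM

def stream_segment(memory: bytearray, desc: Descriptor, rom: bytes,
                   rom_offset: int, chunk: int = 64) -> int:
    """Decompress a segment chunk by chunk straight to its load address."""
    if desc.algorithm != "zlib":
        raise ValueError(f"unsupported algorithm: {desc.algorithm}")
    decompressor = zlib.decompressobj()
    dest = desc.load_addr
    end = rom_offset + desc.comp_len
    for pos in range(rom_offset, end, chunk):
        out = decompressor.decompress(rom[pos:min(pos + chunk, end)])
        memory[dest:dest + len(out)] = out   # write at the final location
        dest += len(out)
    out = decompressor.flush()               # any bytes still buffered
    memory[dest:dest + len(out)] = out
    dest += len(out)
    return dest - desc.load_addr             # number of bytes written

data = bytes(range(256)) * 8
comp = zlib.compress(data)
rom = b"\xff" * 16 + comp
memory = bytearray(4096)
written = stream_segment(memory, Descriptor(1024, "zlib", len(comp)), rom, 16)
```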
ron