I'd like to use this thread to collect TODO items for LAR.
My current list is rather short, but that does not mean it is easy to complete these tasks.
- "Unparse" parsed ELF so you get an ELF back when unpacking a LAR containing parsed ELF file. Reason: See mails on the list for the last few days. - "List holes" mode. Reason: To be able to "zerofill" the LAR, we must know where to fill. - Ability to specify where files should end up in the LAR (beginning of free space, end of free space, fixed address). Reason: Some machines can only access part of the flash directly, so shadowing/copying is needed for the inaccessible parts. That can only be done after RAM init, thereby requiring normal/initram and fallback/initram to end up in the directly accessible window.
Regards, Carl-Daniel
A slightly simpler idea. When we unpack a LAR, we could unpack to a simpler format than ELF.
a.out maybe.
ron
On 16.02.2008 17:25, ron minnich wrote:
A slightly simpler idea. When we unpack a LAR, we could unpack to a simpler format than ELF.
a.out maybe.
If you want to add an a.out parser to LAR, that would be OK. My only concern is that unpacking and subsequent repacking/parsing loses no information.
Regards, Carl-Daniel
On Feb 16, 2008 8:38 AM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
If you want to add an a.out parser to LAR, that would be OK. My only concern is that unpacking and subsequent repacking/parsing loses no information.
Best bet would be a simple ELF header generator, I've done this in Plan 9, it's a whopping 15 lines of code.
ron
On 16.02.2008 17:50, ron minnich wrote:
On Feb 16, 2008 8:38 AM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
If you want to add an a.out parser to LAR, that would be OK. My only concern is that unpacking and subsequent repacking/parsing loses no information.
Best bet would be a simple ELF header generator, I've done this in Plan 9, it's a whopping 15 lines of code.
Cool. Since you wrote the code, importing it into lar should give us no license problems. Can you post a patch? It should be reasonably easy tofind out which segment is bss (the one with compalgo "zeroes"), but I'm not so sure about data and code. Do we want any convention (code is always segment0, data is always segment1)?
Regards, Carl-Daniel
On Feb 16, 2008 10:08 AM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
It should be reasonably easy tofind out which segment is bss (the one with compalgo "zeroes"), but I'm not so sure about data and code. Do we want any convention (code is always segment0, data is always segment1)?
and here is Stefan's point. Converting from ELF to LAR is not information-preserving. LAR headers have less information. The pre-parsing loses information that is in the ELF.
LAR headers do not save information as to code vs. data, etc.
Our choices are: 1. continue with ELF support (though I do not intend to ever use it :-) 2. extend the LAR headers to incorporate more ELF info 3. when we pre-parse ELF files, save the ELF program headers when we create the LAR file (except .bss is a section header, ELF is really not a very good design in many ways) 4. Make the LAR headers ELF headers 5. Just make LAR itself be an ELF file. I mean, they're very similar anyway.
So, I think we back out Myle's patch, until we resolve this. Stefan has a point. I still dislike ELF parsing in coreboot. The v2 code was not actually able to handle all ELF files; it had obscure failure modes. The v3 code is even more restrictive.
Unless we want to greatly grow the ELF code in coreboot, some preprocessing is inevitable.
So, let's have the discussion!
ron
On 16.02.2008 19:22, ron minnich wrote:
On Feb 16, 2008 10:08 AM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
It should be reasonably easy tofind out which segment is bss (the one with compalgo "zeroes"), but I'm not so sure about data and code. Do we want any convention (code is always segment0, data is always segment1)?
and here is Stefan's point. Converting from ELF to LAR is not information-preserving. LAR headers have less information. The pre-parsing loses information that is in the ELF.
LAR headers do not save information as to code vs. data, etc.
Our choices are:
- continue with ELF support (though I do not intend to ever use it :-)
But the ELF support outside the LAR utility should be protected by ifdef to make it easy to compile it out.
- extend the LAR headers to incorporate more ELF info
I'd like a list of essential pieces of information we lose when parsing ELF into LAR.
- when we pre-parse ELF files, save the ELF program headers when we
create the LAR file (except .bss is a section header, ELF is really not a very good design in many ways)
I doubt the ELF header alone is good enough to convert between the two formats.
- Make the LAR headers ELF headers
You're not serious, right?
- Just make LAR itself be an ELF file. I mean, they're very similar anyway.
Nooooooooooooooooooooooooooooooooooooo!
So, I think we back out Myle's patch, until we resolve this. Stefan
The patch has never been committed. No backing out needed. Repeat: The patch has never been committed.
has a point. I still dislike ELF parsing in coreboot. The v2 code was not actually able to handle all ELF files; it had obscure failure modes. The v3 code is even more restrictive.
Unless we want to greatly grow the ELF code in coreboot, some preprocessing is inevitable.
What do we need to reconstruct an ELF file which is equivalent to the original as far as an ELF loader is concerned? - Entry point. No problem, is saved in LAR. - Architecture. Right now, this is constant. - Sections. .bss is easy. It's the entry with compression algo "zeroes" .data is solvable. We have to either make the "data" section "segment1" by convention or introduce another marker. .text and .rodata are merged by LAR. Reversing that is neither easy nor does it make sense. Both are readonly and unless you want .rodata marked noexec, stuffing them together into .text is a very workable solution.
Did I miss anything essential?
Regards, Carl-Daniel
On Feb 16, 2008 1:12 PM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
But the ELF support outside the LAR utility should be protected by ifdef to make it easy to compile it out.
yes. I'd like to be able to compile it out.
- extend the LAR headers to incorporate more ELF info
I'd like a list of essential pieces of information we lose when parsing ELF into LAR.
me too. But I can tell you two: section type and architecture.
- when we pre-parse ELF files, save the ELF program headers when we
create the LAR file (except .bss is a section header, ELF is really not a very good design in many ways)
I doubt the ELF header alone is good enough to convert between the two formats.
probably not.
- Make the LAR headers ELF headers
You're not serious, right?
I'm mainly trying to point out the extrema.
- Just make LAR itself be an ELF file. I mean, they're very similar anyway.
Nooooooooooooooooooooooooooooooooooooo!
Good. I am glad to see that reaction :-)
The patch has never been committed. No backing out needed. Repeat: The patch has never been committed.
I keep forgetting that.
What do we need to reconstruct an ELF file which is equivalent to the original as far as an ELF loader is concerned?
- Entry point. No problem, is saved in LAR.
- Architecture. Right now, this is constant.
- Sections. .bss is easy. It's the entry with compression algo "zeroes" .data is solvable. We have to either make the "data" section
"segment1" by convention or introduce another marker. .text and .rodata are merged by LAR. Reversing that is neither easy nor does it make sense. Both are readonly and unless you want .rodata marked noexec, stuffing them together into .text is a very workable solution.
Did I miss anything essential?
I'll leave it to the list :-)
ron
* Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net [080216 22:12]:
LAR headers do not save information as to code vs. data, etc.
Our choices are:
- continue with ELF support (though I do not intend to ever use it :-)
But the ELF support outside the LAR utility should be protected by ifdef to make it easy to compile it out.
Yes. Let's go both ways for now. Let's make it possible to keep ELF in there for the "safe case" and let's make it possible to drop ELF completely as a path for the near future where we do all preparsing at build time.
- extend the LAR headers to incorporate more ELF info
This is a wrong direction, in my opinion. ELF is ELF and LAR is LAR. The direction should rather be to drop some parts out of LAR again. LAR should be an _archiver_ and not try to do much extra magic.
Maybe we want extra magic done, in an extra tool, or in an extra switch of the lar utility. But not in the lar format. As Carl-Daniel rightly said, lar has become a very fragile part, and I think we need to thin it out again.
I'd like a list of essential pieces of information we lose when parsing ELF into LAR.
These two are relevant for being able to use the file, and they are lost while unpacking:
u64 entry; u64 loadaddress;
- when we pre-parse ELF files, save the ELF program headers when we
create the LAR file (except .bss is a section header, ELF is really not a very good design in many ways)
I don't think we should necessarily stick to ELF as a format here. Nor to LAR as a format.
Lar is an archive format, not an executable format.
What we need is a way to store an executable in a LAR so it can be 1. executed in place; or 2. unpacked to RAM in a streaming way, ie. no seeking.; and 3. be able to compress it.
- Just make LAR itself be an ELF file. I mean, they're very similar anyway.
Nooooooooooooooooooooooooooooooooooooo!
Well, to put it into words, this would sacrifice the advantages we designed into the LAR format. This is an interesting idea, but confusing after all hasty aim at getting rid of all ELF.
What do we need to reconstruct an ELF file which is equivalent to the original as far as an ELF loader is concerned?
I would even ask an easier question. What do we need to make sure a payload in a lar file can be unpacked and packed again without loss of information?
- Entry point. No problem, is saved in LAR.
Really doesnt belong in lar. Needs to be dropped.
- Architecture. Right now, this is constant.
A lar has an architecture at the moment it gets a bootblock. Until then it can be considered "architecture agnostic"
- Sections. .bss is easy. It's the entry with compression algo "zeroes" .data is solvable. We have to either make the "data" section
"segment1" by convention or introduce another marker. .text and .rodata are merged by LAR. Reversing that is neither easy nor does it make sense. Both are readonly and unless you want .rodata marked noexec, stuffing them together into .text is a very workable solution.
So, since we can not do a good job recreating an ELF, we should clean up our design and do the following:
Drop entry and load address out of the LAR format. Instead, we define a "coreboot native blob format" which looks like the following and pack it _into_ the lar.
+--------------------+ | MAGIC | +--------------------+ | Entry | +--------------------+ | Number of Segments | | (consecutive) | +--------------------+ | SEGMENT 1/n | | +----------------+ | | | Load Address | | | +----------------+ | | | Len (needed?) | | | +----------------+ | | | Compress. Type | | | +----------------+ | | | Data | | | | | | | +~~~~~~~~~~~~~~~~+ | +--------------------+ | .... | +--------------------+ | SEGMENT n/n | | +----------------+ | | | Load Address | | | +----------------+ | | | Len (needed?) | | | +----------------+ | | | Compress. Type | | | +----------------+ | | | Data | | | | | | | +~~~~~~~~~~~~~~~~+ | +--------------------+
Such a file could be unpacked and repacked, and a single ELF file always stays a single file, and is not falsely split into several files.
This has several advantages: - we don't have an entry point field in lar that is bogus in most lar entries. Instead we have one per "blob", which is logically the right thing to do
- The format is well defined and non-lossy.
- LAR as a format stays blob-agnostic. Enhancements to the way a blob is saved in a lar does not break the lar file format, but at most the blob file format.
- It's more solid
- It takes complexity out of the LAR complex. Which is good!
- It's more modular.
I really think that not parsing ELF during coreboot run-time is the way to go. But we should not misuse LAR as an archive format for things and features that it was not made for. We buy us (hidden) complexity otherwise.
Dropping ELF is a a new idea. And as such it deserves a new concept that is well-thought, consistent and simple in its own.
Regards, Stefan
Am Dienstag, den 19.02.2008, 16:07 +0100 schrieb Stefan Reinauer:
Drop entry and load address out of the LAR format. Instead, we define a "coreboot native blob format" which looks like the following and pack it _into_ the lar.
+--------------------+ | MAGIC | +--------------------+ | Entry | +--------------------+ | Number of Segments | | (consecutive) | +--------------------+ | SEGMENT 1/n | | +----------------+ | | | Load Address | | | +----------------+ | | | Len (needed?) | | | +----------------+ | | | Compress. Type | | | +----------------+ | | | Data | | | | | | | +~~~~~~~~~~~~~~~~+ | +--------------------+ | .... | +--------------------+ | SEGMENT n/n | | +----------------+ | | | Load Address | | | +----------------+ | | | Len (needed?) | | | +----------------+ | | | Compress. Type | | | +----------------+ | | | Data | | | | | | | +~~~~~~~~~~~~~~~~+ | +--------------------+
This gives essentially two layers of archive format, lar and executable-file.
How about doing segments per file (like right now), and simply prepending their data "section" with the load address (as they already get special treatment anyway).
The entry point could end up in a file "entry", 4 bytes large, containing nothing but the address to jump to.
I think, that approach would be even simpler, while still keeping the advantages of your proposal.
Regards, Patrick Georgi
Am Dienstag, den 19.02.2008, 16:48 +0100 schrieb Patrick Georgi:
How about doing segments per file (like right now), and simply prepending their data "section" with the load address (as they already get special treatment anyway).
Follow up to this issue: the first bytes would also be compressed (otherwise we have a special case again).
Currently the decoding routines only decode a whole file, but at least lzma is capable of stopping after a given number of output bytes. The only remaining issue is that the actual starting position for the decode is then 4 bytes below the intended loading position.
Just decoding the first 4 bytes into some scratch space probably won't work (easily) because the lz-family uses decompressed data as dictionary space..
This could be solved by ordering the segments by the lar tool to be highest-start-address-first, but this detail would have to be documented in a way that no-one will ever stumble over "weird 4 byte memory corruption" without quickly figuring out what's going on.
Comments?
Patrick Georgi
On 20.02.2008 20:17, Patrick Georgi wrote:
Am Dienstag, den 19.02.2008, 16:48 +0100 schrieb Patrick Georgi:
How about doing segments per file (like right now), and simply prepending their data "section" with the load address (as they already get special treatment anyway).
Follow up to this issue: the first bytes would also be compressed (otherwise we have a special case again).
Currently the decoding routines only decode a whole file, but at least lzma is capable of stopping after a given number of output bytes. The only remaining issue is that the actual starting position for the decode is then 4 bytes below the intended loading position.
Sorry, that's just horrible(TM). I'd expect that as a emergency band-aid hack if the header format was crap, but the header format makes such runtime hacks unnecessary.
Just decoding the first 4 bytes into some scratch space probably won't work (easily) because the lz-family uses decompressed data as dictionary space..
This could be solved by ordering the segments by the lar tool to be highest-start-address-first, but this detail would have to be documented in a way that no-one will ever stumble over "weird 4 byte memory corruption" without quickly figuring out what's going on.
This guarantees we will never be unemployed because people will hire us to understand random crashes in the code. Or they will move to another codebase. Being too clever is bad. Having hidden side effects is so badbadbad that I lack the words to describe it.
Please remember the problem we're trying to solve: Preserving entry point and load address over an unpack/repack cycle. That is a pure LAR userspace archiver problem and modifying any firmware code is not a path to a viable solution.
Regards, Carl-Daniel
On 19.02.2008 16:48, Patrick Georgi wrote:
Am Dienstag, den 19.02.2008, 16:07 +0100 schrieb Stefan Reinauer:
Drop entry and load address out of the LAR format. Instead, we define a "coreboot native blob format" which looks like the following and pack it _into_ the lar.
+--------------------+ | MAGIC | +--------------------+ | Entry | +--------------------+ | Number of Segments | | (consecutive) | +--------------------+ | SEGMENT 1/n | | +----------------+ | | | Load Address | | | +----------------+ | | | Len (needed?) | | | +----------------+ | | | Compress. Type | | | +----------------+ | | | Data | | | | | | | +~~~~~~~~~~~~~~~~+ | +--------------------+ | .... | +--------------------+ | SEGMENT n/n | | +----------------+ | | | Load Address | | | +----------------+ | | | Len (needed?) | | | +----------------+ | | | Compress. Type | | | +----------------+ | | | Data | | | | | | | +~~~~~~~~~~~~~~~~+ | +--------------------+
This gives essentially two layers of archive format, lar and executable-file.
I don't like the layered ide very much.
How about doing segments per file (like right now), and simply prepending their data "section" with the load address (as they already get special treatment anyway).
What does that buy us except a header format change which essentially moved the loadaddr header field to the end of the header?
The entry point could end up in a file "entry", 4 bytes large, containing nothing but the address to jump to.
Bad idea, sorry. That increases boot time by one LAR lookup which can be very costly in the worst case.
Regards, Carl-Daniel
On 19.02.2008 16:07, Stefan Reinauer wrote:
- Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net [080216 22:12]:
LAR headers do not save information as to code vs. data, etc.
Our choices are:
- continue with ELF support (though I do not intend to ever use it :-)
But the ELF support outside the LAR utility should be protected by ifdef to make it easy to compile it out.
Yes. Let's go both ways for now. Let's make it possible to keep ELF in there for the "safe case" and let's make it possible to drop ELF completely as a path for the near future where we do all preparsing at build time.
Myles? Could you prepare a patch which #ifdefs the code instead of removing it?
- extend the LAR headers to incorporate more ELF info
This is a wrong direction, in my opinion. ELF is ELF and LAR is LAR. The direction should rather be to drop some parts out of LAR again. LAR should be an _archiver_ and not try to do much extra magic.
Maybe we want extra magic done, in an extra tool, or in an extra switch of the lar utility. But not in the lar format. As Carl-Daniel rightly said, lar has become a very fragile part, and I think we need to thin it out again.
About the thinning out: Maybe... I'd prefer to keep the current LAR parsing code as is because the design was good enough to solve unanticipated problems.
I'd like a list of essential pieces of information we lose when parsing ELF into LAR.
These two are relevant for being able to use the file, and they are lost while unpacking:
u64 entry; u64 loadaddress;
That's an unpacking problem, not a packing problem and does not require any LAR header format changes. But I agree entry and loadaddress should be preserved on unpacking.
- when we pre-parse ELF files, save the ELF program headers when we
create the LAR file (except .bss is a section header, ELF is really not a very good design in many ways)
I don't think we should necessarily stick to ELF as a format here. Nor to LAR as a format.
Lar is an archive format, not an executable format.
What we need is a way to store an executable in a LAR so it can be
- executed in place; or
- unpacked to RAM in a streaming way, ie. no seeking.; and
- be able to compress it.
Fully agreed. The streaming unpacking is also a thing which requires a lot more scratch space for LZMA. That's not a LAR problem in itself, but anyone considering streaming unpacking needs to keep that in mind.
- Just make LAR itself be an ELF file. I mean, they're very similar anyway.
Nooooooooooooooooooooooooooooooooooooo!
Well, to put it into words, this would sacrifice the advantages we designed into the LAR format. This is an interesting idea, but confusing after all hasty aim at getting rid of all ELF.
Indeed.
What do we need to reconstruct an ELF file which is equivalent to the original as far as an ELF loader is concerned?
I would even ask an easier question. What do we need to make sure a payload in a lar file can be unpacked and packed again without loss of information?
That would be good enough for starters
- Entry point. No problem, is saved in LAR.
Really doesnt belong in lar. Needs to be dropped.
Where should we store it instead? Inside the compressed member? That would add another layer of indirection.
- Architecture. Right now, this is constant.
A lar has an architecture at the moment it gets a bootblock. Until then it can be considered "architecture agnostic"
Great.
- Sections. .bss is easy. It's the entry with compression algo "zeroes" .data is solvable. We have to either make the "data" section
"segment1" by convention or introduce another marker. .text and .rodata are merged by LAR. Reversing that is neither easy nor does it make sense. Both are readonly and unless you want .rodata marked noexec, stuffing them together into .text is a very workable solution.
So, since we can not do a good job recreating an ELF, we should clean up our design and do the following:
Drop entry and load address out of the LAR format. Instead, we define a "coreboot native blob format" which looks like the following and pack it _into_ the lar.
+--------------------+ | MAGIC | +--------------------+ | Entry | +--------------------+ | Number of Segments | | (consecutive) | +--------------------+ | SEGMENT 1/n | | +----------------+ | | | Load Address | | | +----------------+ | | | Len (needed?) | | | +----------------+ | | | Compress. Type | | | +----------------+ | | | Data | | | | | | | +~~~~~~~~~~~~~~~~+ | +--------------------+ | .... | +--------------------+ | SEGMENT n/n | | +----------------+ | | | Load Address | | | +----------------+ | | | Len (needed?) | | | +----------------+ | | | Compress. Type | | | +----------------+ | | | Data | | | | | | | +~~~~~~~~~~~~~~~~+ | +--------------------+
It looks nice at the first glance, but I still do not really like it. Call it a bad gut feeling.
Such a file could be unpacked and repacked, and a single ELF file always stays a single file, and is not falsely split into several files.
This has several advantages:
we don't have an entry point field in lar that is bogus in most lar entries. Instead we have one per "blob", which is logically the right thing to do
The format is well defined and non-lossy.
LAR as a format stays blob-agnostic. Enhancements to the way a blob is saved in a lar does not break the lar file format, but at most the blob file format.
It's more solid
It takes complexity out of the LAR complex. Which is good!
It only moves complexity and the interdependencies are really nasty.
- It's more modular.
I really think that not parsing ELF during coreboot run-time is the way to go. But we should not misuse LAR as an archive format for things and features that it was not made for. We buy us (hidden) complexity otherwise.
Dropping ELF is a a new idea. And as such it deserves a new concept that is well-thought, consistent and simple in its own.
Agreed. I still would like to discuss the new proposed lar format further, though. Especially the aspect of LAR header versioning because the new header would be in no way compatible to the old one.
Regards, Carl-Daniel
Carl-Daniel Hailfinger wrote:
These two are relevant for being able to use the file, and they are lost while unpacking:
u64 entry; u64 loadaddress;
That's an unpacking problem, not a packing problem and does not require any LAR header format changes. But I agree entry and loadaddress should be preserved on unpacking.
It's a design problem. We are trying to make an executable file format out of an archiver file format. It was never designed to be that. And there was no sane reason for splitting an ELF file up into many segments in the lar to begin with.
All I do is trying to back out bad design decisions.
So, since we can not do a good job recreating an ELF, we should clean up our design and do the following:
Drop entry and load address out of the LAR format. Instead, we define a "coreboot native blob format" which looks like the following and pack it _into_ the lar.
+--------------------+ | MAGIC | +--------------------+ | Entry | +--------------------+ | Number of Segments | | (consecutive) | +--------------------+ | SEGMENT 1/n | | +----------------+ | | | Load Address | | | +----------------+ | | | Len (needed?) | | | +----------------+ | | | Compress. Type | | | +----------------+ | | | Data | | | | | | | +~~~~~~~~~~~~~~~~+ | +--------------------+ | .... | +--------------------+ | SEGMENT n/n | | +----------------+ | | | Load Address | | | +----------------+ | | | Len (needed?) | | | +----------------+ | | | Compress. Type | | | +----------------+ | | | Data | | | | | | | +~~~~~~~~~~~~~~~~+ | +--------------------+
It looks nice at the first glance, but I still do not really like it. Call it a bad gut feeling.
Ok, so lets do it like this. It is the cleaner design. Reasons should win against bowel.
Such a file could be unpacked and repacked, and a single ELF file always stays a single file, and is not falsely split into several files.
This has several advantages:
we don't have an entry point field in lar that is bogus in most lar entries. Instead we have one per "blob", which is logically the right thing to do
The format is well defined and non-lossy.
LAR as a format stays blob-agnostic. Enhancements to the way a blob is saved in a lar does not break the lar file format, but at most the blob file format.
It's more solid
It takes complexity out of the LAR complex. Which is good!
It only moves complexity and the interdependencies are really nasty.
There are no interdependencies. And it moves the complexity where it belongs: Locally to the problem domain, not globally to the central data structure.
- It's more modular.
I really think that not parsing ELF during coreboot run-time is the way to go. But we should not misuse LAR as an archive format for things and features that it was not made for. We buy us (hidden) complexity otherwise.
Dropping ELF is a a new idea. And as such it deserves a new concept that is well-thought, consistent and simple in its own.
Agreed. I still would like to discuss the new proposed lar format further, though. Especially the aspect of LAR header versioning because the new header would be in no way compatible to the old one.
Ok, any reasons besides bad guts?
Versioning is another issue, and yes, now that we finally have real actual hardware running, we should think about versioned headers.
Keep in mind that in the current lar format versioned headers are completely bogus because you can not unpack a lar anyways.
Stefan
On 21.02.2008 00:06, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
These two are relevant for being able to use the file, and they are lost while unpacking:
u64 entry; u64 loadaddress;
That's an unpacking problem, not a packing problem and does not require any LAR header format changes. But I agree entry and loadaddress should be preserved on unpacking.
It's a design problem. We are trying to make an executable file format out of an archiver file format. It was never designed to be that.
We created an archive format designed to store executables. When did we say that LAR was never designed to store executable code?
And there was no sane reason for splitting an ELF file up into many segments in the lar to begin with.
Splitting an ELF file into many segments which are compressed separately makes sense because it removes the need of a decompression scratch space which has been a problem in v2.
All I do is trying to back out bad design decisions.
So, since we can not do a good job recreating an ELF, we should clean up our design and do the following:
Drop entry and load address out of the LAR format. Instead, we define a "coreboot native blob format" which looks like the following and pack it _into_ the lar.
+--------------------+ | MAGIC | +--------------------+ | Entry | +--------------------+ | Number of Segments | | (consecutive) | +--------------------+ | SEGMENT 1/n | | +----------------+ | | | Load Address | | | +----------------+ | | | Len (needed?) | | | +----------------+ | | | Compress. Type | | | +----------------+ | | | Data | | | | | | | +~~~~~~~~~~~~~~~~+ | +--------------------+ | .... | +--------------------+ | SEGMENT n/n | | +----------------+ | | | Load Address | | | +----------------+ | | | Len (needed?) | | | +----------------+ | | | Compress. Type | | | +----------------+ | | | Data | | | | | | | +~~~~~~~~~~~~~~~~+ | +--------------------+
It looks nice at the first glance, but I still do not really like it. Call it a bad gut feeling.
Ok, so lets do it like this. It is the cleaner design. Reasons should win against bowel.
The biggest reason not to do this is that it gives us zero improvements on the problem solving side. Remember the problem we were trying to solve? Reimporting that new sub-member type still is problematic.
Such a file could be unpacked and repacked, and a single ELF file always stays a single file, and is not falsely split into several files.
This has several advantages:
we don't have an entry point field in lar that is bogus in most lar entries. Instead we have one per "blob", which is logically the right thing to do
The format is well defined and non-lossy.
LAR as a format stays blob-agnostic. Enhancements to the way a
blob is saved in a lar does not break the lar file format, but at most the blob file format.
It's more solid
It takes complexity out of the LAR complex. Which is good!
It only moves complexity and the interdependencies are really nasty.
There are no interdependencies. And it moves the complexity where it belongs: Locally to the problem domain, not globally to the central data structure.
If you really want to have the complexity local to the problem domain, you need to move the entry point away from the top-level header because not all files added to the LAR have an entry point (think data files like the option table).
- It's more modular.
I really think that not parsing ELF during coreboot run-time is the way to go. But we should not misuse LAR as an archive format for things and features that it was not made for. We buy us (hidden) complexity otherwise.
Dropping ELF is a a new idea. And as such it deserves a new concept that is well-thought, consistent and simple in its own.
Agreed. I still would like to discuss the new proposed lar format further, though. Especially the aspect of LAR header versioning because the new header would be in no way compatible to the old one.
Ok, any reasons besides bad guts?
See above.
Versioning is another issue, and yes, now that we finally have real actual hardware running, we should think about versioned headers.
I'd say we first get a solid versioning infrastructure before we change the LAR format for other reasons.
Keep in mind that in the current lar format versioned headers are completely bogus because you can not unpack a lar anyways.
Due to missing compat code?
Regards, Carl-Daniel
-----Original Message----- From: Carl-Daniel Hailfinger [mailto:c-d.hailfinger.devel.2006@gmx.net] Sent: Wednesday, February 20, 2008 3:53 PM To: Stefan Reinauer Cc: ron minnich; Coreboot; Myles Watson Subject: Re: [coreboot] LAR TODO
On 19.02.2008 16:07, Stefan Reinauer wrote:
- Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net [080216
22:12]:
LAR headers do not save information as to code vs. data, etc.
Our choices are:
- continue with ELF support (though I do not intend to ever use it :-
)
But the ELF support outside the LAR utility should be protected by
ifdef
to make it easy to compile it out.
Yes. Let's go both ways for now. Let's make it possible to keep ELF in there for the "safe case" and let's make it possible to drop ELF completely as a path for the near future where we do all preparsing at build time.
Myles? Could you prepare a patch which #ifdefs the code instead of removing it?
It's non-trivial. I think it will complicate the build process, Kconfig, and code quite a bit. I think the right thing to do is add whatever support to lar is needed, then when everyone feels comfortable, use something like my last patch.
Myles
On 21.02.2008 16:55, Myles Watson wrote:
Carl-Daniel Hailfinger wrote on Wednesday, February 20, 2008 3:53 PM
On 19.02.2008 16:07, Stefan Reinauer wrote:
- Carl-Daniel Hailfinger [20080216 22:12]:
ifdef to make it easy to compile it out.
Yes. Let's go both ways for now. Let's make it possible to keep ELF in there for the "safe case" and let's make it possible to drop ELF completely as a path for the near future where we do all preparsing at build time.
Myles? Could you prepare a patch which #ifdefs the code instead of removing it?
It's non-trivial. I think it will complicate the build process, Kconfig, and code quite a bit. I think the right thing to do is add whatever support to lar is needed, then when everyone feels comfortable, use something like my last patch.
If it is non-trivial, it is still doable. It would reduce bootblock size quite a bit and help us figure out hidden dependencies. You managed to remove ELF support completely from the firmware part of coreboot, so it should be possible to convert that removal to #ifdef.
Regards, Carl-Daniel
Myles? Could you prepare a patch which #ifdefs the code instead of removing it?
This patch is a hopefully less controversial version of a previous patch which removed the ELF loader from coreboot v3. This adds a Kconfig option PAYLOAD_ELF_LOADER which builds the loader into v3. In order to make it a little safer, I changed PAYLOAD_PREPARSE_ELF to PAYLOAD_NO_PREPARSE_ELF and made that option depend on PAYLOAD_ELF_LOADER so that no one adds an unparsed ELF without the loader.
One part that was strange to me was that I first tried adding elfboot.o and archelfboot.o to the beginning of the list of object files. This broke coreboot. It still finished building, but would not boot on QEMU. I was surprised that it broke it, but didn't investigate further.
Myles
Signed-off-by: Myles Watson mylesgw@gmail.com
On 11.03.2008 21:52, Myles Watson wrote:
Myles? Could you prepare a patch which #ifdefs the code instead of removing it?
This patch is a hopefully less controversial version of a previous patch which removed the ELF loader from coreboot v3. This adds a Kconfig option PAYLOAD_ELF_LOADER which builds the loader into v3. In order to make it a little safer, I changed PAYLOAD_PREPARSE_ELF to PAYLOAD_NO_PREPARSE_ELF and made that option depend on PAYLOAD_ELF_LOADER so that no one adds an unparsed ELF without the loader.
One part that was strange to me was that I first tried adding elfboot.o and archelfboot.o to the beginning of the list of object files. This broke coreboot. It still finished building, but would not boot on QEMU. I was surprised that it broke it, but didn't investigate further.
Did you make distclean in between? v3 dependency handling is something between screwed and nonexistent. It does work sometimes, though.
Myles
Signed-off-by: Myles Watson mylesgw@gmail.com
Index: Kconfig
--- Kconfig (revision 638) +++ Kconfig (working copy) @@ -92,6 +92,12 @@
menu "Payload"
+config PAYLOAD_ELF_LOADER
- bool "Include ELF payload loader"
- default n
- help
This option allows an unparsed ELF paylaod to be added and loaded.
choice prompt "Payload type" default PAYLOAD_NONE @@ -125,10 +131,10 @@ help The path and filename of the ELF executable file to use as payload.
-config PAYLOAD_PREPARSE_ELF
- bool "Pre-parse ELF file and convert ELF segments to LAR entries"
- depends PAYLOAD_ELF
- default y
+config PAYLOAD_NO_PREPARSE_ELF
- bool "Add ELF without parsing and converting to LAR entries"
- depends PAYLOAD_ELF && PAYLOAD_ELF_LOADER
- default n help Until now, coreboot has used ELF for the payload. There are many problems with this, not least being the inefficiency -- the ELF has
@@ -142,7 +148,7 @@ flashed the FLASH and rebooted the machine. Boot time is really not the time you want to find out your ELF payload is broken.
With this option, coreboot will direct lar to break each ELF
segment into a LAR entry. ELF will not be used at all. Note that (for now) coreboot is not backward compatible -- if you put an ELF payload in, coreboot can not parse it. We hope to remove ELFWithout this option, coreboot will direct lar to break each ELF
Index: arch/x86/stage1.c
--- arch/x86/stage1.c (revision 638) +++ arch/x86/stage1.c (working copy) @@ -28,10 +28,12 @@ #include <mc146818rtc.h> #include <cpu.h>
+#ifdef CONFIG_PAYLOAD_ELF_LOADER /* ah, well, what a mess! This is a hard code. FIX ME but how?
- By getting rid of ELF ...
*/ #define UNCOMPRESS_AREA (0x400000) +#endif /* CONFIG_PAYLOAD_ELF_LOADER */
/* these prototypes should go into headers */ void uart_init(void); @@ -86,6 +88,8 @@ return; }
+#ifdef CONFIG_PAYLOAD_ELF_LOADER /* until we get rid of elf */ int legacy(struct mem_file *archive, char *name, void *where, struct lb_memory *mem) { @@ -101,6 +105,7 @@ printk(BIOS_ERR, "elfboot_mem returns %d\n", ret); return -1; } +#endif /* CONFIG_PAYLOAD_ELF_LOADER */
/*
- This function is called from assembler code with its argument on the
@@ -110,7 +115,9 @@ { int ret; struct mem_file archive, result; +#ifdef CONFIG_PAYLOAD_ELF_LOADER int elfboot_mem(struct lb_memory *mem, void *where, int size); +#endif /* CONFIG_PAYLOAD_ELF_LOADER */ void *entry;
/* we can't statically init this hack. */ @@ -202,9 +209,11 @@
printk(BIOS_DEBUG, "Stage2 code done.\n");
+#ifdef CONFIG_PAYLOAD_ELF_LOADER ret = find_file(&archive, "normal/payload", &result); if (! ret) legacy(&archive, "normal/payload", (void *)UNCOMPRESS_AREA, mem); +#endif /* CONFIG_PAYLOAD_ELF_LOADER */
entry = load_file_segments(&archive, "normal/payload"); if (entry != (void*)-1) {
OK so far.
Index: arch/x86/Makefile
--- arch/x86/Makefile (revision 638) +++ arch/x86/Makefile (working copy) @@ -115,9 +115,20 @@ # initram module and the various stages and payload files. #
-STAGE0_LIB_OBJ = uart8250.o mem.o elfboot.o lar.o delay.o vtxprintf.o \ +STAGE0_LIB_OBJ = uart8250.o mem.o +STAGE0_ARCH_X86_OBJ = stage1.o serial.o
+ifeq ($(CONFIG_PAYLOAD_ELF_LOADER),y) +STAGE0_LIB_OBJ += elfboot.o +STAGE0_ARCH_X86_OBJ += archelfboot.o +else +STAGE0_LIB_OBJ += +STAGE0_ARCH_X86_OBJ += +endif
+STAGE0_LIB_OBJ += lar.o delay.o vtxprintf.o \ vsprintf.o console.o string.o $(DECOMPRESSORS) -STAGE0_ARCH_X86_OBJ = stage1.o serial.o archelfboot.o speaker.o \ +STAGE0_ARCH_X86_OBJ += speaker.o \ udelay_io.o mc146818rtc.o post_code.o
ifeq ($(CONFIG_CPU_I586),y)
Sorry, but the chunk above is a real mess. Can't you move the conditional chunk to the end and avoid the else path completely?
@@ -132,10 +143,10 @@ endif endif
-ifeq ($(CONFIG_PAYLOAD_PREPARSE_ELF), y) +ifeq ($(CONFIG_PAYLOAD_NO_PREPARSE_ELF), y)
- PARSEELF =
+else PARSEELF = -e -else
- PARSEELF =
endif
STAGE0_OBJ := $(patsubst %,$(obj)/lib/%,$(STAGE0_LIB_OBJ)) \
The patch would get my Ack except for the Makefile chunk I complained about. Please rework that.
Regards, Carl-Daniel
Myles? Could you prepare a patch which #ifdefs the code instead of removing it?
One part that was strange to me was that I first tried adding elfboot.o
and
archelfboot.o to the beginning of the list of object files. This broke coreboot. It still finished building, but would not boot on QEMU. I
was
surprised that it broke it, but didn't investigate further.
Did you make distclean in between? v3 dependency handling is something between screwed and nonexistent. It does work sometimes, though.
Yes, I did. It didn't work.
Sorry, but the chunk above is a real mess. Can't you move the conditional chunk to the end and avoid the else path completely?
Yep, this is the ugly place I was talking about. I had tried to add them to the beginning of the list, and it wouldn't build. I tried at the end this time and it worked. I still think it's a little strange that the order mattered.
The patch would get my Ack except for the Makefile chunk I complained about. Please rework that.
See attached.
Thanks,
Myles
Signed-off-by: Myles Watson mylesgw@gmail.com
On 13.03.2008 03:45, Myles Watson wrote:
Myles? Could you prepare a patch which #ifdefs the code instead of removing it?
One part that was strange to me was that I first tried adding elfboot.o
and
archelfboot.o to the beginning of the list of object files. This broke coreboot. It still finished building, but would not boot on QEMU. I
was
surprised that it broke it, but didn't investigate further.
Did you make distclean in between? v3 dependency handling is something between screwed and nonexistent. It does work sometimes, though.
Yes, I did. It didn't work.
Sorry, but the chunk above is a real mess. Can't you move the conditional chunk to the end and avoid the else path completely?
Yep, this is the ugly place I was talking about. I had tried to add them to the beginning of the list, and it wouldn't build. I tried at the end this time and it worked. I still think it's a little strange that the order mattered.
The patch would get my Ack except for the Makefile chunk I complained about. Please rework that.
See attached.
Thanks,
Myles
Signed-off-by: Myles Watson mylesgw@gmail.com
Acked-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
Regards, Carl-Daniel
Signed-off-by: Myles Watson mylesgw@gmail.com
Acked-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
Rev 640.
Thanks, Myles
ron minnich wrote:
A slightly simpler idea. When we unpack a LAR, we could unpack to a simpler format than ELF.
Thinking longer about this, if we detect a directory with all files named "segmentXX" in it, we could indeed just unpack all those into a new (sub)lar. This is similar to the idea of copying lar from one file to the other. Or we join all the segmentXX files back into one single file with entry point, segments etc. Then we could drop that information out of lar, making lar itself less complex.
On Feb 16, 2008 10:21 AM, Stefan Reinauer stepan@coresystems.de wrote:
ron minnich wrote:
A slightly simpler idea. When we unpack a LAR, we could unpack to a simpler format than ELF.
Thinking longer about this, if we detect a directory with all files named "segmentXX" in it, we could indeed just unpack all those into a new (sub)lar. This is similar to the idea of copying lar from one file to the other. Or we join all the segmentXX files back into one single file with entry point, segments etc. Then we could drop that information out of lar, making lar itself less complex.
OK, this can work too. Stefan, you have an interesting viewpoint due to your work with IBM and Cell issues. I want to make sure we meet your needs :-)
I'm gonna walk the dog, you guys fix it whlie I'm gone :-)
I fear I'm adding more noise than information to this conversation.
ron
On 16.02.2008 19:24, ron minnich wrote:
On Feb 16, 2008 10:21 AM, Stefan Reinauer stepan@coresystems.de wrote:
ron minnich wrote:
A slightly simpler idea. When we unpack a LAR, we could unpack to a simpler format than ELF.
Thinking longer about this, if we detect a directory with all files named "segmentXX" in it, we could indeed just unpack all those into a new (sub)lar. This is similar to the idea of copying lar from one file to the other. Or we join all the segmentXX files back into one single file with entry point, segments etc. Then we could drop that information out of lar, making lar itself less complex.
OK, this can work too. Stefan, you have an interesting viewpoint due to your work with IBM and Cell issues. I want to make sure we meet your needs :-)
As long as we know the exact needs, we can compare them to what v3 provides right now and fix the loose ends.
Stefan, please elaborate on your requirements. Do you want the ability to unpack and repack without information loss or should unpacking generate an ELF file which is bootable by other loaders as well?
Regards, Carl-Daniel
On 16.02.2008 19:21, Stefan Reinauer wrote:
ron minnich wrote:
A slightly simpler idea. When we unpack a LAR, we could unpack to a simpler format than ELF.
Thinking longer about this, if we detect a directory with all files named "segmentXX" in it, we could indeed just unpack all those into a new (sub)lar. This is similar to the idea of copying lar from one file to the other. Or we join all the segmentXX files back into one single file with entry point, segments etc. Then we could drop that information out of lar, making lar itself less complex.
The strength of LAR is that no intermediate decompression during runtime is needed. I want to preserve that strength because it saves us from hacks where we have to guess a reasonable are where we can decompress to. That temporary decompression area must not overlap with the areas the segments of the ELF should be loaded to.
I hope the analysis in my other mail in this thread helps us to focus on the details of the problem before we attempt a solution.
Regards, Carl-Daniel