On 19.02.2008 16:07, Stefan Reinauer wrote:
- Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net [080216 22:12]:
LAR headers do not save information as to code vs. data, etc.
Our choices are:
- continue with ELF support (though I do not intend to ever use it :-)
But the ELF support outside the LAR utility should be protected by ifdef to make it easy to compile it out.
Yes. Let's go both ways for now. Let's make it possible to keep ELF in there for the "safe case" and let's make it possible to drop ELF completely as a path for the near future where we do all preparsing at build time.
Myles? Could you prepare a patch which #ifdefs the code instead of removing it?
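A minimal sketch of what guarding the ELF path with #ifdef could look like. The macro CONFIG_ELF_LOADER and the function names are hypothetical, chosen for illustration only; they are not taken from the coreboot tree.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical build switch: remove this define to compile ELF support out. */
#define CONFIG_ELF_LOADER 1

/* ELF images start with the four bytes 0x7f 'E' 'L' 'F'. */
static int is_elf(const unsigned char *buf, size_t len)
{
	return len >= 4 && memcmp(buf, "\177ELF", 4) == 0;
}

/* Returns 1 if this build can load the image, 0 otherwise. */
int can_load(const unsigned char *buf, size_t len)
{
	if (is_elf(buf, len)) {
#ifdef CONFIG_ELF_LOADER
		return 1;	/* "safe case": run-time ELF parsing is compiled in */
#else
		return 0;	/* ELF compiled out: a raw ELF image is rejected */
#endif
	}
	/* Assumption: any other non-empty image was pre-parsed at build time. */
	return len > 0;
}
```

With the define in place both paths are available; without it, only pre-parsed payloads are accepted and the ELF code is not built at all.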
- extend the LAR headers to incorporate more ELF info
This is a wrong direction, in my opinion. ELF is ELF and LAR is LAR. The direction should rather be to drop some parts out of LAR again. LAR should be an _archiver_ and not try to do much extra magic.
Maybe we want extra magic done, in an extra tool, or in an extra switch of the lar utility. But not in the lar format. As Carl-Daniel rightly said, lar has become a very fragile part, and I think we need to thin it out again.
About the thinning out: Maybe... I'd prefer to keep the current LAR parsing code as is because the design was good enough to solve unanticipated problems.
I'd like a list of essential pieces of information we lose when parsing ELF into LAR.
These two are relevant for being able to use the file, and they are lost while unpacking:
u64 entry; u64 loadaddress;
That's an unpacking problem, not a packing problem and does not require any LAR header format changes. But I agree entry and loadaddress should be preserved on unpacking.
- when we pre-parse ELF files, save the ELF program headers when we
create the LAR file (except .bss is a section header, ELF is really not a very good design in many ways)
I don't think we should necessarily stick to ELF as a format here. Nor to LAR as a format.
Lar is an archive format, not an executable format.
What we need is a way to store an executable in a LAR so it can be
- executed in place; or
- unpacked to RAM in a streaming way, i.e. no seeking; and
- be able to compress it.
Fully agreed. The streaming unpacking is also a thing which requires a lot more scratch space for LZMA. That's not a LAR problem in itself, but anyone considering streaming unpacking needs to keep that in mind.
- Just make LAR itself be an ELF file. I mean, they're very similar anyway.
Nooooooooooooooooooooooooooooooooooooo!
Well, to put it into words, this would sacrifice the advantages we designed into the LAR format. This is an interesting idea, but confusing after all the haste to get rid of ELF entirely.
Indeed.
What do we need to reconstruct an ELF file which is equivalent to the original as far as an ELF loader is concerned?
I would even ask an easier question. What do we need to make sure a payload in a lar file can be unpacked and packed again without loss of information?
That would be good enough for starters
- Entry point. No problem, is saved in LAR.
Really doesn't belong in lar. Needs to be dropped.
Where should we store it instead? Inside the compressed member? That would add another layer of indirection.
- Architecture. Right now, this is constant.
A lar has an architecture at the moment it gets a bootblock. Until then it can be considered "architecture agnostic"
Great.
- Sections. .bss is easy: it is the entry with compression algo "zeroes". .data is solvable: we have to either make the "data" section "segment1" by convention or introduce another marker. .text and .rodata are merged by LAR. Reversing that is neither easy nor does it make sense. Both are read-only, and unless you want .rodata marked noexec, stuffing them together into .text is a very workable solution.
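To illustrate why .bss is the easy case, here is a small sketch of a loader step that treats "zeroes" as one more compression algorithm. The enum values and the function name are assumptions made for this example; the real LAR algorithm IDs may differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical algorithm IDs, for illustration only. */
enum seg_algo { ALGO_NONE = 0, ALGO_ZEROES = 1 };

/*
 * Place one segment at dest. stored/stored_len is what the archive holds,
 * mem_len is the size the segment occupies once loaded.
 * Returns 0 on success, -1 on an unsupported algorithm or a size mismatch.
 */
int load_segment(uint8_t *dest, size_t mem_len, enum seg_algo algo,
		 const uint8_t *stored, size_t stored_len)
{
	switch (algo) {
	case ALGO_ZEROES:
		/* .bss: nothing is stored, the loader only clears memory. */
		memset(dest, 0, mem_len);
		return 0;
	case ALGO_NONE:
		/* Uncompressed data must match the in-memory size exactly. */
		if (stored_len != mem_len)
			return -1;
		memcpy(dest, stored, stored_len);
		return 0;
	default:
		return -1;
	}
}
```

A "zeroes" entry costs no space in the archive beyond its header, which is why .bss needs no special handling.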
So, since we cannot do a good job recreating an ELF, we should clean up our design and do the following:
Drop entry and load address out of the LAR format. Instead, we define a "coreboot native blob format" which looks like the following and pack it _into_ the lar.
+--------------------+
| MAGIC              |
+--------------------+
| Entry              |
+--------------------+
| Number of Segments |
| (consecutive)      |
+--------------------+
| SEGMENT 1/n        |
| +----------------+ |
| | Load Address   | |
| +----------------+ |
| | Len (needed?)  | |
| +----------------+ |
| | Compress. Type | |
| +----------------+ |
| | Data           | |
| |                | |
| +~~~~~~~~~~~~~~~~+ |
+--------------------+
| ....               |
+--------------------+
| SEGMENT n/n        |
| +----------------+ |
| | Load Address   | |
| +----------------+ |
| | Len (needed?)  | |
| +----------------+ |
| | Compress. Type | |
| +----------------+ |
| | Data           | |
| |                | |
| +~~~~~~~~~~~~~~~~+ |
+--------------------+
It looks nice at first glance, but I still do not really like it. Call it a bad gut feeling.
Such a file could be unpacked and repacked, and a single ELF file always stays a single file, and is not falsely split into several files.
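As a rough sketch of the unpack/repack property, the proposed layout (magic, entry, segment count, then consecutive segments) could be serialized as below. The magic value, the field widths (32-bit Len and compression type, 64-bit entry and load address), the little-endian byte order and all names are assumptions for this example; the proposal does not fix any of them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOB_MAGIC   0x424f4c42u /* hypothetical magic value */
#define BLOB_HDR_LEN 16          /* magic(4) + entry(8) + nsegs(4) */
#define SEG_HDR_LEN  16          /* load address(8) + len(4) + compression(4) */

struct blob_segment {
	uint64_t load_address;
	uint32_t len;         /* number of data bytes stored in the blob */
	uint32_t compression; /* carried opaquely, not interpreted here */
	const uint8_t *data;  /* points into the caller's buffer */
};

/* Store/load the low n bytes of v in little-endian order. */
static void put_le(uint8_t *p, uint64_t v, int n)
{
	for (int i = 0; i < n; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

static uint64_t get_le(const uint8_t *p, int n)
{
	uint64_t v = 0;
	for (int i = 0; i < n; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* Returns the number of bytes written, or 0 if out is too small. */
size_t blob_pack(uint64_t entry, const struct blob_segment *segs,
		 uint32_t nsegs, uint8_t *out, size_t outsize)
{
	size_t off = BLOB_HDR_LEN;

	if (outsize < off)
		return 0;
	put_le(out, BLOB_MAGIC, 4);
	put_le(out + 4, entry, 8);
	put_le(out + 12, nsegs, 4);
	for (uint32_t i = 0; i < nsegs; i++) {
		if (outsize - off < SEG_HDR_LEN ||
		    outsize - off - SEG_HDR_LEN < segs[i].len)
			return 0;
		put_le(out + off, segs[i].load_address, 8);
		put_le(out + off + 8, segs[i].len, 4);
		put_le(out + off + 12, segs[i].compression, 4);
		if (segs[i].len)
			memcpy(out + off + SEG_HDR_LEN, segs[i].data, segs[i].len);
		off += SEG_HDR_LEN + segs[i].len;
	}
	return off;
}

/* Single forward pass, no seeking. Returns segment count, or -1 if malformed. */
int blob_unpack(const uint8_t *in, size_t inlen, uint64_t *entry,
		struct blob_segment *segs, uint32_t maxsegs)
{
	size_t off = BLOB_HDR_LEN;
	uint32_t nsegs;

	if (inlen < BLOB_HDR_LEN || get_le(in, 4) != BLOB_MAGIC)
		return -1;
	*entry = get_le(in + 4, 8);
	nsegs = (uint32_t)get_le(in + 12, 4);
	if (nsegs > maxsegs)
		return -1;
	for (uint32_t i = 0; i < nsegs; i++) {
		if (inlen - off < SEG_HDR_LEN)
			return -1;
		segs[i].load_address = get_le(in + off, 8);
		segs[i].len = (uint32_t)get_le(in + off + 8, 4);
		segs[i].compression = (uint32_t)get_le(in + off + 12, 4);
		off += SEG_HDR_LEN;
		if (inlen - off < segs[i].len)
			return -1;
		segs[i].data = in + off;
		off += segs[i].len;
	}
	return (int)nsegs;
}
```

Because every field is read in order, the parser never has to seek backwards, and feeding the output of blob_unpack() into blob_pack() again yields the same bytes. Whether the Len field is really needed is left open here, as in the diagram; this sketch uses it to find the next segment.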
This has several advantages:
- We don't have an entry point field in lar that is bogus in most lar entries. Instead we have one per "blob", which is logically the right thing to do.
- The format is well defined and non-lossy.
- LAR as a format stays blob-agnostic. Enhancements to the way a blob is saved in a lar do not break the lar file format, but at most the blob file format.
- It's more solid.
- It takes complexity out of the LAR complex. Which is good!
It only moves complexity and the interdependencies are really nasty.
- It's more modular.
I really think that not parsing ELF during coreboot run-time is the way to go. But we should not misuse LAR as an archive format for things and features that it was not made for. We buy ourselves (hidden) complexity otherwise.
Dropping ELF is a new idea. And as such it deserves a new concept that is well thought out, consistent and simple in its own right.
Agreed. I would still like to discuss the newly proposed lar format further, though, especially the aspect of LAR header versioning, because the new header would be in no way compatible with the old one.
Regards, Carl-Daniel