Hi Daniel,
I think this is a good idea. Alas, as this is the first time I hear about it, I lack the context of prior discussions. So bear with me if I ask things that have already been answered.
On 14.11.20 00:52, Daniel Kiper wrote:
The goal is to pass all logs produced by various boot components to the running OS. The OS kernel should expose these logs to the user space and/or process them internally if needed. The content of these logs should be human readable. However, they should also contain the information which allows admins to do e.g. boot time analysis.
The log specification should be as much as possible platform agnostic and self contained. The final version of this spec should be merged into existing specifications, e.g. UEFI, ACPI, Multiboot2, or be a standalone spec, e.g. as a part of OASIS Standards. The former seems better but is not perfect too...
Here is the description (pseudocode) of the structures which will be used to store the log data.
I guess using C syntax for your "pseudocode" isn't a good choice as it can confuse people and might lead to (unportable) implementations that try to copy this definition to C. IMHO, it's much better for a specification to provide exact bit/byte offsets. The protocol tool [P], for instance, can be used to draw things in ASCII. A portable C implementation could then use these offsets for proper (de)serialization without structs that try to mimic the representation in memory.
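To illustrate what I mean by offset-based (de)serialization, here is a minimal sketch. The offsets, the field widths, and the little-endian byte order are assumptions I made up for the example; the proposal does not define any of them, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte offsets, for illustration only. A real spec would
 * state the actual offsets, widths and byte order. */
#define BF_LOG_OFF_VERSION 0u  /* 4 bytes, assumed little-endian */
#define BF_LOG_OFF_FLAGS   4u  /* 4 bytes, assumed little-endian */
#define BF_LOG_MIN_SIZE    8u  /* bytes covered by the two fields above */

/* Read a 32-bit little-endian value byte by byte. This does not depend
 * on alignment, struct padding, or the endianness of the host. */
static uint32_t read_le32(const uint8_t *buf, size_t off)
{
    assert(buf != NULL);
    return (uint32_t)buf[off]
         | ((uint32_t)buf[off + 1] << 8)
         | ((uint32_t)buf[off + 2] << 16)
         | ((uint32_t)buf[off + 3] << 24);
}

/* Extract the version field. Returns 0 on success, -1 if the buffer is
 * too short to contain the assumed fixed fields. */
static int bf_log_parse_version(const uint8_t *buf, size_t len,
                                uint32_t *version)
{
    if (len < BF_LOG_MIN_SIZE)
        return -1;
    *version = read_le32(buf, BF_LOG_OFF_VERSION);
    return 0;
}
```

The point is only that the parser never casts the buffer to a struct, so compiler padding and host byte order cannot change how the log is interpreted.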
The members of struct bf_log:
- version: the firmware and bootloader log format version number, 1 for now,
- producer: the producer/firmware/bootloader/... type; the length allows ASCII UUID storage if somebody needs that functionality,
So, is this always supposed to be a string?
- flags: it can be used to store information about log state, e.g. it was truncated or not (does it make sense to have an information about the number of lost messages?),
Truncation is an interesting point as I see no length for the available space specified. I assume most implementations would want a field for this. Otherwise they would have to track it separately.
In coreboot, we use a ring buffer for messages, as it seems more useful to keep the most recent messages. It is also extended across reboots and suspend/resume cycles. For this, the format would need an additional pointer to where the oldest message resides, i.e. where to start reading messages.
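A rough sketch of why the extra pointer is needed. This is not coreboot's actual code, all names are hypothetical, and it works on single bytes for brevity (a real implementation would more likely drop whole messages when it runs out of space):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ring_log {
    uint8_t *data;   /* backing storage */
    size_t   size;   /* capacity in bytes */
    size_t   oldest; /* index of the oldest byte, i.e. where to start reading */
    size_t   used;   /* number of valid bytes currently stored */
};

/* Append n bytes. Once the buffer is full, each new byte overwrites the
 * oldest one and the read start moves forward. */
static void ring_log_write(struct ring_log *r, const uint8_t *src, size_t n)
{
    assert(r != NULL && r->size > 0);
    for (size_t i = 0; i < n; i++) {
        size_t pos = (r->oldest + r->used) % r->size;
        r->data[pos] = src[i];
        if (r->used < r->size)
            r->used++;
        else
            r->oldest = (r->oldest + 1) % r->size;
    }
}

/* Copy up to cap bytes into dst in chronological order, starting at the
 * oldest byte. Returns the number of bytes copied. */
static size_t ring_log_read(const struct ring_log *r, uint8_t *dst, size_t cap)
{
    size_t n = r->used < cap ? r->used : cap;
    for (size_t i = 0; i < n; i++)
        dst[i] = r->data[(r->oldest + i) % r->size];
    return n;
}
```

Without `oldest` (or something equivalent) stored in the header, a reader cannot tell where the chronological start of the log is once the buffer has wrapped.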
- next_bf_log_addr: address of next bf_log struct; none if zero
Do I understand this correctly that a later-stage boot component would use this to add its own `bf_log` to the chain? E.g. if I start initializing hardware with coreboot and then use GRUB2 to boot, each of them would set up its own `bf_log` and GRUB2 would set this pointer if possible?
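If I read it right, the chaining would look roughly like the sketch below. This is only my interpretation, not something the proposal states. The struct holds just the link field and is used for brevity, not as a proposed memory layout, and `addr_to_log()` is a hypothetical stand-in for whatever address mapping the component would really need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustration only: just the link field from the proposal. */
struct bf_log_link {
    uint64_t next_bf_log_addr; /* address of the next log; 0 means none */
};

/* Hypothetical address translation. Here addresses are plain pointers;
 * real firmware might have to map a physical address first. */
static struct bf_log_link *addr_to_log(uint64_t addr)
{
    return (struct bf_log_link *)(uintptr_t)addr;
}

/* What a later stage (e.g. GRUB2) would do: walk to the end of the
 * chain and hook its own log in. */
static void bf_log_chain_append(struct bf_log_link *head,
                                struct bf_log_link *new_log)
{
    assert(head != NULL && new_log != NULL);
    struct bf_log_link *cur = head;
    while (cur->next_bf_log_addr != 0)
        cur = addr_to_log(cur->next_bf_log_addr);
    new_log->next_bf_log_addr = 0;
    cur->next_bf_log_addr = (uint64_t)(uintptr_t)new_log;
}

/* What a log parser would do: follow the links until it sees zero. */
static size_t bf_log_chain_length(const struct bf_log_link *head)
{
    size_t n = 0;
    const struct bf_log_link *cur = head;
    while (cur != NULL) {
        n++;
        cur = cur->next_bf_log_addr ? addr_to_log(cur->next_bf_log_addr)
                                    : NULL;
    }
    return n;
}
```

Note that the later stage has to be able to write into the earlier stage's header for this to work, which is why I ask "if possible".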
(I think newer spec versions should not change anything in first 5 bf_log members; this way older log parsers will be able to traverse/copy all logs regardless of version used in one log or another),
Good point, which brings me to another good practice regarding such data formats: a length field for the header. In this case, the length from the start of `bf_log` to the start of `msgs`. This would give us backwards compatibility in case additional fields are added in the future. It would also allow the various implementations to add custom fields (not for communication with the log parser but for their own use).
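A small sketch of how a parser could use such a length field. The field name, its offset, the minimum header size, and the little-endian encoding are all hypothetical, chosen only for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: version at offset 0, header length at offset 4,
 * both 4 bytes and assumed little-endian. */
#define BF_LOG_OFF_HDR_LEN 4u
#define BF_LOG_V1_HDR_LEN  8u /* smallest header this parser understands */

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return a pointer to the first message, or NULL if the header length
 * is implausible. Fields unknown to this parser are simply skipped. */
static const uint8_t *bf_log_msgs_start(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < BF_LOG_V1_HDR_LEN)
        return NULL;
    uint32_t hdr_len = get_le32(buf + BF_LOG_OFF_HDR_LEN);
    if (hdr_len < BF_LOG_V1_HDR_LEN || hdr_len > len)
        return NULL;
    return buf + hdr_len;
}
```

An old parser that only knows the first fields still finds `msgs` in a log whose header grew by, say, four bytes, because it trusts the length field instead of a compiled-in header size.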
- next_msg_off: the offset, in bytes, from the beginning of the bf_log struct, of the next byte after the last log message in the msgs[]; i.e. the offset of the next available log message slot; it is equal to the total size of the log buffer including the bf_log struct,
- msgs: the array of log messages,
- should we add CRC or hash or signatures here?
The members of struct bf_log_msg:
- size: total size of bf_log_msg struct,
Does this include the actual message string?
- ts_nsec: timestamp expressed in nanoseconds starting from 0,
But what is 0? In coreboot, we log timestamps relative to the last reset. Which, if applied to our log ring-buffer, might make things confusing because it can contain messages from multiple boots.
- level: similar to syslog meaning; can be used to differentiate normal messages from debug messages; the exact interpretation depends on the current producer type specified in the bf_log.producer,
- facility: similar to syslog meaning; can be used to differentiate the sources of the messages, e.g. message produced by networking module; the exact interpretation depends on the current producer type specified in the bf_log.producer,
- msg_off: the log message offset in strings[],
- strings[0]: the beginning of log message type, similar to the facility member but NUL terminated string instead of integer; this will be used by, e.g., the GRUB2 for messages printed using grub_dprintf(),
I don't think this is a good idea. It seems you want to start a new spec that already supports two competing formats (the `facility` field and this string). I know it's sometimes hard to make everybody happy, but I think we should decide on a single format. I'll try to find some time to read about this GRUB string and prior discussions.
- strings[msg_off]: the beginning of log message, NUL terminated string.
There is still not fully solved problem how the logs should be presented to the OS. On the UEFI platforms we can use config tables to do that. Then probably bf_log.next_bf_log_addr should not be used. On the ACPI and Device Tree platforms we can use these mechanisms to present the logs to the OSes. The situation gets more difficult if neither of these mechanisms are present. However, maybe we should not bother too much about that because probably these platforms getting less and less common.
There is also the question of how a later-stage boot component would do this. It might not be easy for them to adapt ACPI, for instance (and if no `bf_log` chain is set up yet, it can't extend that either). Maybe just leave this open, besides references to the big ones of course, which may need some assigned number (UEFI? ACPI?).
Nico