Hi community
I picked up the good work done by Jacob Garber to implement link time optimization (LTO). The size gains are substantial, sometimes up to a 15% reduction in binary size! Besides the size benefit, it lets us be a bit lazier as programmers: for instance, we often put static inline functions in headers, guarded by preprocessor conditions on Kconfig options, so that we don't generate code we won't use. With LTO that can be relaxed a bit, and the condition to do nothing can come from a separately linked object (at least I presume so; I haven't tested it).
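To make the header pattern concrete, here is a minimal sketch of both variants in a single file. CONFIG_EXAMPLE_FEATURE and the example_* functions are hypothetical names made up for illustration, not real coreboot symbols. Whether LTO actually removes the cross-object call in the relaxed variant is the untested presumption from above; the sketch only shows that both variants behave the same.

```c
#include <assert.h>

/* Hypothetical Kconfig symbol, made up for this sketch. */
#define CONFIG_EXAMPLE_FEATURE 0

/* Pattern used today (header): the empty stub is visible to every caller. */
#if CONFIG_EXAMPLE_FEATURE
int example_init_guarded(void);
#else
static inline int example_init_guarded(void) { return 0; }
#endif

/* Relaxed pattern (header): one unconditional declaration. */
int example_init_plain(void);

/*
 * Relaxed pattern (would live in a separate .c file): the "do nothing"
 * condition sits in another object. Without LTO the caller still emits a
 * call; with LTO the link step may be able to inline it away (assumption).
 */
int example_init_plain(void)
{
	if (!CONFIG_EXAMPLE_FEATURE)
		return 0;	/* feature disabled: nothing to do */
	return 1;		/* placeholder for the real initialization */
}

/* Caller: both variants must agree; only the generated code may differ. */
int example_boot_stage(void)
{
	int guarded = example_init_guarded();
	int plain = example_init_plain();

	assert(guarded == plain);
	return guarded + plain;
}
```

Comparing the size of the caller's object code with and without -flto would show whether the relaxed variant really costs nothing.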
To do LTO, with both clang and GCC, you use the compiler as a linker frontend rather than invoking the linker directly. When LTO is enabled, object files are compiled with the -flto flag so that the final link produces an LTO-optimized binary. See https://review.coreboot.org/c/coreboot/+/40811
Jacob's patches dealt with GCC. I toyed around with clang. The clang linker frontend only works with LTO if using the GOLD or LLD linker. GCC can use the BFD linker we currently use, so that's an easy transition. Both GOLD and LLD are less capable at parsing linker scripts than the BFD linker. The x86 bootblock is especially complicated: its linker script does a lot (ID, FIT pointer, ECFW pointer, early 16-bit code, ...), and those linkers fail on the arithmetic it uses to lay all of that out optimally.
So my question is whether we want to support non-BFD linkers and therefore give up some of the magic in the linker scripts we currently have. At most we would be a few bytes less optimal than possible, for instance by setting a good hardcoded size and offset for the early 16-bit code rather than having the linker script optimize that. As a side note, the LTO size gains are bigger than what would be lost, by a large margin. https://review.coreboot.org/c/coreboot/+/80724/1/src/arch/x86/bootblock.ld is a crude example of how I got LLD to be happy with the script.
If non-BFD linkers are acceptable, how do you propose we deal with it? When not using LTO you could in principle use a different linker of your choosing. Should we just move to LLD to make sure CI is happy about linker scripts? Should it be an option, in which case CI won't be able to test it except on a few boards put in configs/? Maybe have both, but then change the default once LLD is known to work well enough? Side note: LLD is faster than BFD, but linking is not an expensive step in coreboot anyway.
Other, more typical linker scripts, like the ones you find on ARM and ARM64 platforms, work just fine. There are some subtle differences, e.g. https://review.coreboot.org/c/coreboot/+/80735, where the heap needs to be declared as not loaded.
There are a few issues though. LLD does not like the RISC-V relocation symbols of libgcc.a... I read upstream bug reports about this issue, so maybe that will resolve itself.
The same questions apply to LTO and CI: should it be an option (IMO yes, as that's what both Linux and U-Boot do), and which default settings do we want? For CI, LTO is a bit stricter and therefore useful, as the compiler frontend throws more warnings than the plain linker when linking.
What are your thoughts? Are there any caveats about linkers and LTO that are worth knowing before moving forward?
Kind regards
Arthur