The road towards LTO: an update and a few questions - coreboot

23 Feb 2024


Hi community,
I picked up the good work done by Jacob Garber to implement link time
optimization. The size gains are very substantial, with sometimes up to 15%
reduction in binary size! Besides the size benefit, it lets us be slightly
lazier programmers: for instance, we often put static inline functions in
headers, guarded by preprocessor conditionals on Kconfig options, so that
we don't generate code we won't use. With LTO that can be relaxed a bit, and
the condition to do nothing can come from a separately linked object (at
least I presume so; I haven't tested it).
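To make the two patterns concrete, here is a minimal single-file sketch. All
names in it (CONFIG_EXAMPLE_FEATURE, guarded_init, relaxed_init, boot_step)
are hypothetical and not taken from the coreboot tree, and the comments mark
which file each part would live in. Whether LTO really removes the call in
the relaxed variant depends on the compiler's inlining decisions, so treat
it as an assumption to verify, not a guarantee.

```c
/* Sketch only: every name below is hypothetical, not taken from coreboot. */
#include <assert.h>

/* Stand-in for a Kconfig option; a real build system would define it. */
#ifndef CONFIG_EXAMPLE_FEATURE
#define CONFIG_EXAMPLE_FEATURE 0
#endif
static_assert(CONFIG_EXAMPLE_FEATURE == 0 || CONFIG_EXAMPLE_FEATURE == 1,
	      "option must be 0 or 1");

/* ---- feature.h, guarded pattern: the header picks stub or real code ---- */
#if CONFIG_EXAMPLE_FEATURE
int guarded_init(void);		/* defined in feature.c */
#else
static inline int guarded_init(void) { return 0; }	/* compiles to nothing */
#endif

/* ---- feature.h, relaxed pattern: one unconditional declaration ---- */
int relaxed_init(void);

/* ---- feature.c: a separate object file in a real build ---- */
#if CONFIG_EXAMPLE_FEATURE
int guarded_init(void) { return 1; }
#endif

int relaxed_init(void)
{
	/* The "do nothing" condition lives in the object, not in the header. */
	if (!CONFIG_EXAMPLE_FEATURE)
		return 0;
	return 1;
}

/* ---- caller.c: without LTO the relaxed call stays a real call; with
 * -flto at compile and link time the optimizer may inline it away. ---- */
int boot_step(void)
{
	return guarded_init() + relaxed_init();
}
```

With the option left at 0, boot_step() returns 0 in both variants; the only
difference is whether a call to relaxed_init() survives into the binary.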
So to do LTO, with both clang and GCC, you use the compiler as a linker
frontend rather than invoking the linker directly. Object files are
optionally compiled with the -flto flag so that the final link can produce
LTO-optimized binaries. See https://review.coreboot.org/c/coreboot/+/40811
Jacob's patches dealt with GCC. I toyed around with clang. The clang linker
frontend only works with LTO when using the GOLD or LLD linker. GCC can use
the BFD linker we currently use, so that's an easy transition. Both of those
linkers, GOLD and LLD, are less capable at parsing linker scripts than the
BFD linker. The x86 bootblock script is especially complicated: we do a lot
of stuff in it, like the ID, FIT pointer, ECFW pointer, early 16-bit code,
..., and those linkers fail on the arithmetic needed for all that optimized
linking.
So my question is whether we want to support non-BFD linkers and therefore
give up some of the magic in the linker scripts we currently have. At most
we end up a few bytes less optimal than possible, for instance by setting a
good hardcoded size and offset for the early 16-bit code rather than having
the linker script optimize that. As a side note, the LTO size gains are
bigger than what would be lost, by a large margin.
https://review.coreboot.org/c/coreboot/+/80724/1/src/arch/x86/bootblock.ld
is a crude example of how I got lld to be happy with the script.
If non-BFD linkers are acceptable, how do you propose we deal with them?
When not using LTO you could in principle use a different linker of your
choosing. Should we just move to LLD to make sure CI is happy about linker
scripts? Should it be an option, even though CI then won't be able to test
it except on a few boards put in configs/? Maybe have both, but then change
the default once LLD is known to work well enough? Side note: LLD is faster
than BFD, but linking is not an expensive step in coreboot anyway.
Other, more typical linker scripts, like the ones you find on ARM and ARM64
platforms, work just fine. There are some subtle differences, e.g.
https://review.coreboot.org/c/coreboot/+/80735 where the heap needs to be
declared to not be loaded.
There are a few issues though. LLD does not like the RISC-V relocation
symbols of libgcc.a... I read upstream bug reports about this issue, so
maybe that will resolve itself.
The same questions apply to LTO and CI. Should it be an option (IMO yes, as
that's what both Linux and U-Boot do), and which default settings do we
want? For CI purposes LTO is a bit stricter and therefore useful, as the
compiler frontend throws more warnings than the plain linker when linking.
What are your thoughts? Are there some caveats about linkers and LTO that
are worth knowing, before moving forward?
Kind regards
Arthur