Search results for "lto" - coreboot

Re: [coreboot] Thoughts or conclusions on coreboot and link time optimization (LTO)?

by Paul Menzel

Dear coreboot folks,

On Thursday, 24.04.2014 at 23:39 +0200, Paul Menzel wrote:

> in #coreboot on <irc.freenode.net> Stefan mentioned that link time
> optimization (LTO) [1] might yield some speed improvements for coreboot
> as the resulting firmware image might be smaller and therefore it takes
> less time to read it from flash.
>
> As LTO has been greatly improved in GCC 4.9.0 and is all over the news,
> has somebody already experimented with LTO or created patches or tools
> for testing?
>
> Is somebody able to share conclusions already? That’d be very
> interesting.

Looking further: in April and May 2011 Scott Duplichan even sent a patch to add an option to enable LTO [2][3], but unfortunately it was not submitted. Back then, on AMD Persimmon with SeaBIOS as payload, the boot time was reduced from 690 ms to 640 ms. (Fun fact: UEFI firmware took 10 s on that board.)

Thanks,

Paul

> [1] http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>     (search for `lto`)

[2] http://www.coreboot.org/pipermail/coreboot/2011-April/064859.html
[3] http://www.coreboot.org/pipermail/coreboot/2011-May/064874.html

10 years

Re: [RFH] Test link time optimization (LTO)

by Jacob Garber

On Tue, Apr 28, 2020 at 04:16:59PM +0200, Paul Menzel wrote:

> Dear coreboot folks,
>
> Despite ever increasing flash ROM chip sizes, small images are still desired
> for faster boot times, faster flash times, and more space for
> payloads, which is sometimes needed for adding several payloads
> (including GRUB/TianoCore) or Linux payloads.
>
> Jacob Garber did great work to achieve this goal by enabling Link Time
> Optimization (LTO) for coreboot [1] and libpayload [2]. While doing
> this, he also found and fixed several bugs in the code base.
>
> Currently, it fails for AMD AGESA boards due to the handling of illegal
> globals.
>
> If somebody has a solution for that, that’d be great.
>
> It’d be great if more people could test this on their boards and report
> back.
>
> I propose to submit the change-sets before the next release, to enable
> LTO for libpayload by default, and to disable it for coreboot by default.
>
> Big thanks again to Jacob for doing this. (My attempt doing this for GRUB
> failed. ;-))
>
> Kind regards,
>
> Paul
>
> [1]: https://review.coreboot.org/c/coreboot/+/38989
> [2]: https://review.coreboot.org/c/coreboot/+/38291

Hey Paul, thanks for the encouraging words. :) I tidied up the patches today and think they are ready for review. Not all targets compile yet, but all of the LTO framework is there so people can test it out with their boards. For example, the Thinkpad T500 boots successfully, and there is about a 10% reduction in stage size and compilation time.

Right now I think it's safest to leave LTO disabled for libpayload and coreboot. GCC 9 is the first version where LTO is considered "production ready", and until we have that merged I think LTO should be considered experimental. Once people have had some time to try it out and work out any bugs, we can move to making it the default in some cases.

Cheers,
Jacob

4 years

Thoughts or conclusions on coreboot and link time optimization (LTO)?

by Paul Menzel

Dear coreboot folks,

in #coreboot on <irc.freenode.net> Stefan mentioned that link time optimization (LTO) [1] might yield some speed improvements for coreboot as the resulting firmware image might be smaller and therefore it takes less time to read it from flash.

As LTO has been greatly improved in GCC 4.9.0 and is all over the news, has somebody already experimented with LTO or created patches or tools for testing?

Is somebody able to share conclusions already? That’d be very interesting.

Thanks,

Paul

[1] http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html (search for `lto`)

10 years

The road towards LTO: an update and a few question

by Arthur Heymans

Hi community,

I picked up the good work done by Jacob Garber to implement link time optimization. The size gains are very substantial, with sometimes up to 15% reduction in binary size! Besides the size benefit, it can let us be a bit lazier as programmers: for instance, we often put static inline functions in headers guarded by preprocessor conditions depending on Kconfig options, so as not to generate code we won't use. With LTO that can be relaxed a bit and the condition to do nothing can come from a separately linked object (at least I presume, I haven't tested).

So to do LTO, both with clang and GCC, you use the compiler as a linker frontend, rather than invoking the linker directly. Object files are optionally generated with the -flto flag to produce LTO-optimized binaries. See https://review.coreboot.org/c/coreboot/+/40811

Jacob's patches dealt with GCC. I toyed around with clang. The clang linker frontend only works with LTO if using the GOLD or LLD linker. GCC can use the BFD linker we currently use, so that's an easy transition. Both those linkers, GOLD and LLD, are less capable at parsing linker scripts than the BFD linker. The x86 bootblock script in particular, in which we do a lot of stuff (ID, FIT pointer, ECFW pointer, early 16-bit code, ...), is complicated, and those linkers fail on the arithmetic that does all that optimized linking.

So my question would be whether we want to support non-BFD linkers and therefore not support all the magic in linker scripts we currently have. At most we are a few bytes less optimal than possible, for instance by setting a good hardcoded size and offset of the early 16-bit code rather than having the linker script optimize that. As a side note, LTO size gains are bigger than what would be lost, by a large margin. https://review.coreboot.org/c/coreboot/+/80724/1/src/arch/x86/bootblock.ld is a crude example of how I got lld to be happy with the script.

If non-BFD linkers are acceptable, how do you propose we deal with it? When not using LTO you could in principle use a different linker of your choosing. Should we just move to lld to make sure CI is happy about linker scripts? Should it be an option, but then we won't have CI being able to test it except on a few boards put in configs/? Maybe have both, but then change the default once LLD is known to work well enough? Side note: LLD is faster than BFD, but linking is not an expensive step in coreboot anyway.

Other more typical linker scripts, like the ones you find on ARM and ARM64 platforms, work just fine. There are some subtle differences, e.g. https://review.coreboot.org/c/coreboot/+/80735 where the heap needs to be declared to not be loaded. There are a few issues though. LLD does not like the RISCV relocation symbols of libgcc.a... I read upstream bug reports about this issue, so maybe that will resolve itself.

Same questions with LTO and CI. Should it be an option (IMO yes, as that's what both Linux and u-boot do), and which default settings do we want? For CI, LTO is a bit more strict and therefore useful, as the compiler frontend throws more warnings than the plain linker when linking.

What are your thoughts? Are there some caveats about linkers and LTO that are worth knowing before moving forward?

Kind regards,
Arthur
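
The remark above about `static inline` stubs guarded by Kconfig options can be sketched in C. This is only an illustration, not coreboot code: `CONFIG_EXAMPLE_DRIVER`, `example_driver_init()`, `example_driver_init_plain()` and the call counter are hypothetical names. Variant 1 is the header pattern the mail describes. Variant 2 is an ordinary function that checks the option itself; the mail presumes (without having tested it) that with LTO such a function could live in a separate object file and still be reduced to nothing when the option is off.

```c
#include <stdbool.h>

/* Hypothetical Kconfig option; in a real tree it would come from the
 * generated configuration header rather than being defined here. */
#ifndef CONFIG_EXAMPLE_DRIVER
#define CONFIG_EXAMPLE_DRIVER 0
#endif

/* Counts how often the (hypothetical) driver was really initialized. */
static int example_init_calls;

/* Variant 1: the preprocessor selects either the real body or an empty stub. */
#if CONFIG_EXAMPLE_DRIVER
static inline void example_driver_init(void) { example_init_calls++; }
#else
static inline void example_driver_init(void) { /* option disabled: no code */ }
#endif

/* Variant 2: a plain function that tests the option itself. Whether the
 * optimizer removes calls to it across object files is an assumption. */
static const bool example_driver_enabled = CONFIG_EXAMPLE_DRIVER;

void example_driver_init_plain(void)
{
	if (!example_driver_enabled)
		return;
	example_init_calls++;
}

/* Calls both variants and reports how many real initializations happened. */
int example_boot_sequence(void)
{
	example_driver_init();
	example_driver_init_plain();
	return example_init_calls;
}
```

With the option left at 0, both variants do nothing and `example_boot_sequence()` reports zero initializations; the difference between them is only in where the decision is made (preprocessor versus optimizer).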

2 months, 1 week

[RFH] Test link time optimization (LTO)

by Paul Menzel

Dear coreboot folks,

Despite ever increasing flash ROM chip sizes, small images are still desired for faster boot times, faster flash times, and more space for payloads, which is sometimes needed for adding several payloads (including GRUB/TianoCore) or Linux payloads.

Jacob Garber did great work to achieve this goal by enabling Link Time Optimization (LTO) for coreboot [1] and libpayload [2]. While doing this, he also found and fixed several bugs in the code base.

Currently, it fails for AMD AGESA boards due to the handling of illegal globals.

> Yes, this is a current limitation of LTO right now. Because the
> object files are all lumped together into a single unit, all
> information about where the symbols came from is lost, so
> EXCLUDE_FILE is unable to exclude the AGESA objects from the
> illegal_globals check. Tracing where a symbol came from has been
> implemented in LLVM [0], but I'm not sure if it's on the roadmap for
> GCC. For now it's probably best to disable LTO when compiling AGESA.
>
> [0]: https://llvm.org/devmtg/2017-10/slides/LTOLinkerScriptsEdlerVonKoch.pdf

If somebody has a solution for that, that’d be great.

It’d be great if more people could test this on their boards and report back.

I propose to submit the change-sets before the next release, to enable LTO for libpayload by default, and to disable it for coreboot by default.

Big thanks again to Jacob for doing this. (My attempt doing this for GRUB failed. ;-))

Kind regards,

Paul

[1]: https://review.coreboot.org/c/coreboot/+/38989
[2]: https://review.coreboot.org/c/coreboot/+/38291

4 years

Re: The road towards LTO: an update and a few question

by Arthur Heymans

Hi Julius and Nico,

Thanks for the feedback!

> Did you find out any particular (magic) construct we are currently using
> that fails? Or is it the overall complexity of the script?

It trips on some arithmetic, but I don't fully understand it yet, so my attempt was trial and error to get something linking and booting.

> Btw. can clang+lld with LTO still link against GCC objects, e.g.
> libgfxinit?

Linking is possible, but it cannot optimize that, of course. By the way, sometimes it's necessary to skip LTO on some specific C code like the arm eabi_compat.c. As a side note, maybe https://github.com/AdaCore/gnat-llvm is an interesting route to have LTO in the clang with libgfxinit combo?

> Do you have a rough list of the types of things that the LLD linker
> cannot deal with (e.g. there seems to be something about not using a
> symbol before it was defined, like with BOOTBLOCK_TOP, but then it
> doesn't seem to apply everywhere, e.g. for ID_SECTION it still seems
> to work?), so we can get an idea what kind of limitations we'd be
> accepting here for both current and future linker scripting?

I'm not sure yet why LLD isn't happy about some arithmetic in the linker script. I'll investigate to get a clearer picture.

> I wouldn't mind some rewrites to the x86 bootblock script in general,
> since some of it honestly seems unnecessarily convoluted anyway, but
> it's more concerning if you need to drop features (like commenting out
> all of those asserts at the end) when there's no way to make something
> similar work with LLD.

The asserts don't work because of the LTO and LLD combination. The _bootblock / _ebootblock symbols get optimized away and are somehow set to 0. Referring to them inside the code would fix that. https://review.coreboot.org/c/coreboot/+/71871 is also a way to deal with it.

> Also, are you sure that all the Arm boards are fine? Did you do a full
> abuild and then also compare the images (with BUILD_TIMELESS) to make
> sure the layouts didn't actually shift? We do a bunch of complicated
> things in our linker scripts, I'm actually surprised that LLD would be
> fine with everything besides the x86 bootblock (it didn't use to be at
> all a couple of years ago, but I guess they may have improved it).

I only played with qemu on arm and arm64 and those still worked. Some more in-depth comparison of the ELF output is indeed needed. For instance, with x86 stages it tripped the cbfstool assertion that loadable sections need to be consecutive.

> I think the other big question here is: why do we care about clang at
> all? If GCC can do LTO with BFD, why don't we just stick with that? My
> understanding was that people just added clang support to coreboot
> "for fun" to see if it was possible, and we said okay as long as you
> can do it without having to break any code. But now if we do need to
> make meaningful code changes to support it, should we reexamine why we
> want it at all? Is anyone actually using it in a production use case
> (and if so, why)?

I personally have clang set as the default compiler on my system using site-local/Kconfig. I prefer its error messages. It also generates somewhat different errors and warnings than GCC, so that's always nice in CI. Clang trades blows with GCC on code size. Especially with LTO, clang can sometimes result in 10% smaller binaries than GCC LTO binaries. Last time I checked, Linux only supported LTO with clang (that might not be true anymore), although I'm not so sure why. If newer language support like rust or zig is desirable in the future, then LTO with clang will work more easily, as the same LLVM IR is used. One cool feature of clang is that it can do reflection on C structs with compiler builtins: https://review.coreboot.org/c/coreboot/+/72460 .

So in my opinion it's a tooling option worth exploring. "For fun" often precedes production use :-) .

Arthur

On Sat, Feb 24, 2024 at 10:55 PM Nico Huber <nico.h(a)gmx.de> wrote:

> Hi Arthur,
>
> this sounds very interesting.
>
> On 23.02.24 17:47, Arthur Heymans wrote:
> > So my question would be, whether we want to support non BFD linkers and
> > therefore not support all the magic in linker scripts we currently have.
>
> Did you find out any particular (magic) construct we are currently using
> that fails? Or is it the overall complexity of the script?
>
> I was already wondering lately if we shouldn't split the complex x86
> linker script for different use cases (e.g. native, FSP, etc.). If we
> had three scripts instead of one, we would sometimes have to make the
> same change in multiple files. But, OTOH, I don't think we touch the
> linker scripts that much. And every time I look into the x86 one, it
> seems hard to find top and bottom.
>
> > If non BFD linkers are acceptable, how do you propose we deal with it?
> > When not using LTO you could in principle use a different linker of your
> > choosing. Should we just move to lld to make sure CI is happy about
> > linker scripts? Should it be an option, but then we won't have CI being
> > able to test it except on a few boards put in configs/ ? Maybe have both,
> > but then change the default once LLD is known to work well enough?
> > Sidenote: LLD is faster than BFD but linking is not an expensive step in
> > coreboot anyway.
>
> I wouldn't mind switching to LLD. Would prefer that we focus on a
> single linker, though. There is some soothing feeling when one knows
> that everybody is using the same toolchain, and chances that bugs are
> discovered early are higher.
>
> Btw. can clang+lld with LTO still link against GCC objects, e.g.
> libgfxinit?
>
> Nico

2 months, 1 week

Re: The road towards LTO: an update and a few question

by Nico Huber

Hi Arthur,

this sounds very interesting.

On 23.02.24 17:47, Arthur Heymans wrote:
> So my question would be, whether we want to support non BFD linkers and
> therefore not support all the magic in linker scripts we currently have.

Did you find out any particular (magic) construct we are currently using that fails? Or is it the overall complexity of the script?

I was already wondering lately if we shouldn't split the complex x86 linker script for different use cases (e.g. native, FSP, etc.). If we had three scripts instead of one, we would sometimes have to make the same change in multiple files. But, OTOH, I don't think we touch the linker scripts that much. And every time I look into the x86 one, it seems hard to find top and bottom.

> If non BFD linkers are acceptable, how do you propose we deal with it? When
> not using LTO you could in principle use a different linker of your
> choosing. Should we just move to lld to make sure CI is happy about linker
> scripts? Should it be an option, but then we won't have CI being able to
> test it except on a few boards put in configs/ ? Maybe have both, but then
> change the default once LLD is known to work well enough? Sidenote: LLD is
> faster than BFD but linking is not an expensive step in coreboot anyway.

I wouldn't mind switching to LLD. Would prefer that we focus on a single linker, though. There is some soothing feeling when one knows that everybody is using the same toolchain, and chances that bugs are discovered early are higher.

Btw. can clang+lld with LTO still link against GCC objects, e.g. libgfxinit?

Nico

2 months, 1 week

Re: The road towards LTO: an update and a few question

by Julius Werner

Hi Arthur,

First of all, thanks a lot for putting all this work into getting LTO working. The benefits really seem promising!

Do you have a rough list of the types of things that the LLD linker cannot deal with (e.g. there seems to be something about not using a symbol before it was defined, like with BOOTBLOCK_TOP, but then it doesn't seem to apply everywhere, e.g. for ID_SECTION it still seems to work?), so we can get an idea what kind of limitations we'd be accepting here for both current and future linker scripting? I wouldn't mind some rewrites to the x86 bootblock script in general, since some of it honestly seems unnecessarily convoluted anyway, but it's more concerning if you need to drop features (like commenting out all of those asserts at the end) when there's no way to make something similar work with LLD.

Also, are you sure that all the Arm boards are fine? Did you do a full abuild and then also compare the images (with BUILD_TIMELESS) to make sure the layouts didn't actually shift? We do a bunch of complicated things in our linker scripts, I'm actually surprised that LLD would be fine with everything besides the x86 bootblock (it didn't use to be at all a couple of years ago, but I guess they may have improved it).

I think the other big question here is: why do we care about clang at all? If GCC can do LTO with BFD, why don't we just stick with that? My understanding was that people just added clang support to coreboot "for fun" to see if it was possible, and we said okay as long as you can do it without having to break any code. But now if we do need to make meaningful code changes to support it, should we reexamine why we want it at all? Is anyone actually using it in a production use case (and if so, why)?

2 months, 1 week

Re: [coreboot] use gcc 4.6.0 link time optimization to reduce coreboot execution time

by Stefan Reinauer

On 4/28/11 8:01 PM, Scott Duplichan wrote:
> Adds a kconfig option to enable gcc link time optimization.
> Link time optimization reduces both rom stage and ram stage
> image size by removing unused functions and data. Reducing the
> image size saves boot time by minimizing the flash memory read
> and decompress time for ram stage.
>
> The option is off by default because of side effects such as
> long build time and unusable dwarf2 debug output. This
> option cuts persimmon+seabios DOS boot from SSD time from
> 690 ms to 640 ms.

Did you do some size tests with non-AGESA targets?

Does lto work with our "driver"s? I hoped that once we have LTO available we could get rid of the distinction between drivers and objects and handle everything the way we handle drivers now, letting gcc remove the functions we don't need.

> Signed-off-by: Scott Duplichan<scott(a)notabs.org>
>

Should we instead probe for availability of -flto in util/xcompile/xcompile and use it if it is there?

What's the problem with dwarf2? GCC 4.6 uses mostly dwarf4 unless you manually force it to dwarf2. Will this still be a problem?

> Index: Makefile
> ===================================================================
> --- Makefile (revision 6549)
> +++ Makefile (working copy)
> @@ -211,7 +211,7 @@
> de$(EMPTY)fine $(1)-objs_$(2)_template
> $(obj)/$$(1).$(1).o: src/$$(1).$(2) $(obj)/config.h $(4)
> @printf " CC $$$$(subst $$$$(obj)/,,$$$$(@))\n"
> - $(CC) $(3) -MMD $$$$(CFLAGS) -c -o $$$$@ $$$$<
> + $(CC) $(3) -MMD $$$$(CFLAGS) $$$$(LTO_OPTIMIZE) -c -o $$$$@ $$$$<

Hm.. I think LTO_OPTIMIZE should be added to CFLAGS instead, that would make the patch a whole lot less intrusive.

> Index: src/arch/x86/init/bootblock.ld
> ===================================================================
> --- src/arch/x86/init/bootblock.ld (revision 6549)
> +++ src/arch/x86/init/bootblock.ld (working copy)
> @@ -22,7 +22,6 @@
> OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386")
> OUTPUT_ARCH(i386)
>
> -TARGET(binary)
> SECTIONS
> {
> . = CONFIG_ROMBASE;

Hm interesting... does this hurt LTO?

13 years

Re: [coreboot] [commit] r5286 - ...

by Stefan Reinauer

On 3/25/10 3:47 PM, Myles Watson wrote:
>>> On 3/24/10 11:02 PM, repository service wrote:
>>>
>>>> -extern unsigned char AmlCode[];
>>>> +extern const acpi_header_t AmlCode;
>>>>
>>> And we're positive, this always does the right thing with gcc?
>>>
>> I am told that AmlCode is defined as array of (unsigned) char in
>> some other file. Declaring it as some other type here is not
>> valid C, and *will* break with GCC, with some options (-combine
>> or LTO at least) -- it will not compile.
>>
> The biggest worry for me is incorrect execution. If it doesn't compile when
> it breaks, then that's a good thing.
>

Well, if it breaks with LTO, we should fix it right away... I'll try and prepare something.

--
coresystems GmbH • Brahmsstr. 16 • D-79104 Freiburg i. Br.
Tel.: +49 761 7668825 • Fax: +49 761 7664613
Email: info(a)coresystems.de • http://www.coresystems.de/
Court of registration: Amtsgericht Freiburg • HRB 7656
Managing director: Stefan Reinauer • VAT ID: DE245674866
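
One way to sidestep the declaration mismatch discussed above is to keep every declaration of `AmlCode` identical to its definition (an array of `unsigned char`) and to decode header fields from the bytes. The following C sketch illustrates that idea under stated assumptions; it is not the fix that was actually prepared for coreboot. The sample table contents and the helpers `aml_table_length()` and `aml_has_signature()` are made up for illustration, and the only layout fact relied on is that an ACPI table begins with a 4-byte signature followed by a 32-bit little-endian length.

```c
#include <stdint.h>

/* Illustrative stand-in for the generated table: "DSDT" signature followed
 * by a little-endian 32-bit length of 36 bytes. The remaining bytes are
 * zero-filled in this sketch. Other files would declare it with the same
 * type: extern unsigned char AmlCode[]; */
unsigned char AmlCode[36] = { 'D', 'S', 'D', 'T', 36, 0, 0, 0 };

/* Hypothetical helper: reads the length field at byte offset 4 without
 * casting the array to a struct type. */
uint32_t aml_table_length(const unsigned char *table)
{
	return (uint32_t)table[4] |
	       ((uint32_t)table[5] << 8) |
	       ((uint32_t)table[6] << 16) |
	       ((uint32_t)table[7] << 24);
}

/* Hypothetical helper: returns 1 if the first four bytes match sig. */
int aml_has_signature(const unsigned char *table, const char *sig)
{
	for (int i = 0; i < 4; i++) {
		if (table[i] != (unsigned char)sig[i])
			return 0;
	}
	return 1;
}
```

Because the declaration and the definition agree, a compiler that sees both at once (as with `-combine` or LTO) has no conflicting types to complain about, and the byte-wise decoding does not depend on host endianness or struct padding.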

14 years, 1 month
