[coreboot] GCC update broke AMD Fam10h boot

Aaron Durbin adurbin at chromium.org
Mon Mar 16 22:58:24 CET 2015


On Mon, Mar 16, 2015 at 4:49 PM, Timothy Pearson
<tpearson at raptorengineeringinc.com> wrote:
> On 03/16/2015 04:44 PM, Aaron Durbin wrote:
>>
>> On Mon, Mar 16, 2015 at 1:39 PM, Timothy Pearson
>> <tpearson at raptorengineeringinc.com>  wrote:
>>>
>>> On 03/16/2015 09:23 AM, Aaron Durbin wrote:
>>>>
>>>>
>>>> On Sun, Mar 15, 2015 at 2:04 PM, Timothy Pearson
>>>> <tpearson at raptorengineeringinc.com>   wrote:
>>>>>
>>>>>
>>>>> All,
>>>>>
>>>>> Just a heads up as there is no bugtracker for this project.  GIT commit
>>>>> 53c388fe, which updates the crossgcc GCC version from 4.8.3 to 4.9.2,
>>>>> breaks
>>>>> ramstage on AMD Fam10h systems (ramstage loads, sends its 0x39 POST
>>>>> code,
>>>>> but then goes into an infinite loop).  Downgrading the GCC version
>>>>> repairs
>>>>> the boot failure.
>>>>>
>>>>> Not sure if you want to revert that commit until someone can figure out
>>>>> what
>>>>> changed to cause the problem.
>>>>
>>>>
>>>>
>>>> Could you post ramstage.elf from the two different builds somewhere? I'd
>>>> like to take a peek at what is in there.
>>>
>>>
>>>
>>> Sure:
>>> https://raptorengineeringinc.com/coreboot/built.tar.bz2
>>>
>>> Other oddities:
>>> GCC 4.8.3:
>>> normal/romstage                0x7ff80    stage        97345
>>> normal/ramstage                0x97c40    stage        154869
>>>
>>> GCC 4.9.2:
>>> normal/romstage                0x7ff80    stage        94773
>>> normal/ramstage                0x97240    stage        173942
>>>
>>> Note in particular, judging from the file sizes, that something seems to
>>> have been relocated from romstage to ramstage by the new gcc version.
>>>
>>
>> I noticed you had CONFIG_COVERAGE selected in both the builds. Could
>> you try not having that selected? I wonder if something changed in the
>> compiler on that front. But... I think I found a bigger issue.
>
>
> That shouldn't be a problem.  For reference, should CONFIG_COVERAGE be on or
> off for board status report builds?
>
>
>> $ nm ./gcc4.8.3/ramstage.debug | sort | grep -C 4 _bs_init_
>> 00146fc4 r pch_intel_wifi
>> 00146fd0 R cpu_drivers
>> 00146fd0 R epci_drivers
>> 00146fd0 r model_10xxx
>> 00146fdc R _bs_init_begin
>> 00146fdc r cbmem_bscb
>> 00146fdc R ecpu_drivers
>> 00146ff0 r gcov_bscb
>> 0014702c R _bs_init_end
>> 0014702c R pnp_conf_mode_870155_aa
>> 00147034 R pnp_conf_mode_a0a0_aa
>> 0014703c R pnp_conf_mode_8787_aa
>> 00147044 R pnp_conf_mode_7777_aa
>>
>> $ nm ./gcc4.9.2/ramstage.debug | sort | grep -C 4 _bs_init_
>> 001465c4 r pch_intel_wifi
>> 001465d0 R cpu_drivers
>> 001465d0 R epci_drivers
>> 001465d0 r model_10xxx
>> 001465dc R _bs_init_begin
>> 001465dc R ecpu_drivers
>> 001465e0 r cbmem_bscb
>> 00146600 r gcov_bscb
>> 0014663c R _bs_init_end
>> 00146640 R pnp_conf_mode_870155_aa
>> 00146648 R pnp_conf_mode_a0a0_aa
>> 00146650 R pnp_conf_mode_8787_aa
>> 00146658 R pnp_conf_mode_7777_a
>>
>> The boot state callbacks place the whole structure for each entry
>> between _bs_init_begin and _bs_init_end. For both binaries the size of
>> each entry is 0x14.
>
>
>> For the 4.8.3 compiled ramstage I see:
>> (gdb) p/x 0x0014702c - 0x00146fdc
>> $12 = 0x50
>> For the 4.9.2 compiled ramstage I see:
>> (gdb) p/x 0x014663c - 0x01465dc
>> $14 = 0x60
>>
>> 0x60 is not a multiple of 0x14 -- which means things aren't cool.
>
>
> This makes perfect sense--whenever coreboot didn't hang outright it started
> infinitely spewing some message regarding a boot state callback already
> being complete.
>
>> Looking at the symbols it appears the compiler is aligning those
>> structures to 32-bytes for some reason...
>>
>> A quick hack is to add ALIGN(32) to the linker script before
>> _bs_init_begin: src/arch/x86/ramstage.ld
>
>
> So I wonder if this is unique to AMD Fam10h or if a whole lot of other
> boards broke with the gcc update.  I wouldn't have even caught this if I
> hadn't checked out a new coreboot tree instead of copying over the existing
> tree with the prebuilt crossgcc, so we might be looking at a ticking
> timebomb that will go off as people start upgrading their crossgcc
> versions...

It all sorta depends. But the issue that _bs_init_begin does not equal
the address of the first bscb structure is bad news all around.
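
To make the numbers from the nm output concrete, here is a minimal
sketch in C. It only replays the arithmetic quoted above; the helper
name region_is_packed is hypothetical and is not a coreboot function.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one boot state callback entry, as quoted in this thread. */
#define BSCB_ENTRY_SIZE 0x14u

/*
 * Hypothetical helper (not a coreboot function): returns 1 when the region
 * [begin, end) can be walked as a packed array of entry_size-byte entries
 * whose first entry sits exactly at begin, and 0 otherwise.
 */
int region_is_packed(uint32_t begin, uint32_t end,
		     uint32_t first_entry, uint32_t entry_size)
{
	if (first_entry != begin)
		return 0;
	return (end - begin) % entry_size == 0;
}

/* Replay the addresses from the two nm listings. */
void check_nm_addresses(void)
{
	/* GCC 4.8.3: cbmem_bscb == _bs_init_begin, region is 0x50 bytes. */
	assert(region_is_packed(0x00146fdc, 0x0014702c,
				0x00146fdc, BSCB_ENTRY_SIZE));
	/* GCC 4.9.2: cbmem_bscb is 4 bytes past _bs_init_begin,
	 * and the region is 0x60 bytes. */
	assert(!region_is_packed(0x001465dc, 0x0014663c,
				 0x001465e0, BSCB_ENTRY_SIZE));
}
```

In the 4.9.2 listing gcov_bscb also sits 0x20 bytes after cbmem_bscb
instead of 0x14, so a walker stepping by the entry size from
_bs_init_begin would not land on it either.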

>
>> But I think we'll need to store pointers to the structures in order to
>> properly handle the situation where the compiler is effectively making
>> alignment/size decisions for some reason.
>
>
> I am not at all familiar with the code in question, so all I can do is offer
> to test.  Thanks for analysing the problem!

I might be able to whip up a patch, but it's harder than I first
thought because we were relying on arrays to be swept into those
regions. I'll have to think on this one or we'll just have to change
the API entirely for all the users.
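
As a rough illustration of the pointer idea, and not an actual patch,
the sketch below walks a table of pointers instead of the structures
themselves. Everything in it is an assumption made for the example:
struct bscb_entry, the entry names, and the plain C array bs_init_ptrs,
which stands in for a linker-collected region. The structures are
deliberately over-aligned to 32 bytes to mimic what the compiler
appears to be doing.

```c
#include <stddef.h>

/* Hypothetical stand-in for a boot state callback entry. */
struct bscb_entry {
	int state;
	void (*callback)(void *arg);
	void *arg;
};

static int calls;

static void count_call(void *arg)
{
	(void)arg;
	calls++;
}

/* Over-aligned on purpose: padding may now appear between the objects. */
static _Alignas(32) struct bscb_entry entry_a = { 0, count_call, NULL };
static _Alignas(32) struct bscb_entry entry_b[3] = {
	{ 1, count_call, NULL },
	{ 2, count_call, NULL },
	{ 3, count_call, NULL },
};

/*
 * Table of pointers. Each slot has the size of a pointer no matter how
 * the pointed-to structures are aligned or padded. Note that every
 * element of the array entry_b needs its own slot.
 */
static struct bscb_entry *const bs_init_ptrs[] = {
	&entry_a, &entry_b[0], &entry_b[1], &entry_b[2],
};

/* Walk the pointer table and invoke each callback; returns the call count. */
int run_all_callbacks(void)
{
	size_t n = sizeof(bs_init_ptrs) / sizeof(bs_init_ptrs[0]);

	calls = 0;
	for (size_t i = 0; i < n; i++)
		bs_init_ptrs[i]->callback(bs_init_ptrs[i]->arg);
	return calls;
}
```

Calling run_all_callbacks() visits all four entries regardless of the
padding between them. The one-slot-per-array-element requirement is the
awkward part when whole arrays are currently swept into the region.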

>
>
>> -Aaron
>
>
>
> --
> Timothy Pearson
> Raptor Engineering
> +1 (415) 727-8645
> http://www.raptorengineeringinc.com


