[coreboot] [PATCH] more Kconfig default fixes

Sun Oct 11 01:02:38 CEST 2009

On 11.10.2009 00:28, Myles Watson wrote:
> On Sat, Oct 10, 2009 at 4:18 PM, Carl-Daniel Hailfinger
> <c-d.hailfinger.devel.2006 at gmx.net> wrote:
>   
>> On 10.10.2009 23:55, Myles Watson wrote:
>>     
>>>> One thing though: We're using lzma per default now if we're using
>>>> compression. This means each board needs at _least_ a stack size of
>>>> 0x8000.
>>>>         
>>> Why does LZMA use so much memory from the stack?  Couldn't we convert it to
>>> use heap so that it is easier to tell when you run out?  I guess that would
>>> make it dependent on a malloc call?
>>>
>>>       
>> Yes, the malloc dependency is what originally caused me to use the stack
>> instead.
>>     
> But we could check the position on the stack compared to the top of
> the stack before running LZMA, right?
>   

That's hideously complicated. On AMD Fam10, each AP gets its own
mini-stack at a different location. The code for a stack checker is in
v3, and even for the non-SMP case it is really fragile. Add multiple
stack sizes and multiple stack locations, and the code will have to be
marked "Do not touch even if you think you understand it".
But yes, it can be done.
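
For illustration only, the basic check under discussion could look
roughly like the sketch below. It assumes a single stack that grows
downward and whose bounds are known. The names stack_has_room,
stack_bottom and stack_top are hypothetical and not taken from coreboot
or v3, and the per-AP stacks described above are exactly the case this
simple form does not handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not coreboot code.
 * The stack occupies [stack_bottom, stack_top) and grows downward,
 * so the bytes still free are those between stack_bottom and sp.
 */
static bool stack_has_room(uintptr_t stack_bottom, uintptr_t stack_top,
                           uintptr_t sp, uintptr_t needed)
{
	assert(stack_bottom < stack_top);

	/* An sp outside the stack means our assumptions are already wrong. */
	if (sp < stack_bottom || sp > stack_top)
		return false;

	return (sp - stack_bottom) >= needed;
}
```

A caller would pass the current stack pointer (for example the address
of a local variable) and refuse to start LZMA if the check fails.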

>>>> Those boards with STACK_SIZE being 0x2000 or 0x8000 are definitely
>>>> broken (and if they boot, they do by accident)
>>>>
>>>>         
>>> So since it's broken with Kconfig and newconfig, how can we decide what the
>>> correct stack size should be?
>>>
>>> What's the downside of a large stack?
>>>
>>>       
>> If you make the stack too large and you have multiple cores in CAR at
>> the same time, the CAR size is too small for all stacks.
>>     
> It seems like the safest way would be to serialize AP startup and have
> (at most) two stacks.
>   

That's a good idea as well, but I'm not sure our current infrastructure
can handle that. And how would the second and subsequent APs realize
that an earlier AP already decompressed the CBFS member? All those ROM
accesses waste a lot of time, so we only want to do them once.
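
One conceivable answer to that question is a shared state flag per
member, sketched below with C11 atomics. This is only a sketch: it
assumes all cores can see one coherent memory location, which may not
hold this early in boot, and every name in it (member_state,
ensure_decompressed, decompress_member) is hypothetical rather than an
existing coreboot interface.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical states for one CBFS member; not coreboot code. */
enum { MEMBER_COLD, MEMBER_BUSY, MEMBER_READY };

struct member_state {
	atomic_int state;      /* one of the MEMBER_* values */
	int decompress_calls;  /* counts real decompressions, for illustration */
};

/* Stand-in for the actual LZMA decompression of the member. */
static void decompress_member(struct member_state *m)
{
	m->decompress_calls++;
}

/* Returns 1 if this caller did the decompression, 0 if it only waited. */
static int ensure_decompressed(struct member_state *m)
{
	int expected = MEMBER_COLD;

	/* Only the first caller moves COLD -> BUSY and does the work. */
	if (atomic_compare_exchange_strong(&m->state, &expected, MEMBER_BUSY)) {
		decompress_member(m);
		atomic_store(&m->state, MEMBER_READY);
		return 1;
	}

	/* Someone else claimed the job: wait until the result is usable. */
	while (atomic_load(&m->state) != MEMBER_READY)
		;

	assert(atomic_load(&m->state) == MEMBER_READY);
	return 0;
}
```

The first caller still needs the large stack for LZMA. Later callers
only spin on the flag and never touch the ROM.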

>>> What breakage should occur, heap corruption?
>>> Should we check before LZMA how much stack is left?
>>>
>>>       
>> The best choice would be to make sure no AP ever uses LZMA.
>> Let me explain. If one AP uses LZMA, it's very likely due to
>> decompressing some CBFS member. If one AP does that, it is very likely
>> all of them are doing it, probably even at the same time (at least we
>> had that problem in the past). LZMA decompression uses the destination
>> buffer as scratch pad which means if you are decompressing the same file
>> to the same destination on different cores, you are likely to get
>> garbage there in the meantime or even at the end. Plus, decompressing
>> one file once per AP is totally wasteful. Nobody wants that.
>> Two ways to solve this:
>> 1. Have the first AP decompress the CBFS member it wants to run and
>> block all other APs until decompression is complete (but you still need
>> a big stack for that first AP).
>> 2. Have the BSP decompress the CBFS member the APs want to run, then
>> start the APs. Big benefit here is you can avoid locking and the stack
>> of APs can stay small.
>>     
> I thought the problem was that this was before RAM is available, so
> the AP was decompressing into its cache.  You can't have the BSP do
> that for an AP, right?
>   

On AMD Fam10, the BKDG says that any CAR area can be either executable
or writable, but not both. You choose between the two at 4k
granularity by using different MTRR types.
I do not know of any place where we decompress code into the CAR area,
and I'd recommend against doing so (mainly for non-technical reasons
you don't want to know).

Regards,
Carl-Daniel

-- 
Developer quote of the week: 
"We are juggling too many chainsaws and flaming arrows and tigers."