Fix AP_CODE_IN_CAR (only selected for two boards), STACK_SIZE, and HEAP_SIZE.
Signed-off-by: Myles Watson mylesgw@gmail.com
At this point my Tyan s2895 is happy with Kconfig. Boot tested.
Thanks, Myles
Myles Watson schrieb:
Fix AP_CODE_IN_CAR (only selected for two boards), STACK_SIZE, and HEAP_SIZE.
Signed-off-by: Myles Watson mylesgw@gmail.com
At this point my Tyan s2895 is happy with Kconfig. Boot tested.
Thanks, Myles
Looks (almost) good to me but I'd prefer someone else to check it, to.
One thing though: We're using lzma per default now if we're using compression. This means each board needs at _least_ a stack size of 0x8000.
Those boards with STACK_SIZE being 0x2000 or 0x8000 are definitely broken (and if they boot, they do by accident)
Stefan
On 10.10.2009 11:16, Stefan Reinauer wrote:
Myles Watson schrieb:
Fix AP_CODE_IN_CAR (only selected for two boards), STACK_SIZE, and HEAP_SIZE.
Signed-off-by: Myles Watson mylesgw@gmail.com
At this point my Tyan s2895 is happy with Kconfig. Boot tested.
Thanks, Myles
Looks (almost) good to me but I'd prefer someone else to check it, to.
One thing though: We're using lzma per default now if we're using compression. This means each board needs at _least_ a stack size of 0x8000.
Those boards with STACK_SIZE being 0x2000 or 0x8000 are definitely broken (and if they boot, they do by accident)
And the current infrastructure means we need that stack size per core if any of the APs perform any decompression (which is a bad idea in itself).
Regards, Carl-Daniel
Carl-Daniel Hailfinger wrote:
On 10.10.2009 11:16, Stefan Reinauer wrote:
Myles Watson schrieb:
Fix AP_CODE_IN_CAR (only selected for two boards), STACK_SIZE, and HEAP_SIZE.
Signed-off-by: Myles Watson mylesgw@gmail.com
At this point my Tyan s2895 is happy with Kconfig. Boot tested.
Thanks, Myles
Looks (almost) good to me but I'd prefer someone else to check it, to.
One thing though: We're using lzma per default now if we're using compression. This means each board needs at _least_ a stack size of 0x8000.
Those boards with STACK_SIZE being 0x2000 or 0x8000 are definitely broken (and if they boot, they do by accident)
And the current infrastructure means we need that stack size per core if any of the APs perform any decompression (which is a bad idea in itself).
Which systems do that? we should fix them...
On 10.10.2009 20:08, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
On 10.10.2009 11:16, Stefan Reinauer wrote:
Myles Watson schrieb:
Fix AP_CODE_IN_CAR (only selected for two boards), STACK_SIZE, and HEAP_SIZE.
Signed-off-by: Myles Watson mylesgw@gmail.com
At this point my Tyan s2895 is happy with Kconfig. Boot tested.
Thanks, Myles
Looks (almost) good to me but I'd prefer someone else to check it, to.
One thing though: We're using lzma per default now if we're using compression. This means each board needs at _least_ a stack size of 0x8000.
Those boards with STACK_SIZE being 0x2000 or 0x8000 are definitely broken (and if they boot, they do by accident)
And the current infrastructure means we need that stack size per core if any of the APs perform any decompression (which is a bad idea in itself).
Which systems do that? we should fix them...
AFAIK the Fam10 targets Zheng is having problems with.
Regards, Carl-Daniel
Myles Watson schrieb:
Fix AP_CODE_IN_CAR (only selected for two boards), STACK_SIZE, and
HEAP_SIZE.
Signed-off-by: Myles Watson mylesgw@gmail.com
At this point my Tyan s2895 is happy with Kconfig. Boot tested.
Thanks, Myles
Looks (almost) good to me but I'd prefer someone else to check it, to.
I was figuring that Patrick's scripts will check this stuff and we'll get closer and closer to newconfig. There are a few values things that I'd like to not match newconfig, though. For example, fam10 and k8 had different cache as RAM settings in newconfig, and they could share them.
One thing though: We're using lzma per default now if we're using compression. This means each board needs at _least_ a stack size of 0x8000.
Why does LZMA use so much memory from the stack? Couldn't we convert it to use heap so that it is easier to tell when you run out? I guess that would make it dependent on a malloc call?
Those boards with STACK_SIZE being 0x2000 or 0x8000 are definitely broken (and if they boot, they do by accident)
So since it's broken with Kconfig and newconfig, how can we decide what the correct stack size should be?
What's the downside of a large stack? What breakage should occur, heap corruption? Should we check before LZMA how much stack is left?
Thanks, Myles
On 10.10.2009 23:55, Myles Watson wrote:
Myles Watson schrieb:
Fix AP_CODE_IN_CAR (only selected for two boards), STACK_SIZE, and
HEAP_SIZE.
One thing though: We're using lzma per default now if we're using compression. This means each board needs at _least_ a stack size of 0x8000.
Why does LZMA use so much memory from the stack? Couldn't we convert it to use heap so that it is easier to tell when you run out? I guess that would make it dependent on a malloc call?
Yes, the malloc dependency is what originally caused me to use the stack instead.
Those boards with STACK_SIZE being 0x2000 or 0x8000 are definitely broken (and if they boot, they do by accident)
So since it's broken with Kconfig and newconfig, how can we decide what the correct stack size should be?
What's the downside of a large stack?
If you make the stack too large and you have multiple cores in CAR at the same time, the CAR size is too small for all stacks.
What breakage should occur, heap corruption? Should we check before LZMA how much stack is left?
The best choice would be to make sure no AP ever uses LZMA. Let me explain. If one AP uses LZMA, it's very likely due to decompressing some CBFS member. If one AP does that, it is very likely all of them are doing it, probably even at the same time (at least we had that problem in the past). LZMA decompression uses the destination buffer as scratch pad which means if you are decompressing the same file to the same destination on different cores, you are likely to get garbage there in the meantime or even at the end. Plus, decompressing one file once per AP is totally wasteful. Nobody wants that. Two ways to solve this: 1. Have the first AP decompress the CBFS member it wants to run and block all other APs until decompression is complete (but you still need a big stack for that first AP). 2. Have the BSP decompress the CBFS member the APs want to run, then start the APs. Big benefit here is you can avoid locking and the stack of APs can stay small.
Regards, Carl-Daniel
On Sat, Oct 10, 2009 at 4:18 PM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
On 10.10.2009 23:55, Myles Watson wrote:
One thing though: We're using lzma per default now if we're using compression. This means each board needs at _least_ a stack size of 0x8000.
Why does LZMA use so much memory from the stack? Couldn't we convert it to use heap so that it is easier to tell when you run out? I guess that would make it dependent on a malloc call?
Yes, the malloc dependency is what originally caused me to use the stack instead.
But we could check the position on the stack compared to the top of the stack before running LZMA, right?
Those boards with STACK_SIZE being 0x2000 or 0x8000 are definitely broken (and if they boot, they do by accident)
So since it's broken with Kconfig and newconfig, how can we decide what the correct stack size should be?
What's the downside of a large stack?
If you make the stack too large and you have multiple cores in CAR at the same time, the CAR size is too small for all stacks.
It seems like the safest way would be to serialize AP startup and have (at most) two stacks.
What breakage should occur, heap corruption? Should we check before LZMA how much stack is left?
The best choice would be to make sure no AP ever uses LZMA. Let me explain. If one AP uses LZMA, it's very likely due to decompressing some CBFS member. If one AP does that, it is very likely all of them are doing it, probably even at the same time (at least we had that problem in the past). LZMA decompression uses the destination buffer as scratch pad which means if you are decompressing the same file to the same destination on different cores, you are likely to get garbage there in the meantime or even at the end. Plus, decompressing one file once per AP is totally wasteful. Nobody wants that. Two ways to solve this:
- Have the first AP decompress the CBFS member it wants to run and
block all other APs until decompression is complete (but you still need a big stack for that first AP). 2. Have the BSP decompress the CBFS member the APs want to run, then start the APs. Big benefit here is you can avoid locking and the stack of APs can stay small.
I thought the problem was that this was before RAM is available, so the AP was decompressing into its cache. You can't have the BSP do that for an AP, right?
Thanks, Myles
On 11.10.2009 00:28, Myles Watson wrote:
On Sat, Oct 10, 2009 at 4:18 PM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
On 10.10.2009 23:55, Myles Watson wrote:
One thing though: We're using lzma per default now if we're using compression. This means each board needs at _least_ a stack size of 0x8000.
Why does LZMA use so much memory from the stack? Couldn't we convert it to use heap so that it is easier to tell when you run out? I guess that would make it dependent on a malloc call?
Yes, the malloc dependency is what originally caused me to use the stack instead.
But we could check the position on the stack compared to the top of the stack before running LZMA, right?
That's hideously complicated. On AMD Fam10, each AP gets its own mini-stack at another location. The code for a stack checker is in v3 and even for the no-SMP case it is really fragile. Add multiple stack sizes and multiple stack locations to it and the code will have to be marked "Do not touch even if you think you understand it". But yes, it can be done.
Those boards with STACK_SIZE being 0x2000 or 0x8000 are definitely broken (and if they boot, they do by accident)
So since it's broken with Kconfig and newconfig, how can we decide what the correct stack size should be?
What's the downside of a large stack?
If you make the stack too large and you have multiple cores in CAR at the same time, the CAR size is too small for all stacks.
It seems like the safest way would be to serialize AP startup and have (at most) two stacks.
That's a good idea as well, but I'm not sure our current infrastructure can handle that. And how would the second and subsequent APs realize that earlier incarnations already decompressed the CBFS member? All those ROM accesses are wasting lots of time, so we only want to do them once.
What breakage should occur, heap corruption? Should we check before LZMA how much stack is left?
The best choice would be to make sure no AP ever uses LZMA. Let me explain. If one AP uses LZMA, it's very likely due to decompressing some CBFS member. If one AP does that, it is very likely all of them are doing it, probably even at the same time (at least we had that problem in the past). LZMA decompression uses the destination buffer as scratch pad which means if you are decompressing the same file to the same destination on different cores, you are likely to get garbage there in the meantime or even at the end. Plus, decompressing one file once per AP is totally wasteful. Nobody wants that. Two ways to solve this:
- Have the first AP decompress the CBFS member it wants to run and
block all other APs until decompression is complete (but you still need a big stack for that first AP). 2. Have the BSP decompress the CBFS member the APs want to run, then start the APs. Big benefit here is you can avoid locking and the stack of APs can stay small.
I thought the problem was that this was before RAM is available, so the AP was decompressing into its cache. You can't have the BSP do that for an AP, right?
On AMD Fam10, the BKDG says that any CAR area can be either executable or writable (mutually exclusive). You can decide which one you want on a 4k granularity with different MTRR types. I do not know of any place where we decompress code into the CAR area and I'd recommend against such stuff (mainly for non-technical reasons you don't want to know).
Regards, Carl-Daniel
On Sat, Oct 10, 2009 at 5:02 PM, Carl-Daniel Hailfinger < c-d.hailfinger.devel.2006@gmx.net> wrote:
On 11.10.2009 00:28, Myles Watson wrote:
On Sat, Oct 10, 2009 at 4:18 PM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
On 10.10.2009 23:55, Myles Watson wrote:
One thing though: We're using lzma per default now if we're using compression. This means each board needs at _least_ a stack size of 0x8000.
Yes, the malloc dependency is what originally caused me to use the stack instead.
Maybe we ought to revisit it, then, since malloc already checks if it is running out of memory.
But we could check the position on the stack compared to the top of the stack before running LZMA, right?
That's hideously complicated. On AMD Fam10, each AP gets its own mini-stack at another location. The code for a stack checker is in v3 and even for the no-SMP case it is really fragile. Add multiple stack sizes and multiple stack locations to it and the code will have to be marked "Do not touch even if you think you understand it". But yes, it can be done.
I just meant compare against the top of all stacks (or the bottom of the heap.) Any checking is better than none.
Those boards with STACK_SIZE being 0x2000 or 0x8000 are definitely broken (and if they boot, they do by accident)
So since it's broken with Kconfig and newconfig, how can we decide what
the
correct stack size should be?
Ping.
It seems like the safest way would be to serialize AP startup and have (at most) two stacks.
That's a good idea as well, but I'm not sure our current infrastructure can handle that. And how would the second and subsequent APs realize that earlier incarnations already decompressed the CBFS member? All those ROM accesses are wasting lots of time, so we only want to do them once.
All I'm looking for is the shortest path to "not-broken". I'm open to suggestions.
Thanks, Myles
On Mon, Oct 12, 2009 at 8:36 PM, Myles Watson mylesgw@gmail.com wrote:
On Sat, Oct 10, 2009 at 5:02 PM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
On 11.10.2009 00:28, Myles Watson wrote:
On Sat, Oct 10, 2009 at 4:18 PM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
On 10.10.2009 23:55, Myles Watson wrote:
One thing though: We're using lzma per default now if we're using compression. This means each board needs at _least_ a stack size of 0x8000.
I think I need some clarification. When we're talking about STACK_SIZE and HEAP_SIZE it only refers to coreboot_ram, right? So unless you use a compressed payload, LZMA doesn't come into play. When you're using cache_as_ram, I don't see that we have a way of dividing the cache between stack and heap. Am I missing something?
Thanks, Myles
On Mon, Oct 12, 2009 at 7:36 PM, Myles Watson mylesgw@gmail.com wrote:
All I'm looking for is the shortest path to "not-broken". I'm open to suggestions.
So I've had this explained to me several times, let's try again.
In fact it's in the v2 code. It's just that the v2 code is so hard to read ...
1. BSP core 0 starts up in CAR. BSP core 0 sets up DRAM programming for ALL sockets. BPS gets CBFS files in RAM. BSP zeros memory attached to BSP. 2. BSP uses IPIs to set up all AP stacks and EIP. AP core 0s start up. They do what initialization they need (e.g. zero their own DRAM). They then stop. 2a. BSP waits for all AP core 0 to either stop or error out. 3. BPS starts up core>0 and sends IPIs to get APs to start up core>0. 4. All core 0s wait for all core>0 to stop or error out. 5. BSP waits for all AP core0 to stop again. 6. BSP continues to boot.
Is there a problem with this sequence?
thanks
ron
-----Original Message----- From: ron minnich [mailto:rminnich@gmail.com] Sent: Monday, October 12, 2009 9:37 PM To: Myles Watson Cc: Carl-Daniel Hailfinger; Stefan Reinauer; coreboot Subject: Re: [coreboot] [PATCH] more Kconfig default fixes
On Mon, Oct 12, 2009 at 7:36 PM, Myles Watson mylesgw@gmail.com wrote:
All I'm looking for is the shortest path to "not-broken". I'm open to suggestions.
So I've had this explained to me several times, let's try again.
In fact it's in the v2 code. It's just that the v2 code is so hard to read ...
- BSP core 0 starts up in CAR.
All other cores start up and put themselves to sleep ASAP.
BSP core 0 sets up DRAM programming for ALL sockets. BPS gets CBFS files in RAM. BSP zeros memory attached to BSP. 2. BSP uses IPIs to set up all AP stacks and EIP. AP core 0s start up. They do what initialization they need (e.g. zero their own DRAM). They then stop. 2a. BSP waits for all AP core 0 to either stop or error out. 3. BPS starts up core>0 and sends IPIs to get APs to start up core>0. 4. All core 0s wait for all core>0 to stop or error out. 5. BSP waits for all AP core0 to stop again. 6. BSP continues to boot.
Is there a problem with this sequence?
I don't think so.
The reason we're back on this topic is because we're having trouble nailing down the correct values for CONFIG_STACK and CONFIG_HEAP. I just noticed that they don't actually matter in CAR, so we have RAM by the time we even check those values.
Thanks, Myles
On Mon, Oct 12, 2009 at 8:45 PM, Myles Watson mylesgw@gmail.com wrote:
The reason we're back on this topic is because we're having trouble nailing down the correct values for CONFIG_STACK and CONFIG_HEAP. I just noticed that they don't actually matter in CAR, so we have RAM by the time we even check those values.
understood. OK, let's move along :-)
ron
On Mon, Oct 12, 2009 at 9:47 PM, ron minnich rminnich@gmail.com wrote:
On Mon, Oct 12, 2009 at 8:45 PM, Myles Watson mylesgw@gmail.com wrote:
The reason we're back on this topic is because we're having trouble
nailing
down the correct values for CONFIG_STACK and CONFIG_HEAP. I just noticed that they don't actually matter in CAR, so we have RAM by the time we
even
check those values.
understood. OK, let's move along :-)
So I guess the question is how should we make sure the stack and heap are sized correctly. Using malloc to allocate the memory for lzma makes sense, but it is used in CAR too, so that complicates our decision.
I guess in CAR we don't have much of a heap to run into, so everything is on the stack?
Myles
Myles Watson schrieb:
On Mon, Oct 12, 2009 at 9:47 PM, ron minnich <rminnich@gmail.com mailto:rminnich@gmail.com> wrote:
On Mon, Oct 12, 2009 at 8:45 PM, Myles Watson <mylesgw@gmail.com <mailto:mylesgw@gmail.com>> wrote: > The reason we're back on this topic is because we're having trouble nailing > down the correct values for CONFIG_STACK and CONFIG_HEAP. I just noticed > that they don't actually matter in CAR, so we have RAM by the time we even > check those values. understood. OK, let's move along :-)
So I guess the question is how should we make sure the stack and heap are sized correctly. Using malloc to allocate the memory for lzma makes sense, but it is used in CAR too, so that complicates our decision.
The plan might also be to drop malloc completely and replace it with more distinct memory areas. malloc has a strong touch of unpredictability.
I guess in CAR we don't have much of a heap to run into, so everything is on the stack?
Yes. (More or less)
On Mon, Oct 12, 2009 at 8:51 PM, Myles Watson mylesgw@gmail.com wrote:
So I guess the question is how should we make sure the stack and heap are sized correctly. Using malloc to allocate the memory for lzma makes sense, but it is used in CAR too, so that complicates our decision.
lzma decompressor gets a void * from the caller. Caller, if CAR, uses on-stack pointer. RAM code can, if desired, use malloc'ed memory?
I guess in CAR we don't have much of a heap to run into, so everything is on the stack?
yes.
ron
On Tue, Oct 13, 2009 at 8:59 AM, ron minnich rminnich@gmail.com wrote:
On Mon, Oct 12, 2009 at 8:51 PM, Myles Watson mylesgw@gmail.com wrote:
So I guess the question is how should we make sure the stack and heap are sized correctly. Using malloc to allocate the memory for lzma makes
sense,
but it is used in CAR too, so that complicates our decision.
lzma decompressor gets a void * from the caller. Caller, if CAR, uses on-stack pointer. RAM code can, if desired, use malloc'ed memory?
Not for the scratchpad. It's allocated on the stack of the ulzma function.
Thanks, Myles
On Tue, Oct 13, 2009 at 8:40 AM, Myles Watson mylesgw@gmail.com wrote:
On Tue, Oct 13, 2009 at 8:59 AM, ron minnich rminnich@gmail.com wrote:
On Mon, Oct 12, 2009 at 8:51 PM, Myles Watson mylesgw@gmail.com wrote:
So I guess the question is how should we make sure the stack and heap are sized correctly. Using malloc to allocate the memory for lzma makes sense, but it is used in CAR too, so that complicates our decision.
lzma decompressor gets a void * from the caller. Caller, if CAR, uses on-stack pointer. RAM code can, if desired, use malloc'ed memory?
Not for the scratchpad. It's allocated on the stack of the ulzma function.
yes, but I always felt that was fixable.
ron
lzma decompressor gets a void * from the caller. Caller, if CAR, uses on-stack pointer. RAM code can, if desired, use malloc'ed memory?
Not for the scratchpad. It's allocated on the stack of the ulzma
function.
yes, but I always felt that was fixable.
Sure. We could pass it a scratchpad pointer too.
Thanks, Myles
ron minnich wrote:
On Tue, Oct 13, 2009 at 8:40 AM, Myles Watson mylesgw@gmail.com wrote:
On Tue, Oct 13, 2009 at 8:59 AM, ron minnich rminnich@gmail.com wrote:
On Mon, Oct 12, 2009 at 8:51 PM, Myles Watson mylesgw@gmail.com wrote:
So I guess the question is how should we make sure the stack and heap are sized correctly. Using malloc to allocate the memory for lzma makes sense, but it is used in CAR too, so that complicates our decision.
lzma decompressor gets a void * from the caller. Caller, if CAR, uses on-stack pointer. RAM code can, if desired, use malloc'ed memory?
Not for the scratchpad. It's allocated on the stack of the ulzma function.
yes, but I always felt that was fixable.
By making two copies of the code that behave slightly different?
There's no benefit of using heap over using stack, so why bother?
Stefan
On Tue, Oct 13, 2009 at 9:26 AM, Stefan Reinauer stepan@coresystems.de wrote:
There's no benefit of using heap over using stack, so why bother?
beats me. I'm responding to something that people seem to feel is an issue.
ron
On Tue, Oct 13, 2009 at 11:08 AM, ron minnich rminnich@gmail.com wrote:
On Tue, Oct 13, 2009 at 9:26 AM, Stefan Reinauer stepan@coresystems.de wrote:
There's no benefit of using heap over using stack, so why bother?
beats me. I'm responding to something that people seem to feel is an issue.
:)
The issue is that it is forcing every platform to increase its stack size. The old default was 8K. The new default is 32K. I just wanted to make sure that it was clear why we're choosing 32K and that it wouldn't cause a problem for any boards to have their stack increased by such a large amount.
This is the same patch with 32K as the default stack size.
Signed-off-by: Myles Watson mylesgw@gmail.com
Thanks, Myles
This is the same patch with 32K as the default stack size.
Signed-off-by: Myles Watson mylesgw@gmail.com
Ping.
Thanks, Myles
On Fri, Oct 16, 2009 at 12:53 PM, ron minnich rminnich@gmail.com wrote:
Acked-by: Ronald G. Minnich rminnich@gmail.com
Rev 4793.
Thanks, Myles
ron minnich wrote:
On Mon, Oct 12, 2009 at 8:51 PM, Myles Watson mylesgw@gmail.com wrote:
So I guess the question is how should we make sure the stack and heap are sized correctly. Using malloc to allocate the memory for lzma makes sense, but it is used in CAR too, so that complicates our decision.
lzma decompressor gets a void * from the caller. Caller, if CAR, uses on-stack pointer. RAM code can, if desired, use malloc'ed memory?
We never call lzma while in CAR. Now that would be kind of silly, would it?
On 15.10.2009 13:04, Stefan Reinauer wrote:
ron minnich wrote:
On Mon, Oct 12, 2009 at 8:51 PM, Myles Watson mylesgw@gmail.com wrote:
So I guess the question is how should we make sure the stack and heap are sized correctly. Using malloc to allocate the memory for lzma makes sense, but it is used in CAR too, so that complicates our decision.
lzma decompressor gets a void * from the caller. Caller, if CAR, uses on-stack pointer. RAM code can, if desired, use malloc'ed memory?
We never call lzma while in CAR. Now that would be kind of silly, would it?
Well, originally ulmza() was designed to be runnable in CAR on the OLPC. That's why I picked a scratch pad size which would allow pretty good compression and still fit well into the stack we had during CAR on these boards. Part of the motivation may have been a misunderstanding, this was one of my first coreboot patches (or even the very first one).
I don't care where ulzma places its scratch space as long as it can get enough of it. If someone wants to use malloc() instead, check the variable mallocneeds which has the exact allocation size needed (that size depends on the parameters picked during compression).
Regards, Carl-Daniel
Carl-Daniel Hailfinger wrote:
On 15.10.2009 13:04, Stefan Reinauer wrote:
ron minnich wrote:
On Mon, Oct 12, 2009 at 8:51 PM, Myles Watson mylesgw@gmail.com wrote:
So I guess the question is how should we make sure the stack and heap are sized correctly. Using malloc to allocate the memory for lzma makes sense, but it is used in CAR too, so that complicates our decision.
lzma decompressor gets a void * from the caller. Caller, if CAR, uses on-stack pointer. RAM code can, if desired, use malloc'ed memory?
We never call lzma while in CAR. Now that would be kind of silly, would it?
Well, originally ulmza() was designed to be runnable in CAR on the OLPC.
What for? Decompressing to cache? This sounds a bit odd, with a 16kB scratchpad, and only 128KB cache.
Stefan
On 15.10.2009 14:28, Stefan Reinauer wrote:
Carl-Daniel Hailfinger wrote:
On 15.10.2009 13:04, Stefan Reinauer wrote:
ron minnich wrote:
On Mon, Oct 12, 2009 at 8:51 PM, Myles Watson mylesgw@gmail.com wrote:
So I guess the question is how should we make sure the stack and heap are sized correctly. Using malloc to allocate the memory for lzma makes sense, but it is used in CAR too, so that complicates our decision.
lzma decompressor gets a void * from the caller. Caller, if CAR, uses on-stack pointer. RAM code can, if desired, use malloc'ed memory?
We never call lzma while in CAR. Now that would be kind of silly, would it?
Well, originally ulmza() was designed to be runnable in CAR on the OLPC.
What for? Decompressing to cache? This sounds a bit odd, with a 16kB scratchpad, and only 128KB cache.
I didn't say it was a good idea. I had not understood coreboot design well enough to know that decompression would be run after CAR and thought coreboot was running decompression to RAM while the stack still lived in CAR.
Regards, Carl-Daniel
Myles Watson schrieb:
> But we could check the position on the stack compared to the top of > the stack before running LZMA, right? That's hideously complicated. On AMD Fam10, each AP gets its own mini-stack at another location. The code for a stack checker is in v3 and even for the no-SMP case it is really fragile. Add multiple stack sizes and multiple stack locations to it and the code will have to be marked "Do not touch even if you think you understand it". But yes, it can be done.
I just meant compare against the top of all stacks (or the bottom of the heap.) Any checking is better than none.
>>>> Those boards with STACK_SIZE being 0x2000 or 0x8000 are definitely >>>> broken (and if they boot, they do by accident) >>> So since it's broken with Kconfig and newconfig, how can we decide what the >>> correct stack size should be?
Ping.
0x8000 is the minimum for all boards. I think it should be the default.
Carl-Daniel Hailfinger wrote:
- Have the first AP decompress the CBFS member
- Have the BSP decompress the CBFS member the APs want to run
3. Parallelize decompression
//Peter hides
On Sat, Oct 10, 2009 at 3:18 PM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
If you make the stack too large and you have multiple cores in CAR at the same time, the CAR size is too small for all stacks.
Let's please go over this again. Last time I checked, this was the rule:
cores 1 and up don't run in CAR.
Is that true or not. If that is true, then we don't need to worry about this problem. Cores 1 and up don't run in CAR. They are started up by core 0 using DRAM.
The best choice would be to make sure no AP ever uses LZMA.
let's be clear here. When you say AP, do you mean "core > 0" or do mean an AP?
- Have the BSP decompress the CBFS member the APs want to run, then
start the APs. Big benefit here is you can avoid locking and the stack of APs can stay small.
Yes. And it's doable. And, I thought, at least when I worked on the v3 stuff, I did that. Did I not?
ron