ok, here is one more proposal for the eternal pre/post car discussion. I hope this covers all possibilities. It will allow all current implementations to remain unchanged, which is the nice part. But it sets a few simple rules for the next CPUs, such as core 2.
To recap: At some point stage1 needs to turn off CAR. The stack (which includes sysinfo and global variables) needs to be moved to RAM. We play a nice trick on Geode and K8, where there is RAM behind the CAR area: once it is time to turn off CAR, we just "push" the CAR into that RAM location, and continue as though nothing had happened. Although the K8 code is a bit more complex than this simple description, that is more or less what happens.
This trick is not 100% portable. On some systems (core 2) CAR is located where there is NEVER going to be any DRAM.
So, on some systems, once you disable CAR, the CAR data is gone, and so is your stack and sysinfo struct. That CAR data has to be moved to DRAM, and that data will be at a different address.
My concern has been that any pointers on the stack won't survive a stack move. I don't want to get Clever and try to adjust pointers on the stack because, in the limit, you can't distinguish pointers from data.
So we make a rule or two: 1. stage1 never returns. Hence there will never be an issue with stage1 returning to its caller. We need not worry about any stack being lost after stage 1 starts. 2. Stage 1 is the "outer loop". So any pointers in functions that stage1 calls don't matter. 3. in stage1, code should assume, post disable_car, that no pointers used pre disable_car are valid. So you can use pointers in stage1, you just can not share them across the disable_car boundary. Think of disable_car as a kind of "dead pointer zone" across which pointers don't survive. 4. There shall be no pointers in sysinfo. This makes sysinfo location-independent: the variables in sysinfo are valid no matter where sysinfo is locate. So, since sysinfo is location-independent, there are no issues with it being relocated as part of disable_car.
Given these rules, here is how disable_car can work on core 2 and others that don't back CAR with RAM. 1. compute a new stack area. The minimum size is the size of the stack. Note that stack contains sysinfo (at its base). It is acceptable to copy only the "active" stack; it is acceptable to copy all of CAR. It is acceptable to copy more data than the "active" stack and less data than all of CAR. This flexibility makes writing disable_car easier. 2. copy the data to the new stack area in ram 3. disable CAR 4. adjust the return address on stack if needed (unlikely, since we're executing from ROM, but who knows what the future may bring) and then return.
That's about it. This will work I think for core2. It requires zero changes for K8 and geode. I think we can close this discussion. We need to add comments to stage1 to explain proper pointer usage, i.e. that pointers created before disable_car is called must be adjusted or recomputed or not used after disable_car is called.
ron
On Thu, Sep 11, 2008 at 08:42:44AM -0700, ron minnich wrote: [...]
Given these rules, here is how disable_car can work on core 2 and others that don't back CAR with RAM.
- compute a new stack area. The minimum size is the size of the
stack. Note that stack contains sysinfo (at its base). It is acceptable to copy only the "active" stack; it is acceptable to copy all of CAR. It is acceptable to copy more data than the "active" stack and less data than all of CAR. This flexibility makes writing disable_car easier. 2. copy the data to the new stack area in ram 3. disable CAR 4. adjust the return address on stack if needed (unlikely, since we're executing from ROM, but who knows what the future may bring) and then return.
Thanks Ron. This write up was very useful to me.
Your proposal involves moving the stack with additional documentation to help prevent misuses. The idea of moving a stack makes my head hurt though - for example, it isn't immediately clear to me if gcc might take a pointer to a stack variable in some situations.
I'm wondering what you felt was lacking in your previous idea of passing a new stack location/execution address into disable_car()?
Also, in an earlier email I suggested the possibility of switching the stack before calling disable_car:
http://www.coreboot.org/pipermail/coreboot/2008-September/038611.html
do you know why that would not work?
Thanks, -Kevin
On Fri, Sep 12, 2008 at 5:10 AM, Kevin O'Connor kevin@koconnor.net wrote:
I'm wondering what you felt was lacking in your previous idea of passing a new stack location/execution address into disable_car()?
When I floated the idea last year everyone hated it :-)
Also, in an earlier email I suggested the possibility of switching the stack before calling disable_car:
http://www.coreboot.org/pipermail/coreboot/2008-September/038611.html
calling disable_car needs a stack. Where did it come from? You might as well put all the nasty code in one place.
ron
W.r.t. Kevin's question: if we wrote stage1 as follows:
stage1(){ . . . disable_car(); stage1_after_car(); }
How would people feel about that?
There are still real concerns in my mind about lingering addresses in registers that gcc might leave hanging around. The call nicely removes the worries.
ron
On 12.09.2008 18:19, ron minnich wrote:
W.r.t. Kevin's question: if we wrote stage1 as follows:
stage1(){ . . . disable_car(); stage1_after_car(); }
How would people feel about that?
There are still real concerns in my mind about lingering addresses in registers that gcc might leave hanging around. The call nicely removes the worries.
In theory, gcc is free to reload esp from a cached register after disable_car. That would cause pretty explosions due to the stack pointer being in a now invalid location.
Although I really don't like it, I think a safe and still readable way would be to rename stage1() to stage1_early(), rename stage1_after_car() to stage1_late() and rename disable_car() to disable_car_and_continue_at_stage1_late(). disable_car_and_continue_at_stage1_late() would be pure asm (probably even in a .S file and not just inline asm) and could take care of setting up the stack correctly and using the right calling convention. That would work regardless of gcc optimizations.
There is another related problem (but independent of CAR switching): How do we find out where our global variables live? Their location changes at a certain point in time (stack switch) and we need a function which is short, fast and will always tell us about whether the stack is at the in-CAR or post-CAR location. A one-bit variable would be enough, and even better if it is independent from CAR so we can set it during stack switch from asm.
Regards, Carl-Daniel
Carl-Daniel Hailfinger wrote:
On 12.09.2008 18:19, ron minnich wrote:
W.r.t. Kevin's question: if we wrote stage1 as follows:
stage1(){ . . . disable_car(); stage1_after_car(); }
How would people feel about that?
There are still real concerns in my mind about lingering addresses in registers that gcc might leave hanging around. The call nicely removes the worries.
In theory, gcc is free to reload esp from a cached register after disable_car. That would cause pretty explosions due to the stack pointer being in a now invalid location.
gcc is changing esp to absolute values? Why would it do that?
On Fri, Sep 12, 2008 at 12:20 PM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
In theory, gcc is free to reload esp from a cached register after disable_car. That would cause pretty explosions due to the stack pointer being in a now invalid location.
how can this be? I don't see how it can be. Can you provide an example?
My worry is simpler. The compiler is used to assuming that a pointer in a register is valid across function calls. After disable_car the stack will have moved on some CPUs. There is no choice. So all registers are junk after the call to disable_car().
Although I really don't like it, I think a safe and still readable way would be to rename stage1() to stage1_early(), rename stage1_after_car() to stage1_late() and rename disable_car() to disable_car_and_continue_at_stage1_late(). disable_car_and_continue_at_stage1_late() would be pure asm (probably even in a .S file and not just inline asm) and could take care of setting up the stack correctly and using the right calling convention. That would work regardless of gcc optimizations.
The names are too long.
Let's do this: stage0. disable_car. stage1.
stage1 called from disable car .The renaming is pretty easy. Stage0 finally has more work to do :-) The stage0 symbol in the .S file becomes stage0entry
There is another related problem (but independent of CAR switching): How do we find out where our global variables live?
You mean your global variable infrastructure? It should be designed so that given a valid ESP, the global area can be found. The global area should be aligned to CARSIZE. CARSIZE should be a power of two. . To find the globals I think it suffices to do something like this.
void *find_globals() { int i; u32 l;
l = (u32)&i; l &= ~ CARSIZE; return (void )l; }
no assembly.
Sorry to reopen this wound. We had it pretty resolved and then core2 came along ...
ron