Right now we face the problem that we can't support processors which have a CAR area outside the usual RAM area. For those processors, we have to implement a stack copying and switching mechanism. Since gcc can't be told that the stack just moved, split stage1_main() into stage1_main() and stage1_late(). stage1_main() is still the entry point and will handle everything up to the point where we want to disable CAR. Switching the stack, disabling CAR and handling other tasks related to the stack switch (printk buffer move) is all wrapped in the switch_stack() function. switch_stack() will then call stage1_late() which is the second part of the old stage1_main().
Test booted on qemu only.
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
Index: corebootv3-stackrebuild/arch/x86/stage1.c =================================================================== --- corebootv3-stackrebuild/arch/x86/stage1.c (Revision 925) +++ corebootv3-stackrebuild/arch/x86/stage1.c (Arbeitskopie) @@ -143,6 +143,9 @@ return ret; }
+void switch_stack(void); +void __attribute__((stdcall)) stage1_late(void); + /** * This function is called from assembler code with its argument on the * stack. Force the compiler to generate always correct code for this case. @@ -159,24 +162,8 @@ struct global_vars globvars; int ret; struct mem_file archive; - void *entry; struct node_core_id me; -#ifdef CONFIG_PAYLOAD_ELF_LOADER - struct mem_file result; - int elfboot_mem(struct lb_memory *mem, void *where, int size);
- /* Why can't we statically init this hack? */ - unsigned char faker[64]; - struct lb_memory *mem = (struct lb_memory*) faker; - - mem->tag = LB_TAG_MEMORY; - mem->size = 28; - mem->map[0].start.lo = mem->map[0].start.hi = 0; - mem->map[0].size.lo = (32*1024*1024); - mem->map[0].size.hi = 0; - mem->map[0].type = LB_MEM_RAM; -#endif /* CONFIG_PAYLOAD_ELF_LOADER */ - post_code(POST_STAGE1_MAIN);
/* before we do anything, we want to stop if we do not run @@ -234,6 +221,25 @@
printk(BIOS_DEBUG, "Done RAM init code\n");
+ /* Switch the stack location from CAR to RAM, rebuild the stack, + * disable CAR and continue at stage1_late(). This is all wrapped in + * switch_stack to make the code easier to follow. + */ + switch_stack(); +} + +/** + * This function is called to take care of switching and rebuilding the stack + * so that we can cope with processors which don't support a CAR area at low + * addresses where CAR could be copied to RAM without problems. + * After the stack has been rebuilt, we switch the stack pointer to the new + * location, move the printk buffer and cotinue at stage1_late(). + * + * NOTE: switch_stack may need to be reimplemented in processor-specific asm. + * TODO: Reevaluate whether printk_buffer_move should come before disable_car() + */ +void switch_stack() +{ /* Turn off Cache-As-Ram */ disable_car();
@@ -241,7 +247,37 @@ /* Move the printk buffer to PRINTK_BUF_ADDR_RAM */ printk_buffer_move((void *)PRINTK_BUF_ADDR_RAM, PRINTK_BUF_SIZE_RAM); #endif + stage1_late(); +}
+/** + * This function is the second part of stage1_main() after switching the stack + * and disabling CAR. + */ +void __attribute__((stdcall)) stage1_late() +{ + void *entry; + int ret; + struct mem_file archive; +#ifdef CONFIG_PAYLOAD_ELF_LOADER + struct mem_file result; + int elfboot_mem(struct lb_memory *mem, void *where, int size); + + /* Why can't we statically init this hack? */ + unsigned char faker[64]; + struct lb_memory *mem = (struct lb_memory*) faker; + + mem->tag = LB_TAG_MEMORY; + mem->size = 28; + mem->map[0].start.lo = mem->map[0].start.hi = 0; + mem->map[0].size.lo = (32*1024*1024); + mem->map[0].size.hi = 0; + mem->map[0].type = LB_MEM_RAM; +#endif /* CONFIG_PAYLOAD_ELF_LOADER */ + + // location and size of image. + init_archive(&archive); + entry = load_file_segments(&archive, "normal/stage2"); if (entry == (void *)-1) die("FATAL: Failed loading stage2.");
Here is a version I put together yesterday as a straw main.
summary:
stage1_main is now split into stage0_main and main(). stage0_main runs up to and including initram. It then calls disable_car.
disable_car does what it does now: copy CAR stack to ram stack, disable car, BUT: instead of a ret, it does a ljmp to main.
very little of disable_car has to change since we already copy stack to stack. We do need a new constant, RAM_STACK, or some such, so we know where stack goes in RAM.
I don't see a need for swtich_stack, since disable_car pretty much does that now -- it copies stacks, and all it need do is change %esp.
main is main. It calls everything else. I don't like names like xxx_eary and xxx_late-- they don't convey enough information. main() calls stage2 and payload and, later, code to load microcode (needed on k10) and possibly code to set up the resource maps (k8 and k10).
We should move the LAR pointer into global variables so it will be find-able after disable_car. We should not call the LAR init twice.
The bottom_of_stack will now work for any stack, not just CAR stack.Given a properly aligned stack it will always work.
This will fix issues with via and intel.
ron
ron minnich wrote:
Here is a version I put together yesterday as a straw main.
summary:
stage1_main is now split into stage0_main and main(). stage0_main runs up to and including initram. It then calls disable_car.
disable_car does what it does now: copy CAR stack to ram stack, disable car, BUT: instead of a ret, it does a ljmp to main.
Why not ret and do a call (or ljmp) from stage0_main to main(). It would make the code easier to follow and it would be easy to add code if anything were required between disable car and the jmp.
Marc
On 14.10.2008 18:31, Marc Jones wrote:
ron minnich wrote:
Here is a version I put together yesterday as a straw main.
summary:
stage1_main is now split into stage0_main and main(). stage0_main runs up to and including initram. It then calls disable_car.
disable_car does what it does now: copy CAR stack to ram stack, disable car, BUT: instead of a ret, it does a ljmp to main.
Why not ret and do a call (or ljmp) from stage0_main to main(). It would make the code easier to follow and it would be easy to add code if anything were required between disable car and the jmp.
The big problem is that you can't return after having switched the stack, so disable_car() must jump to (or call) the next function. Of course, if stack switching happens earlier and you just need to disable CAR, you're right.
Regards, Carl-Daniel
On Tue, Oct 14, 2008 at 9:31 AM, Marc Jones Marc.Jones@amd.com wrote:
Why not ret and do a call (or ljmp) from stage0_main to main(). It would make the code easier to follow and it would be easy to add code if anything were required between disable car and the jmp.
it's just not doable on some of these hardware implementations. It's desirable but not doable. There are going to be cases where disable_car resumes somewhere else. So we might as well just get it over with and put it in the code ;-)
ron
ron minnich wrote:
On Tue, Oct 14, 2008 at 9:31 AM, Marc Jones Marc.Jones@amd.com wrote:
Why not ret and do a call (or ljmp) from stage0_main to main(). It would make the code easier to follow and it would be easy to add code if anything were required between disable car and the jmp.
it's just not doable on some of these hardware implementations. It's desirable but not doable. There are going to be cases where disable_car resumes somewhere else. So we might as well just get it over with and put it in the code ;-)
What is the hardware limitation? I don't know gcc that well, but a lcall wouldn't get the correct address on the stack? If not, what about passing the return address(or pushing it on the stack). It is more complicated to do but it makes the code easier to understand.
A few other comments from this thread.
- returns bottom of stack, when passed an on-stack variable such as a
- parameter or automatic. Stack base must be STACKSIZE aligned
- and STACKSIZE must be a power of 2 for this to work.
- Luckily these rules have always been true.
Also, that was a good catch by Carl-Daniel about that cache stack size and alignments. CAR is closely related to MTRRs so I think that the MTRR rules should be used here. Or don't have any rules, it is up to the caller to get it correct for their CPU.
Marc
On 14.10.2008 21:10, Marc Jones wrote:
ron minnich wrote:
On Tue, Oct 14, 2008 at 9:31 AM, Marc Jones Marc.Jones@amd.com wrote:
Why not ret and do a call (or ljmp) from stage0_main to main(). It would make the code easier to follow and it would be easy to add code if anything were required between disable car and the jmp.
it's just not doable on some of these hardware implementations. It's desirable but not doable. There are going to be cases where disable_car resumes somewhere else. So we might as well just get it over with and put it in the code ;-)
What is the hardware limitation? I don't know gcc that well, but a lcall wouldn't get the correct address on the stack? If not, what about passing the return address(or pushing it on the stack). It is more complicated to do but it makes the code easier to understand.
Marc, you assume all hardware is sane. Returning is not the problem (actually, returning to the caller is easy to get right). However, gcc is free to access stack slots of the calling function without going through %esp. If such an access happens, the caller of disable_car() will access the old stack location (despite the new %esp) and since that old location isn't backed by RAM/CAR, it will get garbage. Of course, it would be possible to postprocess the gcc output to check for such code, but I consider that to be a really fragile construct which may fail anytime.
A few other comments from this thread.
- returns bottom of stack, when passed an on-stack variable such as a
- parameter or automatic. Stack base must be STACKSIZE aligned
- and STACKSIZE must be a power of 2 for this to work.
- Luckily these rules have always been true.
Also, that was a good catch by Carl-Daniel about that cache stack size and alignments. CAR is closely related to MTRRs so I think that the MTRR rules should be used here. Or don't have any rules, it is up to the caller to get it correct for their CPU.
Thanks! I hope to build something which will meet you approval and be cheap.
How expensive are MTRR readouts? Right now we run bottom_of_stack() once for every character in a printk(). Having an expensive operation there would slow down coreboot in an inacceptable manner.
Regards, Carl-Daniel
On Tue, Oct 14, 2008 at 9:31 AM, Marc Jones Marc.Jones@amd.com wrote:
Why not ret and do a call (or ljmp) from stage0_main to main(). It would make the code easier to follow and it would be easy to add code if anything were required between disable car and the jmp.
OK Marc is right again. Marc, you have to stop this ;-)
So the operation is something like this: disable_car() disable_car copies stacks. The top of stack has a valid ROM address -- this code runs from ROM.
Disable_car returns to stage1. stage1 calls the next function. My only worry is that happens to %ebp but ... this might be workable.
ron
On 14.10.2008 22:21, ron minnich wrote:
On Tue, Oct 14, 2008 at 9:31 AM, Marc Jones Marc.Jones@amd.com wrote:
Why not ret and do a call (or ljmp) from stage0_main to main(). It would make the code easier to follow and it would be easy to add code if anything were required between disable car and the jmp.
OK Marc is right again. Marc, you have to stop this ;-)
So the operation is something like this: disable_car() disable_car copies stacks. The top of stack has a valid ROM address -- this code runs from ROM.
Disable_car returns to stage1. stage1 calls the next function. My only worry is that happens to %ebp but ... this might be workable.
Calling anything after disable_car() returns can only be done reliably if the stack has not moved. You see, gcc is free to reorder stuff as it sees fit and it could insert almost anything between disable_car() and the call to stage1.
Regards, Carl-Daniel
On Tue, Oct 14, 2008 at 2:16 PM, Carl-Daniel Hailfinger
Calling anything after disable_car() returns can only be done reliably if the stack has not moved. You see, gcc is free to reorder stuff as it sees fit and it could insert almost anything between disable_car() and the call to stage1.
you're right too.
In any event, I have never been comfortable with returning from disable_car(), so away we go.
ron
ron minnich wrote:
On Tue, Oct 14, 2008 at 2:16 PM, Carl-Daniel Hailfinger
Calling anything after disable_car() returns can only be done reliably if the stack has not moved. You see, gcc is free to reorder stuff as it sees fit and it could insert almost anything between disable_car() and the call to stage1.
you're right too.
In any event, I have never been comfortable with returning from disable_car(), so away we go.
OK, comment below the call that it should never return and/or put a die() there as well to catch it if it did come back.
Marc
On 14.10.2008 23:35, Marc Jones wrote:
ron minnich wrote:
On Tue, Oct 14, 2008 at 2:16 PM, Carl-Daniel Hailfinger
Calling anything after disable_car() returns can only be done reliably if the stack has not moved. You see, gcc is free to reorder stuff as it sees fit and it could insert almost anything between disable_car() and the call to stage1.
you're right too.
In any event, I have never been comfortable with returning from disable_car(), so away we go.
OK, comment below the call that it should never return and/or put a die() there as well to catch it if it did come back.
Will do so.
Regards, Carl-Daniel
On 14.10.2008 17:43, ron minnich wrote:
Here is a version I put together yesterday as a straw main.
Nice. I'll comment inline.
summary:
stage1_main is now split into stage0_main and main(). stage0_main runs up to and including initram. It then calls disable_car.
I believe the stage0_main name is misleading. After all, stage0 is pure asm and lives in its own .S file.
disable_car does what it does now: copy CAR stack to ram stack, disable car, BUT: instead of a ret, it does a ljmp to main.
We agree on the concept of calling/jumping to the second part of the old stage1_main().
very little of disable_car has to change since we already copy stack to stack. We do need a new constant, RAM_STACK, or some such, so we know where stack goes in RAM.
Yes.
I don't see a need for swtich_stack, since disable_car pretty much does that now -- it copies stacks, and all it need do is change %esp.
switch_stack was intended to act as a placeholder for everything related to stack switching.
main is main. It calls everything else. I don't like names like xxx_eary and xxx_late-- they don't convey enough information. main() calls stage2 and payload and, later, code to load microcode (needed on k10) and possibly code to set up the resource maps (k8 and k10).
I have no strong opinion on the names of functions as long as we don't actively confuse developers.
We should move the LAR pointer into global variables so it will be find-able after disable_car. We should not call the LAR init twice.
Agreed. Can do. Though if we want to store anything in global variables, we have to be careful not to store pointers to something on the stack.
The bottom_of_stack will now work for any stack, not just CAR stack.Given a properly aligned stack it will always work.
Sorry, it won't. Well, it might work by accident for some processors in some pieces of code, but it will definitely fail on real hardware somewhere before/in stage2. I just can't see how you manage to have the stack aligned at a 512k boundary between 640k and 1023k.
This will fix issues with via and intel.
I took the liberty of diffing your code against svn HEAD.
Index: stage1.c
--- stage1.c (Revision 925) +++ stage1.c (Arbeitskopie) @@ -31,13 +31,6 @@ #include <cpu.h> #include <multiboot.h>
-#ifdef CONFIG_PAYLOAD_ELF_LOADER -/* ah, well, what a mess! This is a hard code. FIX ME but how?
- By getting rid of ELF ...
- */
-#define UNCOMPRESS_AREA (0x400000) -#endif /* CONFIG_PAYLOAD_ELF_LOADER */
/* these prototypes should go into headers */ void uart_init(void); void die(const char *msg);
Actually, getting rid of the ELF loader is not something I'm opposed to. But IIRC Stefan wants to be have LAR fixed first. My patch allows us to keep the code even in the stack switch case.
@@ -67,14 +60,27 @@
}
-/*
- The name is slightly misleading because this is the initial stack pointer,
- not the address of the first element on the stack.
+/**
- returns bottom of stack, when passed an on-stack variable such as a
- parameter or automatic. Stack base must be STACKSIZE aligned
- and STACKSIZE must be a power of 2 for this to work.
- Luckily these rules have always been true.
These rules have been violated on so many levels for quite some time or will need to be violated: - Fam10h AP stacks are byte-aligned, not even aligned to 16 bytes or so. - K8 and Fam10h BSP stacks have a maximum CAR size of 48k (not a power of 2) - We must use 48k CAR for newer multicore multiprocessor machines - Although we still can align BSP stacks to a 64k boundary, lzma decompression scratchpad and other stack hogs can cause the stack to grow larger than 64k, giving you a stack base location which is off by 64k or more.
What might work, though, is comparing the current stack location against the predefined values of CAR stack location and the computed value of RAM stack location.
Part of your comment ("passed an on-stack variable") has been obsoleted by your code which can work without that.
- Recall that on stacks that grow down, "bottom of stack" is really
- at "top of memory"!
- This function should work for any stack -- CAR or non-CAR.
*/
void *bottom_of_stack(void) {
- /* -4 because CONFIG_CARBASE + CONFIG_CARSIZE - 4 is initial %esp */
- return (void *)(CONFIG_CARBASE + CONFIG_CARSIZE - 4);
- /* this is weird, eh? Assigning something its own address.
* But it means that we have a %esp value in 'stack'
*/
- u32 stack = (u32) &stack;
- stack &= STACKSIZE;
- stack += STACKSIZE;
- /* -4 because -4 is initial %esp */
- stack -= 4;
- return (void *) stack;
}
struct global_vars *global_vars(void) @@ -85,7 +91,7 @@ void global_vars_init(struct global_vars *globvars) { memset(globvars, 0, sizeof(struct global_vars));
- *(struct global_vars **)(bottom_of_stack() - sizeof(struct global_vars *)) = globvars;
- *(struct global_vars **)(bottom_of_stack(vars) - sizeof(struct global_vars *)) = globvars;
#ifdef CONFIG_CONSOLE_BUFFER /* Initialize the printk buffer. */ printk_buffer_init(); @@ -117,25 +123,6 @@ return ret; }
-#ifdef CONFIG_PAYLOAD_ELF_LOADER -/* until we get rid of elf */ -int legacy(struct mem_file *archive, char *name, void *where, struct lb_memory *mem) -{
- int ret;
- int elfboot_mem(struct lb_memory *mem, void *where, int size);
- ret = copy_file(archive, name, where);
- if (ret) {
printk(BIOS_ERR, "'%s' found, but could not load it.\n", name);
- }
- ret = elfboot_mem(mem, where, archive->reallen);
- printk(BIOS_ERR, "elfboot_mem returns %d\n", ret);
- return -1;
-} -#endif /* CONFIG_PAYLOAD_ELF_LOADER */
static int run_address_multiboot(void *f) { int ret, dummy; @@ -154,31 +141,15 @@
- that we are restarting after some sort of reconfiguration. Note that we could use it on geode but
- do not at present.
*/ -void __attribute__((stdcall)) stage1_main(u32 bist, u32 init_detected) +void __attribute__((stdcall)) stage0_main(u32 bist, u32 init_detected)
stage0 is asm. This is stage1_main() if anything.
{ struct global_vars globvars; int ret; struct mem_file archive; void *entry; struct node_core_id me; -#ifdef CONFIG_PAYLOAD_ELF_LOADER
- struct mem_file result;
- int elfboot_mem(struct lb_memory *mem, void *where, int size);
- post_code(POST_STAGE0_MAIN);
- /* Why can't we statically init this hack? */
- unsigned char faker[64];
- struct lb_memory *mem = (struct lb_memory*) faker;
- mem->tag = LB_TAG_MEMORY;
- mem->size = 28;
- mem->map[0].start.lo = mem->map[0].start.hi = 0;
- mem->map[0].size.lo = (32*1024*1024);
- mem->map[0].size.hi = 0;
- mem->map[0].type = LB_MEM_RAM;
-#endif /* CONFIG_PAYLOAD_ELF_LOADER */
- post_code(POST_STAGE1_MAIN);
- /* before we do anything, we want to stop if we do not run
- on the bootstrap processor.
- stop_ap is responsible for NOT stopping the BSP
@@ -235,8 +206,18 @@ printk(BIOS_DEBUG, "Done RAM init code\n");
/* Turn off Cache-As-Ram */
- /* as a side effect this calls main() */
- /* and copies global_vars from the CAR stack to the RAM stack */ disable_car();
+}
+/** this is called from disable_car. It calls stage2 (which returns to it)
- and hence is not called 'stage1'
- */
+void __attribute__((stdcall)) main(u32 bist, u32 init_detected) +{
#ifdef CONFIG_CONSOLE_BUFFER /* Move the printk buffer to PRINTK_BUF_ADDR_RAM */ printk_buffer_move((void *)PRINTK_BUF_ADDR_RAM, PRINTK_BUF_SIZE_RAM); @@ -251,12 +232,6 @@
printk(BIOS_DEBUG, "Stage2 code done.\n");
-#ifdef CONFIG_PAYLOAD_ELF_LOADER
- ret = find_file(&archive, "normal/payload", &result);
- if (! ret)
legacy(&archive, "normal/payload", (void *)UNCOMPRESS_AREA, mem);
-#endif /* CONFIG_PAYLOAD_ELF_LOADER */
- entry = load_file_segments(&archive, "normal/payload"); if (entry != (void*)-1) { /* Final coreboot call before handing off to the payload. */
@@ -269,3 +244,4 @@ }
Regards, Carl-Daniel
On Tue, Oct 14, 2008 at 9:38 AM, Carl-Daniel Hailfinger
I believe the stage0_main name is misleading. After all, stage0 is pure asm and lives in its own .S file.
let's call it stage1 then and main()
I don't see a need for swtich_stack, since disable_car pretty much does that now -- it copies stacks, and all it need do is change %esp.
switch_stack was intended to act as a placeholder for everything related to stack switching.
OK but just don't forget -- it's hard to call switch_stack. I think it might as well remain where it is -- inline assembly.
I have no strong opinion on the names of functions as long as we don't actively confuse developers.
We should move the LAR pointer into global variables so it will be find-able after disable_car. We should not call the LAR init twice.
Agreed. Can do. Though if we want to store anything in global variables, we have to be careful not to store pointers to something on the stack.
Good point. But I think we're safe on that issue.
The bottom_of_stack will now work for any stack, not just CAR stack.Given a properly aligned stack it will always work.
Sorry, it won't. Well, it might work by accident for some processors in some pieces of code, but it will definitely fail on real hardware somewhere before/in stage2. I just can't see how you manage to have the stack aligned at a 512k boundary between 640k and 1023k.
512k boundary? That code should work if we align to, e.g., an 8k boundary with an 8k stack. Do we ever need more than an 8k stack? I hope not :-)
I took the liberty of diffing your code against svn HEAD.
Index: stage1.c
--- stage1.c (Revision 925) +++ stage1.c (Arbeitskopie) @@ -31,13 +31,6 @@ #include <cpu.h> #include <multiboot.h>
-#ifdef CONFIG_PAYLOAD_ELF_LOADER -/* ah, well, what a mess! This is a hard code. FIX ME but how?
- By getting rid of ELF ...
- */
-#define UNCOMPRESS_AREA (0x400000) -#endif /* CONFIG_PAYLOAD_ELF_LOADER */
/* these prototypes should go into headers */ void uart_init(void); void die(const char *msg);
Actually, getting rid of the ELF loader is not something I'm opposed to. But IIRC Stefan wants to be have LAR fixed first. My patch allows us to keep the code even in the stack switch case.
yeah, let's keep ELF, that's just my obsession with killing it :-)
@@ -67,14 +60,27 @@
}
-/*
- The name is slightly misleading because this is the initial stack pointer,
- not the address of the first element on the stack.
+/**
- returns bottom of stack, when passed an on-stack variable such as a
- parameter or automatic. Stack base must be STACKSIZE aligned
- and STACKSIZE must be a power of 2 for this to work.
- Luckily these rules have always been true.
These rules have been violated on so many levels for quite some time or will need to be violated:
- Fam10h AP stacks are byte-aligned, not even aligned to 16 bytes or so.
- K8 and Fam10h BSP stacks have a maximum CAR size of 48k (not a power of 2)
- We must use 48k CAR for newer multicore multiprocessor machines
- Although we still can align BSP stacks to a 64k boundary, lzma
decompression scratchpad and other stack hogs can cause the stack to grow larger than 64k, giving you a stack base location which is off by 64k or more.
oh. Wow. OK, let's go with a different bottom_of_stack function. Did not realize that LZMA was so stack-hungry.
What might work, though, is comparing the current stack location against the predefined values of CAR stack location and the computed value of RAM stack location.
yes.
If the stack is > CARBASE, then it is the CARBASE + CARSIZE - 4. If < carbase, it is the RAMBASE + RAMSIZE - 4. Done. But let's make it flexible.
Part of your comment ("passed an on-stack variable") has been obsoleted by your code which can work without that.
oops.
- Recall that on stacks that grow down, "bottom of stack" is really
- at "top of memory"!
- This function should work for any stack -- CAR or non-CAR.
*/
void *bottom_of_stack(void) {
/* -4 because CONFIG_CARBASE + CONFIG_CARSIZE - 4 is initial %esp */
return (void *)(CONFIG_CARBASE + CONFIG_CARSIZE - 4);
/* this is weird, eh? Assigning something its own address.
* But it means that we have a %esp value in 'stack'
*/
u32 stack = (u32) &stack;
stack &= STACKSIZE;
stack += STACKSIZE;
/* -4 because -4 is initial %esp */
stack -= 4;
return (void *) stack;
}
ok, it won't work, but you have to admit it's kind of slick.
Do you want to take one more pass at this? Your ideas were more attuned to hardware reality than mine. I think we're close and we need to get Corey supported on via very soon :-)
ron
On 14.10.2008 18:55, ron minnich wrote:
On Tue, Oct 14, 2008 at 9:38 AM, Carl-Daniel Hailfinger
I believe the stage0_main name is misleading. After all, stage0 is pure asm and lives in its own .S file.
let's call it stage1 then and main()
Works for me.
I don't see a need for swtich_stack, since disable_car pretty much does that now -- it copies stacks, and all it need do is change %esp.
switch_stack was intended to act as a placeholder for everything related to stack switching.
OK but just don't forget -- it's hard to call switch_stack. I think it might as well remain where it is -- inline assembly.
Agreed. It was just a skeleton to show others the intended control flow.
I have no strong opinion on the names of functions as long as we don't actively confuse developers.
We should move the LAR pointer into global variables so it will be find-able after disable_car. We should not call the LAR init twice.
Agreed. Can do. Though if we want to store anything in global variables, we have to be careful not to store pointers to something on the stack.
Good point. But I think we're safe on that issue.
Good. A manual code review of global variable additions will work nicely to avoid that.
The bottom_of_stack will now work for any stack, not just CAR stack.Given a properly aligned stack it will always work.
Sorry, it won't. Well, it might work by accident for some processors in some pieces of code, but it will definitely fail on real hardware somewhere before/in stage2. I just can't see how you manage to have the stack aligned at a 512k boundary between 640k and 1023k.
512k boundary? That code should work if we align to, e.g., an 8k boundary with an 8k stack. Do we ever need more than an 8k stack? I hope not :-)
I took the liberty of diffing your code against svn HEAD.
Index: stage1.c
--- stage1.c (Revision 925) +++ stage1.c (Arbeitskopie) @@ -31,13 +31,6 @@ #include <cpu.h> #include <multiboot.h>
-#ifdef CONFIG_PAYLOAD_ELF_LOADER -/* ah, well, what a mess! This is a hard code. FIX ME but how?
- By getting rid of ELF ...
- */
-#define UNCOMPRESS_AREA (0x400000) -#endif /* CONFIG_PAYLOAD_ELF_LOADER */
/* these prototypes should go into headers */ void uart_init(void); void die(const char *msg);
Actually, getting rid of the ELF loader is not something I'm opposed to. But IIRC Stefan wants to be have LAR fixed first. My patch allows us to keep the code even in the stack switch case.
yeah, let's keep ELF, that's just my obsession with killing it :-)
Oh, I'd also see it go away rather sooner than later.
@@ -67,14 +60,27 @@
}
-/*
- The name is slightly misleading because this is the initial stack pointer,
- not the address of the first element on the stack.
+/**
- returns bottom of stack, when passed an on-stack variable such as a
- parameter or automatic. Stack base must be STACKSIZE aligned
- and STACKSIZE must be a power of 2 for this to work.
- Luckily these rules have always been true.
These rules have been violated on so many levels for quite some time or will need to be violated:
- Fam10h AP stacks are byte-aligned, not even aligned to 16 bytes or so.
- K8 and Fam10h BSP stacks have a maximum CAR size of 48k (not a power of 2)
- We must use 48k CAR for newer multicore multiprocessor machines
- Although we still can align BSP stacks to a 64k boundary, lzma
decompression scratchpad and other stack hogs can cause the stack to grow larger than 64k, giving you a stack base location which is off by 64k or more.
oh. Wow. OK, let's go with a different bottom_of_stack function. Did not realize that LZMA was so stack-hungry.
What might work, though, is comparing the current stack location against the predefined values of CAR stack location and the computed value of RAM stack location.
yes.
If the stack is > CARBASE, then it is the CARBASE + CARSIZE - 4. If < carbase, it is the RAMBASE + RAMSIZE - 4. Done. But let's make it flexible.
Hm yes, something along these lines. Which leads to an interesting question: Where do we want the stack if we have to move it? Top of memory?
Part of your comment ("passed an on-stack variable") has been obsoleted by your code which can work without that.
oops.
I always take it as a good sign if the code is ahead of the comments.
- Recall that on stacks that grow down, "bottom of stack" is really
- at "top of memory"!
- This function should work for any stack -- CAR or non-CAR.
*/
void *bottom_of_stack(void) {
/* -4 because CONFIG_CARBASE + CONFIG_CARSIZE - 4 is initial %esp */
return (void *)(CONFIG_CARBASE + CONFIG_CARSIZE - 4);
/* this is weird, eh? Assigning something its own address.
* But it means that we have a %esp value in 'stack'
*/
u32 stack = (u32) &stack;
stack &= STACKSIZE;
stack += STACKSIZE;
/* -4 because -4 is initial %esp */
stack -= 4;
return (void *) stack;
}
ok, it won't work, but you have to admit it's kind of slick.
Yes, absolutely. I had such code in my tree before I first submitted the framework, but then I discovered that it wouldn't work always.
Do you want to take one more pass at this? Your ideas were more attuned to hardware reality than mine. I think we're close and we need to get Corey supported on via very soon :-)
I can mail a new version to the list in ~2 hours. If that's OK for you, I'll go ahead.
Regards Carl-Daniel
On Tue, Oct 14, 2008 at 10:57 AM, Carl-Daniel Hailfinger
I can mail a new version to the list in ~2 hours. If that's OK for you, I'll go ahead.
perfect.
ron
Carl-Daniel Hailfinger wrote:
If the stack is > CARBASE, then it is the CARBASE + CARSIZE - 4. If < carbase, it is the RAMBASE + RAMSIZE - 4. Done. But let's make it flexible.
Hm yes, something along these lines. Which leads to an interesting question: Where do we want the stack if we have to move it? Top of memory?
That should be specific to the CPU or platform but it would be good to be below 1MB. 0x1000 is a normal location. This will allow the stack to be shared if/when coreboot goes to real mode (SeaBIOS integration discussion).
Marc
On Tue, Oct 14, 2008 at 12:15 PM, Marc Jones Marc.Jones@amd.com wrote:
That should be specific to the CPU or platform but it would be good to be below 1MB. 0x1000 is a normal location. This will allow the stack to be shared if/when coreboot goes to real mode (SeaBIOS integration discussion).
works for me. But page0 is getting crowded then. maybe we should reserve seg 0 for stack/global variables? Then stack can start at 0xfffc (32-bit aligned at top of stack). Code can start at 0x10000.
I sure wish CAR was just sensible on all machines :-)
ron
On 14.10.2008 21:15, Marc Jones wrote:
Carl-Daniel Hailfinger wrote:
If the stack is > CARBASE, then it is the CARBASE + CARSIZE - 4. If < carbase, it is the RAMBASE + RAMSIZE - 4. Done. But let's make it flexible.
Hm yes, something along these lines. Which leads to an interesting question: Where do we want the stack if we have to move it? Top of memory?
That should be specific to the CPU or platform but it would be good to be below 1MB. 0x1000 is a normal location. This will allow the stack to be shared if/when coreboot goes to real mode (SeaBIOS integration discussion).
Please think of the children^W^W Suspend-to-RAM. We can't clobber arbitrary areas. However, it is entirely possible to back up the new stack location before switching to it. In fact, I plan to write code for that soon.
Oh, and can we rely on APs never coming that far in the code? Switching their stacks as well would be a nightmare.
Regards, Carl-Daniel
Carl-Daniel Hailfinger wrote:
I believe the stage0_main name is misleading. After all, stage0 is pure asm and lives in its own .S file.
let's call it stage1 then and main()
Works for me.
I'm afraid I don't like that.
Please suggest something that makes the timeline obvious. I think we already have other problems like this in v3.
I would be OK with adding phases to stage1 e.g. but I have also contemplated flattening the stage/phase tree to only have stages and no phases - though that doesn't have to happen right now.
//Peter
On Tue, Oct 14, 2008 at 12:15 PM, Peter Stuge peter@stuge.se wrote:
Carl-Daniel Hailfinger wrote:
I believe the stage0_main name is misleading. After all, stage0 is pure asm and lives in its own .S file.
let's call it stage1 then and main()
Works for me.
I'm afraid I don't like that.
Please suggest something that makes the timeline obvious. I think we already have other problems like this in v3.
ok. pick one.
I would be OK with adding phases to stage1 e.g. but I have also contemplated flattening the stage/phase tree to only have stages and no phases - though that doesn't have to happen right now.
oh not that. You don't put all your files in one directory do you :-)
There's nothing that wrong with a two level hierarchy. We even do it in the post codes, I think we can do it via stagex/phasex type concepts
ron
On 14.10.2008 21:15, Peter Stuge wrote:
Carl-Daniel Hailfinger wrote:
I believe the stage0_main name is misleading. After all, stage0 is pure asm and lives in its own .S file.
let's call it stage1 then and main()
Works for me.
I'm afraid I don't like that.
Please suggest something that makes the timeline obvious. I think we already have other problems like this in v3.
I would be OK with adding phases to stage1 e.g. but I have also contemplated flattening the stage/phase tree to only have stages and no phases - though that doesn't have to happen right now.
OK, so can we settle on phases?
old -> new stage1_main before disable_car-> stage1_phase1 disable_car and stack switch -> stage1_phase2 stage1_main after stack switch -> stage1_phase3
And yes, I'm aware that the patch below doesn't change the corresponding asm code yet. I'll work on that as soon as we have agreed on stage1_* naming.
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
Index: corebootv3-stackrebuild/arch/x86/stage1.c =================================================================== --- corebootv3-stackrebuild/arch/x86/stage1.c (Revision 925) +++ corebootv3-stackrebuild/arch/x86/stage1.c (Arbeitskopie) @@ -143,6 +143,9 @@ return ret; }
+void stage1_phase2(void); +void __attribute__((stdcall)) stage1_phase3(void); + /** * This function is called from assembler code with its argument on the * stack. Force the compiler to generate always correct code for this case. @@ -154,29 +157,13 @@ * that we are restarting after some sort of reconfiguration. Note that we could use it on geode but * do not at present. */ -void __attribute__((stdcall)) stage1_main(u32 bist, u32 init_detected) +void __attribute__((stdcall)) stage1_phase1(u32 bist, u32 init_detected) { struct global_vars globvars; int ret; struct mem_file archive; - void *entry; struct node_core_id me; -#ifdef CONFIG_PAYLOAD_ELF_LOADER - struct mem_file result; - int elfboot_mem(struct lb_memory *mem, void *where, int size);
- /* Why can't we statically init this hack? */ - unsigned char faker[64]; - struct lb_memory *mem = (struct lb_memory*) faker; - - mem->tag = LB_TAG_MEMORY; - mem->size = 28; - mem->map[0].start.lo = mem->map[0].start.hi = 0; - mem->map[0].size.lo = (32*1024*1024); - mem->map[0].size.hi = 0; - mem->map[0].type = LB_MEM_RAM; -#endif /* CONFIG_PAYLOAD_ELF_LOADER */ - post_code(POST_STAGE1_MAIN);
/* before we do anything, we want to stop if we do not run @@ -234,6 +221,29 @@
printk(BIOS_DEBUG, "Done RAM init code\n");
+ /* Switch the stack location from CAR to RAM, rebuild the stack, + * disable CAR and continue at stage1_phase3(). This is all wrapped in + * stage1_phase2() to make the code easier to follow. + * We will NEVER return. + */ + stage1_phase2(); + + /* If we reach this point, something went terribly wrong. */ + die("The world is broken.\n"); +} + +/** + * This function is called to take care of switching and rebuilding the stack + * so that we can cope with processors which don't support a CAR area at low + * addresses where CAR could be copied to RAM without problems. + * After the stack has been rebuilt, we switch the stack pointer to the new + * location, move the printk buffer and cotinue at stage1_phase3(). + * + * NOTE: switch_stack may need to be reimplemented in processor-specific asm. + * TODO: Reevaluate whether printk_buffer_move should come before disable_car() + */ +void stage1_phase2() +{ /* Turn off Cache-As-Ram */ disable_car();
@@ -241,7 +251,37 @@ /* Move the printk buffer to PRINTK_BUF_ADDR_RAM */ printk_buffer_move((void *)PRINTK_BUF_ADDR_RAM, PRINTK_BUF_SIZE_RAM); #endif + stage1_phase3(); +}
+/** + * This function is the second part of the former stage1_main() after + * switching the stack and disabling CAR. + */ +void __attribute__((stdcall)) stage1_phase3() +{ + void *entry; + int ret; + struct mem_file archive; +#ifdef CONFIG_PAYLOAD_ELF_LOADER + struct mem_file result; + int elfboot_mem(struct lb_memory *mem, void *where, int size); + + /* Why can't we statically init this hack? */ + unsigned char faker[64]; + struct lb_memory *mem = (struct lb_memory*) faker; + + mem->tag = LB_TAG_MEMORY; + mem->size = 28; + mem->map[0].start.lo = mem->map[0].start.hi = 0; + mem->map[0].size.lo = (32*1024*1024); + mem->map[0].size.hi = 0; + mem->map[0].type = LB_MEM_RAM; +#endif /* CONFIG_PAYLOAD_ELF_LOADER */ + + // location and size of image. + init_archive(&archive); + entry = load_file_segments(&archive, "normal/stage2"); if (entry == (void *)-1) die("FATAL: Failed loading stage2.");
On Wed, Oct 15, 2008 at 4:46 AM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
On 14.10.2008 21:15, Peter Stuge wrote:
Carl-Daniel Hailfinger wrote:
I believe the stage0_main name is misleading. After all, stage0 is pure asm and lives in its own .S file.
let's call it stage1 then and main()
Works for me.
I'm afraid I don't like that.
Please suggest something that makes the timeline obvious. I think we already have other problems like this in v3.
I would be OK with adding phases to stage1 e.g. but I have also contemplated flattening the stage/phase tree to only have stages and no phases - though that doesn't have to happen right now.
OK, so can we settle on phases?
old -> new stage1_main before disable_car-> stage1_phase1 disable_car and stack switch -> stage1_phase2 stage1_main after stack switch -> stage1_phase3
sounds good.
And yes, I'm aware that the patch below doesn't change the corresponding asm code yet. I'll work on that as soon as we have agreed on stage1_* naming.
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
Index: corebootv3-stackrebuild/arch/x86/stage1.c
--- corebootv3-stackrebuild/arch/x86/stage1.c (Revision 925) +++ corebootv3-stackrebuild/arch/x86/stage1.c (Arbeitskopie) @@ -143,6 +143,9 @@ return ret; }
+void stage1_phase2(void); +void __attribute__((stdcall)) stage1_phase3(void);
/**
- This function is called from assembler code with its argument on the
- stack. Force the compiler to generate always correct code for this case.
@@ -154,29 +157,13 @@
- that we are restarting after some sort of reconfiguration. Note that we could use it on geode but
- do not at present.
*/ -void __attribute__((stdcall)) stage1_main(u32 bist, u32 init_detected) +void __attribute__((stdcall)) stage1_phase1(u32 bist, u32 init_detected) { struct global_vars globvars; int ret; struct mem_file archive;
void *entry; struct node_core_id me;
-#ifdef CONFIG_PAYLOAD_ELF_LOADER
struct mem_file result;
int elfboot_mem(struct lb_memory *mem, void *where, int size);
/* Why can't we statically init this hack? */
unsigned char faker[64];
struct lb_memory *mem = (struct lb_memory*) faker;
mem->tag = LB_TAG_MEMORY;
mem->size = 28;
mem->map[0].start.lo = mem->map[0].start.hi = 0;
mem->map[0].size.lo = (32*1024*1024);
mem->map[0].size.hi = 0;
mem->map[0].type = LB_MEM_RAM;
-#endif /* CONFIG_PAYLOAD_ELF_LOADER */
post_code(POST_STAGE1_MAIN); /* before we do anything, we want to stop if we do not run
@@ -234,6 +221,29 @@
printk(BIOS_DEBUG, "Done RAM init code\n");
/* Switch the stack location from CAR to RAM, rebuild the stack,
* disable CAR and continue at stage1_phase3(). This is all wrapped in
* stage1_phase2() to make the code easier to follow.
* We will NEVER return.
*/
stage1_phase2();
/* If we reach this point, something went terribly wrong. */
die("The world is broken.\n");
+}
+/**
- This function is called to take care of switching and rebuilding the stack
- so that we can cope with processors which don't support a CAR area at low
- addresses where CAR could be copied to RAM without problems.
- After the stack has been rebuilt, we switch the stack pointer to the new
- location, move the printk buffer and cotinue at stage1_phase3().
- NOTE: switch_stack may need to be reimplemented in processor-specific asm.
- TODO: Reevaluate whether printk_buffer_move should come before disable_car()
- */
+void stage1_phase2() +{ /* Turn off Cache-As-Ram */ disable_car();
@@ -241,7 +251,37 @@ /* Move the printk buffer to PRINTK_BUF_ADDR_RAM */ printk_buffer_move((void *)PRINTK_BUF_ADDR_RAM, PRINTK_BUF_SIZE_RAM); #endif
stage1_phase3();
+}
per our earlier discussion, do you really want to try to call more functions in stage1_phase2 after you've changed the stack, given assumptions gcc may have made? Or do you want the printk_buffer_move in stage1_phase3?
Either way, we have to move forward, let's see the assembly, I'll test on geode to make sure there is no breakage, then we can move forward for Corey.
ron
ron
On 15.10.2008 17:31, ron minnich wrote:
On Wed, Oct 15, 2008 at 4:46 AM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
On 14.10.2008 21:15, Peter Stuge wrote:
Carl-Daniel Hailfinger wrote:
I believe the stage0_main name is misleading. After all, stage0 is pure asm and lives in its own .S file.
let's call it stage1 then and main()
Works for me.
I'm afraid I don't like that.
Please suggest something that makes the timeline obvious. I think we already have other problems like this in v3.
I would be OK with adding phases to stage1 e.g. but I have also contemplated flattening the stage/phase tree to only have stages and no phases - though that doesn't have to happen right now.
OK, so can we settle on phases?
old -> new stage1_main before disable_car-> stage1_phase1 disable_car and stack switch -> stage1_phase2 stage1_main after stack switch -> stage1_phase3
sounds good.
And yes, I'm aware that the patch below doesn't change the corresponding asm code yet. I'll work on that as soon as we have agreed on stage1_* naming.
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
Index: corebootv3-stackrebuild/arch/x86/stage1.c
--- corebootv3-stackrebuild/arch/x86/stage1.c (Revision 925) +++ corebootv3-stackrebuild/arch/x86/stage1.c (Arbeitskopie) @@ -143,6 +143,9 @@ return ret; }
+void stage1_phase2(void); +void __attribute__((stdcall)) stage1_phase3(void);
/**
- This function is called from assembler code with its argument on the
- stack. Force the compiler to generate always correct code for this case.
@@ -154,29 +157,13 @@
- that we are restarting after some sort of reconfiguration. Note that we could use it on geode but
- do not at present.
*/ -void __attribute__((stdcall)) stage1_main(u32 bist, u32 init_detected) +void __attribute__((stdcall)) stage1_phase1(u32 bist, u32 init_detected) { struct global_vars globvars; int ret; struct mem_file archive;
void *entry; struct node_core_id me;
-#ifdef CONFIG_PAYLOAD_ELF_LOADER
struct mem_file result;
int elfboot_mem(struct lb_memory *mem, void *where, int size);
/* Why can't we statically init this hack? */
unsigned char faker[64];
struct lb_memory *mem = (struct lb_memory*) faker;
mem->tag = LB_TAG_MEMORY;
mem->size = 28;
mem->map[0].start.lo = mem->map[0].start.hi = 0;
mem->map[0].size.lo = (32*1024*1024);
mem->map[0].size.hi = 0;
mem->map[0].type = LB_MEM_RAM;
-#endif /* CONFIG_PAYLOAD_ELF_LOADER */
post_code(POST_STAGE1_MAIN); /* before we do anything, we want to stop if we do not run
@@ -234,6 +221,29 @@
printk(BIOS_DEBUG, "Done RAM init code\n");
/* Switch the stack location from CAR to RAM, rebuild the stack,
* disable CAR and continue at stage1_phase3(). This is all wrapped in
* stage1_phase2() to make the code easier to follow.
* We will NEVER return.
*/
stage1_phase2();
/* If we reach this point, something went terribly wrong. */
die("The world is broken.\n");
+}
+/**
- This function is called to take care of switching and rebuilding the stack
- so that we can cope with processors which don't support a CAR area at low
- addresses where CAR could be copied to RAM without problems.
- After the stack has been rebuilt, we switch the stack pointer to the new
- location, move the printk buffer and cotinue at stage1_phase3().
- NOTE: switch_stack may need to be reimplemented in processor-specific asm.
- TODO: Reevaluate whether printk_buffer_move should come before disable_car()
- */
+void stage1_phase2() +{ /* Turn off Cache-As-Ram */ disable_car();
@@ -241,7 +251,37 @@ /* Move the printk buffer to PRINTK_BUF_ADDR_RAM */ printk_buffer_move((void *)PRINTK_BUF_ADDR_RAM, PRINTK_BUF_SIZE_RAM); #endif
stage1_phase3();
+}
per our earlier discussion, do you really want to try to call more functions in stage1_phase2 after you've changed the stack, given assumptions gcc may have made?
Absolutely not. You're right about this. However, right now my focus is on keeping changes isolated and I'd take care of this in a separate patch.
Or do you want the printk_buffer_move in stage1_phase3?
I'd prefer to have printk_buffer_move as first instruction in stage1_phase2.
Either way, we have to move forward, let's see the assembly, I'll test on geode to make sure there is no breakage, then we can move forward for Corey.
I'll be back in ~3 hours. If you have acked this in the meanwhile, I will commit, otherwise I'll rework.
Regards, Carl-Daniel
On Wed, Oct 15, 2008 at 10:04 AM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
I'll be back in ~3 hours. If you have acked this in the meanwhile, I will commit, otherwise I'll rework.
let's get a complete patch set, then I will test on geode and simnow tonight. This all looks fine, but it makes me nervous.
ron
On 15.10.2008 19:06, ron minnich wrote:
On Wed, Oct 15, 2008 at 10:04 AM, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
I'll be back in ~3 hours. If you have acked this in the meanwhile, I will commit, otherwise I'll rework.
let's get a complete patch set, then I will test on geode and simnow tonight. This all looks fine, but it makes me nervous.
Sure. Final version with lots and lots of comments follows. Test booted in Qemu. The patch even fixes up the documentation!
Notes about this patch: - Code flow is almost unchanged for Qemu, K8 and Geode. No extensive new testing required. - We can support stack-keeping and stack-relocating architectures at the same time, so C7 is definitely supportable - The comment in stage1_phase2 says "some of this is not yet done". That refers to the nonexisting code for stack switching on C7. - "Minimal changes, maximum benefit".
Signed-off-by: Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net
Index: corebootv3-stackrebuild/include/arch/x86/stage1.h =================================================================== --- corebootv3-stackrebuild/include/arch/x86/stage1.h (Revision 0) +++ corebootv3-stackrebuild/include/arch/x86/stage1.h (Revision 0) @@ -0,0 +1,2 @@ +void stage1_phase2(void); +void __attribute__((stdcall)) stage1_phase3(void); Index: corebootv3-stackrebuild/mainboard/emulation/qemu-x86/stage1.c =================================================================== --- corebootv3-stackrebuild/mainboard/emulation/qemu-x86/stage1.c (Revision 931) +++ corebootv3-stackrebuild/mainboard/emulation/qemu-x86/stage1.c (Arbeitskopie) @@ -17,6 +17,8 @@ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */
+#include <stage1.h> + /* printk() will not yet output anything. */
/** @@ -34,6 +36,7 @@
void disable_car(void) { + stage1_phase3(); }
Index: corebootv3-stackrebuild/doc/design/newboot.lyx =================================================================== --- corebootv3-stackrebuild/doc/design/newboot.lyx (Revision 931) +++ corebootv3-stackrebuild/doc/design/newboot.lyx (Arbeitskopie) @@ -759,7 +759,7 @@ \end_layout
\begin_layout Standard -Initial entry point for stage 1 is arch/{architecture}/stage1.c:stage1_main(). +Initial entry point for stage 1 is arch/{architecture}/stage1.c:stage1_phase1(). \end_layout
\begin_layout Standard Index: corebootv3-stackrebuild/arch/x86/via/stage0.S =================================================================== --- corebootv3-stackrebuild/arch/x86/via/stage0.S (Revision 931) +++ corebootv3-stackrebuild/arch/x86/via/stage0.S (Arbeitskopie) @@ -189,7 +189,7 @@ pushl $0 /* First parameter: bist */ pushl %eax - call stage1_main + call stage1_phase1 /* We will not go back. */
fixed_mtrr_msr: Index: corebootv3-stackrebuild/arch/x86/geodelx/stage0.S =================================================================== --- corebootv3-stackrebuild/arch/x86/geodelx/stage0.S (Revision 931) +++ corebootv3-stackrebuild/arch/x86/geodelx/stage0.S (Arbeitskopie) @@ -271,7 +271,7 @@ pushl $0 /* First parameter: bist */ pushl %eax - call stage1_main + call stage1_phase1 /* We will not go back. */
#include "../stage0_common.S" Index: corebootv3-stackrebuild/arch/x86/geodelx/stage1.c =================================================================== --- corebootv3-stackrebuild/arch/x86/geodelx/stage1.c (Revision 931) +++ corebootv3-stackrebuild/arch/x86/geodelx/stage1.c (Arbeitskopie) @@ -24,6 +24,7 @@ #include <amd_geodelx.h> #include <console.h> #include <msr.h> +#include <stage1.h>
static const struct msrinit msr_table[] = { /* Setup access to cache under 1MB. */ @@ -93,4 +94,5 @@ banner(BIOS_DEBUG, "Disable_car: done wbinvd"); northbridge_init_early(); banner(BIOS_DEBUG, "disable_car: done"); + stage1_phase3(); } Index: corebootv3-stackrebuild/arch/x86/i586/stage0.S =================================================================== --- corebootv3-stackrebuild/arch/x86/i586/stage0.S (Revision 931) +++ corebootv3-stackrebuild/arch/x86/i586/stage0.S (Arbeitskopie) @@ -339,7 +339,7 @@ pushl $0 /* First parameter: bist */ pushl %eax - call stage1_main + call stage1_phase1 /* We will not go back. */
fixed_mtrr_msr: Index: corebootv3-stackrebuild/arch/x86/amd/k8/stage1.c =================================================================== --- corebootv3-stackrebuild/arch/x86/amd/k8/stage1.c (Revision 931) +++ corebootv3-stackrebuild/arch/x86/amd/k8/stage1.c (Arbeitskopie) @@ -26,6 +26,7 @@ #include <macros.h> #include <cpu.h> #include <amd/k8/k8.h> +#include <stage1.h>
/** * Set the MTRR for initial ram access. @@ -73,4 +74,5 @@ /* we're now running in ram. Although this will be called again, it does no harm to call it here. */ set_init_ram_access(); banner(BIOS_DEBUG, "disable_car: done"); + stage1_phase3(); } Index: corebootv3-stackrebuild/arch/x86/amd/stage0.S =================================================================== --- corebootv3-stackrebuild/arch/x86/amd/stage0.S (Revision 931) +++ corebootv3-stackrebuild/arch/x86/amd/stage0.S (Arbeitskopie) @@ -341,7 +341,7 @@ pushl %ebx /* First parameter: bist */ pushl %eax - call stage1_main + call stage1_phase1 /* We will not go back. */
movb $0xAF, %al /* Should never see this postcode */ Index: corebootv3-stackrebuild/arch/x86/stage1.c =================================================================== --- corebootv3-stackrebuild/arch/x86/stage1.c (Revision 931) +++ corebootv3-stackrebuild/arch/x86/stage1.c (Arbeitskopie) @@ -30,6 +30,7 @@ #include <mc146818rtc.h> #include <cpu.h> #include <multiboot.h> +#include <stage1.h>
#ifdef CONFIG_PAYLOAD_ELF_LOADER /* ah, well, what a mess! This is a hard code. FIX ME but how? @@ -154,29 +155,13 @@ * that we are restarting after some sort of reconfiguration. Note that we could use it on geode but * do not at present. */ -void __attribute__((stdcall)) stage1_main(u32 bist, u32 init_detected) +void __attribute__((stdcall)) stage1_phase1(u32 bist, u32 init_detected) { struct global_vars globvars; int ret; struct mem_file archive; - void *entry; struct node_core_id me; -#ifdef CONFIG_PAYLOAD_ELF_LOADER - struct mem_file result; - int elfboot_mem(struct lb_memory *mem, void *where, int size);
- /* Why can't we statically init this hack? */ - unsigned char faker[64]; - struct lb_memory *mem = (struct lb_memory*) faker; - - mem->tag = LB_TAG_MEMORY; - mem->size = 28; - mem->map[0].start.lo = mem->map[0].start.hi = 0; - mem->map[0].size.lo = (32*1024*1024); - mem->map[0].size.hi = 0; - mem->map[0].type = LB_MEM_RAM; -#endif /* CONFIG_PAYLOAD_ELF_LOADER */ - post_code(POST_STAGE1_MAIN);
/* before we do anything, we want to stop if we do not run @@ -234,14 +219,77 @@
printk(BIOS_DEBUG, "Done RAM init code\n");
- /* Turn off Cache-As-Ram */ - disable_car(); + /* Switch the stack location from CAR to RAM, rebuild the stack, + * disable CAR and continue at stage1_phase3(). This is all wrapped in + * stage1_phase2() to make the code easier to follow. + * We will NEVER return. + */ + stage1_phase2();
+ /* If we reach this point, something went terribly wrong. */ + die("The world is broken.\n"); +} + +/** + * This function is called to take care of switching and rebuilding the stack + * so that we can cope with processors which don't support a CAR area at low + * addresses where CAR could be copied to RAM without problems. + * This function handles everything related to switching off CAR and moving + * important data from CAR to RAM. + * 1. Perform all work which can be done while CAR and RAM are both active. + * That's mainly moving the printk buffer around. + * 2a. Optionally back up the new stack location (desirable for S3). + * 2b. Optionally rebuild the stack at another location. + * 2c. Switch stack pointer to the new stack if the stack was rebuilt. + * 3. Disable CAR. + * 4. Call or jump to stage1_phase3. + * Steps 2a-4 have to be done in asm. That's what the oddly named disable_car() + * function does. + * + * TODO: Some parts of the list above are not yet done, so the code will not + * yet work on C7. + */ +void stage1_phase2() +{ #ifdef CONFIG_CONSOLE_BUFFER /* Move the printk buffer to PRINTK_BUF_ADDR_RAM */ printk_buffer_move((void *)PRINTK_BUF_ADDR_RAM, PRINTK_BUF_SIZE_RAM); #endif + /* Turn off Cache-As-Ram */ + disable_car();
+ /* If we reach this point, something went terribly wrong. */ + die("The world is broken.\n"); +} + +/** + * This function is the second part of the former stage1_main() after + * switching the stack and disabling CAR. + */ +void __attribute__((stdcall)) stage1_phase3() +{ + void *entry; + int ret; + struct mem_file archive; +#ifdef CONFIG_PAYLOAD_ELF_LOADER + struct mem_file result; + int elfboot_mem(struct lb_memory *mem, void *where, int size); + + /* Why can't we statically init this hack? */ + unsigned char faker[64]; + struct lb_memory *mem = (struct lb_memory*) faker; + + mem->tag = LB_TAG_MEMORY; + mem->size = 28; + mem->map[0].start.lo = mem->map[0].start.hi = 0; + mem->map[0].size.lo = (32*1024*1024); + mem->map[0].size.hi = 0; + mem->map[0].type = LB_MEM_RAM; +#endif /* CONFIG_PAYLOAD_ELF_LOADER */ + + // location and size of image. + init_archive(&archive); + entry = load_file_segments(&archive, "normal/stage2"); if (entry == (void *)-1) die("FATAL: Failed loading stage2.");
you can take this ack for this and the other patch
Acked-by: Ronald G. Minnich rminnich@gmail.com
On 16.10.2008 04:28, ron minnich wrote:
you can take this ack for this and the other patch
Acked-by: Ronald G. Minnich rminnich@gmail.com
Thanks, r932.
Regards, Carl-Daniel
On Wed, Oct 15, 2008 at 01:46:49PM +0200, Carl-Daniel Hailfinger wrote:
OK, so can we settle on phases?
old -> new stage1_main before disable_car-> stage1_phase1 disable_car and stack switch -> stage1_phase2 stage1_main after stack switch -> stage1_phase3
Hi Carl-Daniel,
I think breaking stage1 up into multiple parts makes a lot of sense. Thanks for working on this.
-Kevin