Hi guys,
I'm still hard at work on the venerable (even "almighty" at the time) 440BX and Slot 1 boards.
[1] https://review.coreboot.org/c/20977/
And now I'm stuck and thoroughly confused.
My current state is:
1. cbmem_initialize_empty() failed to even start allocating the root CBMEM entry. No indication why. I tried tracing the code path in the sources and still could not find out where exactly it failed. With enough fiddling I did get it to complain the way Aaron expected [1].
2. Using the common Intel CPU cache_as_ram.inc, I can get through mainboard romstage and memory init. If I just return a fixed CONFIG_TOPMEM in setup_stack_and_mtrrs() like what was done in cpu/intel/nehalem, it got past the point of POST_PREPARE_RAMSTAGE and then nothing.
3. Porting the postcar frame assembly from cpu/intel/car/cache_as_ram_ht.inc results in a failure somewhere before loading ramstage and after
4. If I try to run this build under QEMU, it fails with "Trying to execute code outside RAM or ROM at 0x000a0000" in 440BX RAM init code after dumping the "before" northbridge config, so I can't correctly debug it this way either.
I think there is still no clear enough description of how this dynamic CBMEM allocation works. Below is my take, and I stand ready to be corrected:
1. Reset vector -> protected mode -> southbridge bootblock init for ROM -> bootblock walks CBFS to find romstage.
2. CAR setup in cpu/intel/car/cache_as_ram.inc
3. romstage_main() -> sets up a stack guard -> mainboard_romstage_entry() -> checks for overrunning of said stack guard
4. romstage calls cbmem_initialize_empty()
5. Somewhere along the cbmem_initialize() path cbmem_top() is called to find out where the top of RAM is. CBMEM is supposed to allocate memory so that it grows down from there.
6. setup_stack_and_mtrrs() -> allocates a 20KB postcar stack in CBMEM and adds MTRR settings as appropriate. These get pushed onto the postcar stack in order of popping:
    maximum number of supported variable MTRRs  <- bottom of stack
    number of actual used MTRRs
    MTRR 0 base, low 32 bits
    MTRR 0 base, high 32 bits
    MTRR 0 mask, low 32 bits
    MTRR 0 mask, high 32 bits
    ...
And this stack should be in RAM, supposedly in an area allocated within CBMEM.
7. This postcar stack pointer is returned via romstage_main(). It gets loaded into %esp, effectively switching the code onto a RAM stack with all the MTRR info on it.
8. Tear down CAR.
9. Wipe all the MTRRs.
10. MTRR values are popped off the stack and loaded into the variable MTRRs.
11. Now the CPU should be at the top of the allocated RAM stack and can move on to load the ramstage.
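To make my reading of the frame layout in step 6 concrete, here is a minimal sketch in plain C. It is only my understanding written down, not the actual coreboot code: struct var_mtrr, stack_push() and build_mtrr_frame() are names I made up for illustration, and the stack is an ordinary array rather than CBMEM. The values are pushed in reverse so that popping yields them in the order listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one variable MTRR (not a coreboot type). */
struct var_mtrr {
	uint32_t base_lo, base_hi;
	uint32_t mask_lo, mask_hi;
};

/* Push one 32-bit value onto a downward-growing stack; return the new top. */
static uint32_t *stack_push(uint32_t *top, uint32_t value)
{
	top--;
	*top = value;
	return top;
}

/*
 * Build the frame so that popping from the returned pointer yields:
 * max supported count, used count, then for each MTRR in order
 * base low, base high, mask low, mask high.
 * stack_limit is the lowest address the frame may touch.
 */
static uint32_t *build_mtrr_frame(uint32_t *stack_top, uint32_t *stack_limit,
				  const struct var_mtrr *m, uint32_t used,
				  uint32_t max_supported)
{
	uint32_t *sp = stack_top;

	assert(used <= max_supported);
	/* Two counters plus four words per MTRR must fit. */
	assert((size_t)(stack_top - stack_limit) >= 2 + 4 * (size_t)used);

	/* Last MTRR first, and within each MTRR the last-popped word first. */
	for (uint32_t i = used; i-- > 0; ) {
		sp = stack_push(sp, m[i].mask_hi);
		sp = stack_push(sp, m[i].mask_lo);
		sp = stack_push(sp, m[i].base_hi);
		sp = stack_push(sp, m[i].base_lo);
	}
	sp = stack_push(sp, used);
	sp = stack_push(sp, max_supported);
	return sp; /* this is what I expect to end up in %esp */
}
```

If this matches what the assembly expects, the teardown code in steps 9 and 10 would pop the two counters first and then four words per MTRR.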
The sources say for step 5 that "the number of entries depends on the size between down-aligned (by 4k) version and the actual top of memory". I thought cbmem_top() should return the actual number of bytes of available memory, which is already aligned, so would I not end up with room for zero entries? And if I just return the actual top of memory from cbmem_top() for use as the romstage stack, would it not overwrite the CBMEM area?
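Here is the arithmetic behind my worry, as a small sketch. It only illustrates my reading of that comment and is not how the CBMEM code is actually written: align_down_4k(), gap_below_top() and entries_in_gap() are made-up helpers, and the entry size passed in is a placeholder rather than the real size of a CBMEM entry.

```c
#include <assert.h>
#include <stdint.h>

#define ALIGN_4K 0x1000u

/* Round an address down to the nearest 4 KiB boundary. */
static uintptr_t align_down_4k(uintptr_t addr)
{
	return addr & ~(uintptr_t)(ALIGN_4K - 1);
}

/* Bytes between the down-aligned address and the reported top of memory. */
static uintptr_t gap_below_top(uintptr_t top)
{
	return top - align_down_4k(top);
}

/* How many entries of a (hypothetical) size would fit in that gap. */
static uintptr_t entries_in_gap(uintptr_t top, uintptr_t entry_size)
{
	assert(entry_size != 0);
	return gap_below_top(top) / entry_size;
}
```

With a top of memory at 0x08000000 (128MB, already 4K aligned) the gap is zero, so under this reading no entries would fit, which is exactly what confuses me.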
Also how much room is there to consolidate the CAR setup and teardown code and/or move more of it into C?
All the Slot 1 CPUs are in the P6 family and have a 16KB L1 data cache that is 4-way set associative. Can I use the full 16K range for CAR, or can I only use 4K (16K/4) of it? cache_as_ram.inc uses a fixed MTRR that tops out at address 0xD0000, while cache_as_ram_ht.inc uses variable MTRR 0 for this range. Variable MTRR 1 is used in both to cache the ROM. But when I tried to port the latter approach over, the code dies doing it. Some Slot 1 CPUs have very complicated L2 enable code which is better left where it is, so I'd like to limit myself to the L1 cache at this stage.
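To spell out the cache geometry behind the 16K versus 4K question, here is a sketch of the set mapping. The 32-byte line size is my assumption for P6, the base address used in the note below is just an example inside the fixed-MTRR range, and the helper names are made up. It shows only the address arithmetic, not whether the CPU keeps all four ways resident while running from CAR.

```c
#include <assert.h>
#include <stdint.h>

#define L1D_SIZE (16u * 1024u) /* 16KB L1 data cache */
#define L1D_WAYS 4u            /* 4-way set associative */
#define L1D_LINE 32u           /* assumption: 32-byte cache lines */

/* Bytes covered by one way: 16K / 4 = 4K. */
static uint32_t way_size(void)
{
	return L1D_SIZE / L1D_WAYS;
}

/* Number of sets in the cache. */
static uint32_t num_sets(void)
{
	return way_size() / L1D_LINE;
}

/* Set that a given address maps to. */
static uint32_t set_index(uint32_t addr)
{
	return (addr / L1D_LINE) % num_sets();
}

/* Count the lines of [base, base + len) that share a set with base. */
static uint32_t lines_in_same_set(uint32_t base, uint32_t len)
{
	uint32_t count = 0;

	assert(base % L1D_LINE == 0 && len % L1D_LINE == 0);
	for (uint32_t a = base; a < base + len; a += L1D_LINE)
		if (set_index(a) == set_index(base))
			count++;
	return count;
}
```

By this arithmetic, addresses 4K apart land in the same set, so a contiguous 16K window starting at, say, 0xC0000 puts four lines into every set and would need all four ways, while a 4K window needs only one way per set.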
Sorry if I am not making sense, because this is not making sense to me either, so any help is appreciated.