Need help implementing romstage CBMEM - coreboot

30 Aug 2017


Hi guys,
I'm still hard at work over the venerable (even "almighty" at the
time) 440BX and Slot 1 boards.
[1] https://review.coreboot.org/c/20977/
And now I'm stuck and thoroughly confused.
My current state is:
1. cbmem_initialize_empty() failed to even start allocating the root
CBMEM entry. No indication why. I tried tracing the code path in the
sources and still could not find out where exactly it failed. With
enough fiddling I did get it to complain the way Aaron expected [1].
2. Using the common Intel CPU cache_as_ram.inc, I can get through
mainboard romstage and memory init. If I just return a fixed
CONFIG_TOPMEM in setup_stack_and_mtrrs(), as is done in
cpu/intel/nehalem, it gets past the point of POST_PREPARE_RAMSTAGE and
then nothing happens.
3. Porting the postcar frame assembly from
cpu/intel/car/cache_as_ram_ht.inc results in a failure somewhere
before loading ramstage and after
4. If I try to run this build under QEMU, it fails with "Trying to
execute code outside RAM or ROM at 0x000a0000" in 440BX RAM init code
after dumping the "before" northbridge config, so I can't correctly
debug it this way either.
I think there is still no clear enough description of how this dynamic
CBMEM allocation works. Below is my take, and I stand ready to be
corrected:
1. Reset vector -> protected mode -> southbridge bootblock init for
ROM -> bootblock walks CBFS to find romstage.
2. CAR setup in cpu/intel/car/cache_as_ram.inc
3. romstage_main() -> sets up a stack guard ->
mainboard_romstage_entry() -> checks for overrunning of said stack guard
4. romstage calls cbmem_initialize_empty()
5. Somewhere along the cbmem_initialize() path cbmem_top() is called
to find out where top of RAM is. It is supposed to allocate memory so
that it grows down from here.
6. setup_stack_and_mtrrs() -> allocates a 20KB postcar stack in CBMEM
and adds MTRR settings as appropriate. These get pushed onto the
postcar stack so that they pop off in this order:
maximum number of supported variable MTRRs <- bottom of stack
number of actual used MTRRs
MTRR 0 base, low 32 bits
MTRR 0 base, high 32 bits
MTRR 0 mask, low 32 bits
MTRR 0 mask, high 32 bits
...
And this stack should be in RAM, supposedly in an area allocated within CBMEM.
7. This postcar stack pointer is returned via romstage_main(). It gets
loaded into %esp, effectively switching the code into a RAM stack with
all the MTRRs info on it.
8. Tear down CAR.
9. Wipe all the MTRRs.
10. MTRR values are popped off the stack and loaded into the variable MTRRs.
11. Now the CPU should be at the top of the allocated RAM stack and
can move on to load the ramstage.
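
To make steps 6 to 10 concrete, here is a minimal sketch in C of how I picture the frame. It is only a model of a stack that grows down in an ordinary buffer, not the actual coreboot code: struct frame, frame_push(), frame_pop() and frame_push_mtrrs() are names I made up for illustration. The values are pushed in reverse so that they pop off in the order listed in step 6.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a downward-growing postcar stack. */
struct frame {
	uint32_t *sp;    /* points at the most recently pushed word */
	uint32_t *limit; /* lowest address the stack may use */
};

/* One variable MTRR: 64-bit base and mask, as written to the MSR pair. */
struct var_mtrr {
	uint64_t base;
	uint64_t mask;
};

static void frame_push(struct frame *f, uint32_t value)
{
	assert(f->sp > f->limit); /* refuse to run off the bottom */
	*--f->sp = value;
}

static uint32_t frame_pop(struct frame *f)
{
	return *f->sp++;
}

/*
 * Push everything in reverse, so that popping yields:
 * max MTRRs, used MTRRs, then for each MTRR in order:
 * base low, base high, mask low, mask high.
 */
static void frame_push_mtrrs(struct frame *f, const struct var_mtrr *m,
			     uint32_t used, uint32_t max)
{
	for (uint32_t i = used; i-- > 0; ) {
		frame_push(f, (uint32_t)(m[i].mask >> 32));
		frame_push(f, (uint32_t)m[i].mask);
		frame_push(f, (uint32_t)(m[i].base >> 32));
		frame_push(f, (uint32_t)m[i].base);
	}
	frame_push(f, used);
	frame_push(f, max);
}
```

After frame_push_mtrrs() returns, f->sp is the value that would be handed back and loaded into %esp in step 7, assuming my reading of the flow is right.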
For step 5, the sources say "the number of entries depends on the size
between down-aligned (by 4k) version and the actual top of memory". I
thought cbmem_top() should return the actual number of bytes of available
memory, which is already aligned, so would I not end up with room for zero
entries? And if I just return the actual top of memory from cbmem_top() for
use as the romstage stack, would it not overwrite the CBMEM area?
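
The arithmetic behind that question, as a small C sketch. This only shows why I expect a gap of zero when the top of RAM is already 4K-aligned; it says nothing about how the CBMEM code really sizes its root area. align_down_4k() and gap_below_top() are made-up helper names, and the addresses in the example are arbitrary.

```c
#include <stdint.h>

#define ALIGN_4K 0x1000u

/* Round an address down to the nearest 4 KiB boundary. */
static uint32_t align_down_4k(uint32_t addr)
{
	return addr & ~(ALIGN_4K - 1);
}

/* Bytes between the 4K down-aligned address and the address itself. */
static uint32_t gap_below_top(uint32_t top)
{
	return top - align_down_4k(top);
}
```

With a top of 0x04000000 (64MB, already aligned) the gap is 0. With an unaligned top such as 0x03FFF800 it is 0x800 bytes.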
Also how much room is there to consolidate the CAR setup and teardown
code and/or move more of it into C?
All the Slot 1 CPUs are in the P6 family and have a 16KB L1 data cache
that is 4-way set associative. Can I use the full 16K range for CAR, or
can I only use 4K (16K/4) of it? cache_as_ram.inc uses a fixed MTRR that
tops out at address 0xD0000, while cache_as_ram_ht.inc uses variable
MTRR 0 for this range. Variable MTRR 1 is used in both to cache the
ROM. But when I tried to port the latter approach over, the code died
doing it. Some Slot 1 CPUs have very complicated L2 enable code,
which is better left where it is, so I'd like to limit myself to the L1
cache at this stage.
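
Here is how I reason about the 16K versus 4K question, as a C sketch. It assumes the 32-byte cache line of the P6 family and a plain modulo set index. The base address 0xC0000 used below is just an example, and set_index() and max_lines_per_set() are made-up names. It counts how many lines of a contiguous region land in the same set: if that never exceeds the number of ways, the region should in principle fit without evictions.

```c
#include <stdint.h>

enum {
	CACHE_SIZE = 16 * 1024,                /* L1 data cache size */
	WAYS = 4,                              /* 4-way set associative */
	LINE = 32,                             /* assumed P6 line size */
	SETS = CACHE_SIZE / (WAYS * LINE)      /* 128 sets, 4 KiB per way */
};

/* Which cache set an address maps to. */
static uint32_t set_index(uint32_t addr)
{
	return (addr / LINE) % SETS;
}

/* Largest number of lines from [base, base + size) sharing one set. */
static unsigned max_lines_per_set(uint32_t base, uint32_t size)
{
	unsigned count[SETS] = { 0 };
	unsigned max = 0;

	for (uint32_t a = base; a < base + size; a += LINE) {
		unsigned c = ++count[set_index(a)];
		if (c > max)
			max = c;
	}
	return max;
}
```

By this model a contiguous 16K region puts exactly 4 lines in every set, which matches the 4 ways, while anything larger would need a fifth way. Whether real hardware behaves this cleanly during CAR is exactly what I am unsure about.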
Sorry if I am not making sense, because this is not making sense to me
either, so any help is appreciated.