Hi Kyösti,
I understand the problems you are trying to solve, but I hesitate to add stages, as that makes the boot flow more confusing and harder to maintain. I have some specific comments inline.
On Wed, Apr 4, 2012 at 2:54 AM, Kyösti Mälkki kyosti.malkki@gmail.com wrote:
Hi!
Looking at some of the changes proposed with the new support of Intel Sandybridge and Ivybridge, combined with my previous design choices made with support of Intel Hyper-Threading for NetBurst architectures and SMP generally, made me share my thoughts of the Coreboot stage -layout.
So there is currently bootblock, romstage, ramstage and payload, in that specific order. I have identified a few issues that would need to be worked on.
- Built-in-self test failures
On (Intel) SMP system only BSP CPU failure is detected and possibly reported. I think architecture allows that BSP CPU is not the same physical core across power-cycles. One should consistently either be redundant or die on single CPU failure.
- Serial console
This is initialized in romstage and requires working cache to work. If due to a BIST failure or bad cache-as-ram init code, cache fails to work, there is no console.
- Microcode updates
The "tiny" bootblock doesn't seem like the correct place for microcode updates.
Does microcode have to be this early? Before CAR?
- Cache coherency
MTRR setup should be consistent across all CPUs. If all CPUs are started for microcode updates before ramstage, they should fix their MTRRs too. Even then, pre-ram spinlocks may be impossible to implement, so pre-ram SMP operation is very, very restricted.
Yes, this has been a problem in the past, but it seems unrelated to the stages. It is just an implementation issue.
- XIP alignment
If 4 variable MTRRs were used in pre-ram execution environment for XIP, there would be no alignment requirement on the placement of XIP romstage in Flash ROM. Such runtime MTRR setup code is around 512 bytes and cache footprint would extend at most 30% over the actual romstage size. A single MTRR setup may reserve almost twice the actual size of a romstage in both flash and cache memory.
Is this really a problem? Also, the XIP area is limited to avoid using too much cache on some processors. Some code may be executed out of flash, but not be in the XIP area, to avoid CAR corruption.
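To put rough numbers on the alignment question, here is a minimal sketch in C. It assumes only the usual variable MTRR constraint (a power-of-two size, with the base aligned to that size, at 4 KiB granularity). The function names and the example address are made up for illustration and are not taken from the coreboot tree.

```c
#include <stdint.h>

#define MTRR_MIN_SIZE 0x1000ULL /* 4 KiB granularity of variable MTRRs */

/*
 * Smallest naturally aligned power-of-two window that contains
 * [base, base + size). Returns the window size and stores the
 * window start in *win_base. This is what a single MTRR must cover.
 */
static uint64_t single_mtrr_window(uint64_t base, uint64_t size,
				   uint64_t *win_base)
{
	uint64_t win = MTRR_MIN_SIZE;

	while ((base & ~(win - 1)) + win < base + size)
		win <<= 1;

	*win_base = base & ~(win - 1);
	return win;
}

/*
 * Greedy split of a 4 KiB aligned range into naturally aligned
 * power-of-two blocks. Returns the number of blocks, i.e. how many
 * variable MTRRs an exact (zero overshoot) cover would need.
 */
static int exact_mtrr_count(uint64_t base, uint64_t size)
{
	int count = 0;

	while (size) {
		uint64_t blk = MTRR_MIN_SIZE;

		/* Grow while base stays aligned and the block still fits. */
		while ((base & (2 * blk - 1)) == 0 && 2 * blk <= size)
			blk <<= 1;

		base += blk;
		size -= blk;
		count++;
	}
	return count;
}
```

For a hypothetical 80 KiB romstage placed at 0xFFFE8000, single_mtrr_window() yields a 128 KiB window starting at 0xFFFE0000, while exact_mtrr_count() reports that three MTRRs cover the same range with no overshoot. Whether that saving is worth the extra pre-RAM setup code is the open question.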
- Bypassing raminit
One may want to start his Coreboot conversion job from something less complex than raminit, like setting up PCI device tree. With the amount of cache on modern CPUs, one could probably run libpayload -apps from cache. One such a nice app would be zmodem download of raminit.
This is an interesting thought, but really a debug/development feature. It seems to come down to calling a raminit vs. a zmodem function in romstage, or to the bootblock loading a different romstage. I don't see how this changes the stages.
- CPU max physical address
MTRR physical mask should be set correctly for the time of romstage too, just in case memory over 4GB is tested. Should first auto-detect and then provide work-arounds for CPU errata.
There are CPU-specific AMD MTRR setup functions to support the hoisted memory and the cache properties of that range. It would be good if this could be combined in a nice way, but I would nack a file full of workarounds for different CPU types.
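For reference, the auto-detect half of this is small. The sketch below builds a variable MTRR PHYSMASK value from a region size and the CPU's physical address width, assuming the width has already been read (for example from CPUID leaf 0x80000008, EAX[7:0]). The helper name is hypothetical and errata handling is deliberately left out.

```c
#include <stdint.h>

#define MTRR_PHYS_MASK_VALID (1ULL << 11) /* valid bit in PHYSMASKn */

/*
 * Build a variable MTRR PHYSMASK value.
 * size:      region size, a power of two and at least 4 KiB
 * addr_bits: physical address width of the CPU (assumed to be
 *            detected by the caller, e.g. via CPUID 0x80000008)
 */
static uint64_t mtrr_phys_mask(uint64_t size, unsigned int addr_bits)
{
	uint64_t addr_mask = (addr_bits >= 64) ?
		~0ULL : ((1ULL << addr_bits) - 1);

	/* Mask bits above the address width must stay clear. */
	return (~(size - 1) & addr_mask) | MTRR_PHYS_MASK_VALID;
}
```

With a 1 MiB region this gives 0xFFFF00800 on a 36-bit CPU and 0xFFFFF00800 on a 40-bit CPU, which shows why a hard-coded 36-bit mask is wrong on parts with a wider address bus.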
Counting all the above together, I would like to start some discussion whether the current 4 stage model is the best design choice. I am thinking about some changes in the layout as a fix:
- Bootblock
No real change. Must guarantee access to all of Flash ROM and operational PCI configuration cycles for following stages. Contains boot vector for any AP CPUs. Exits in protected mode to Stageloader.
- Stageloader
A new stage. This has a pre-CAR environment (ROMCC-build) to enable early serial console and control MTRR setup to enable cache-as-ram. Pre-CAR environment can execute stages from Flash ROM with XIP.
I am not convinced of the value of an earlier serial console. It adds complexity where things should be simple. I hesitate to keep moving more code before CAR. We have had a long-standing goal to reduce the code that runs before CAR.
This also has a CAR and RAM environment (GCC-build) that can execute XIP stages from Flash ROM or decompress stages to CAR/RAM from Flash ROM.
At least with AMD processors, you can't execute code out of the data cache. The only way to get instructions cached is with a code fetch, which doesn't hit the data cache.
I see the reason for executing something other than the normal boot process, but should it be a completely new stage? Why not have the bootblock load a different romstage if you want different romstage behavior?
- CPU init
A new stage built with ROMCC. Checks BIST of AP CPUs, executes microcode updates and handles the issue of shared Cache-Disable bit on hyper-threading Intel CPUs.
Why is there a new stage built with ROMCC after CAR? Again, we want to reduce the code built with ROMCC. It is very fragile, it doesn't optimize or link the way gcc does, and we don't get symbols for symbolic debug.
- RAM init
Old romstage built with GCC. Returns to Stageloader after DRAM is functional, but before any DRAM is written.
Why make a new stage? Why return to the previous stage? This also becomes an XIP CAR nightmare.
- DEV init
Old ramstage built with GCC. Only change is that microcode update and SMP setup is already taken care of.
For some CPUs it makes sense to do microcode and SMP setup here (although much less lately). This is still a place to do SMP setup, as MTRRs and other settings need to be configured for normal operation.
- Payload
No changes required.
I would be interested in working on some of these topics and I think I can also test most of the suggested changes on older SMP hardware.
IMO, more stages are more confusing and I don't see the benefit. I think that reworking what is going on in some CPU/mainboard bootblock/romstage/ramstage code is worthwhile, but I think that adding more stages, CBFS searching and loading, etc. is the wrong direction for coreboot. We should be focused on simple, lean, and concise. Something else to consider is how vendorcode is starting to change coreboot. I would like to see how the new Intel code is added. There is a lot of assumed configuration ownership in the vendor code, and coreboot may need to adjust to that. That is something else to keep in mind as you design these stages.
Marc
Thanks,
Kyösti Mälkki kyosti.malkki@gmail.com
-- coreboot mailing list: coreboot@coreboot.org http://www.coreboot.org/mailman/listinfo/coreboot