Arthur Heymans has uploaded this change for review.

View Change

Documentation: Add cache as ram documentation

This add documentation around coreboot's history of CAR usage.
The history is reconstructed from articles and git/svn history,
but should provide a good overview of the challenges and solutions
coreboot came up with to deal with early code execution on X86

The technical details about CAr will come in follow-up patch and are
marked as TODO for now.

Change-Id: I964d59a6bd210f3d4d06220d2ec7166110ab2cf3
Signed-off-by: Arthur Heymans <>
A Documentation/cpu/
A Documentation/cpu/x86/
M Documentation/
3 files changed, 171 insertions(+), 0 deletions(-)

git pull ssh:// refs/changes/26/36126/1
diff --git a/Documentation/cpu/ b/Documentation/cpu/
new file mode 100644
index 0000000..d9f063f
--- /dev/null
+++ b/Documentation/cpu/
@@ -0,0 +1,5 @@
+# CPU-specific documentation
+This section contains documentation about coreboot on specific CPU.
+* [A gentle introduction to Cache-as-Ram on X86](x86/
diff --git a/Documentation/cpu/x86/ b/Documentation/cpu/x86/
new file mode 100644
index 0000000..3d2983e
--- /dev/null
+++ b/Documentation/cpu/x86/
@@ -0,0 +1,165 @@
+# A gentle introduction to Cache-as-Ram on X86
+This explains a bit of history on CAR in coreboot, how it works
+and then finally give a more in depth overview at the steps of
+how it is set up on Intel CPUs.
+## Glossary
+- Cache-as-RAM/CAR: Using not memory mapped CPU cache as execution
+ environment.
+- XIP: Execute in place on a memory mapped medium
+- SRAM: Static Random Access Memory, here: memory mapped memory
+ that needs no initialization.
+- ROMCC: A C compiler using only the CPU registers.
+## Cache-as-Ram, the basics
+When an X86 platforms boot, it starts in a pretty bare environment.
+The first instruction is fetched from the top of the boot medium,
+which on most x86 platforms is memory mapped below 4G (in 32bit
+protected mode). Not only do they start in 16bit 'real' mode,
+most of the hardware starts uninitialized. This includes the dram
+controller. Another thing about X86 platforms is that they generally
+don't feature SRAM, static RAM. SRAM for this purpose of this text is
+a small region of ram that is available without any initialization.
+Without RAM this leaves the system with just CPU registers and a
+memory mapped boot medium of which code can be executed in place (XIP)
+to initialize the hardware.
+For simple dram controllers that environment is just fine and
+initializing the dram controller in assembly code without using a
+stack is very doable. Modern systems however require complex
+initialization procedures and writing that in assembly would be either
+too hard or simply impossible. To overcome this limitation a technique
+called Cache-as-Ram exists. It allows to use the CPU's Cache-as-Ram.
+This in turn makes it possible to write the early initialization code
+in a higher level programming language compiled by a regular compiler,
+e.g. GCC. Typically only the stack is placed in Cache-as-Ram and the
+code is still executed in place, but copying code in the cache as ram
+and executing from there is also possible.
+For coreboot this means that with only a very small amount of assembly
+code to set up CAR, all the other code can written in C, which
+compared to writing assembly tremendously increases development speed
+and is much less error prone.
+## A bit of history on Cache-as-Ram in coreboot
+### Linux as a BIOS? The invention of romstage
+When LinuxBIOS, later coreboot, started out, the first naive attempt
+to run the Linux kernel instead of the legacy BIOS interface, was to
+jump straight to loading the kernel. This did not work at all, since
+dram was not working at that point. LinuxBIOS v1 therefore
+initialized the memory in assembly before it was able to load the
+kernel. This executed in place and would later become romstage. It
+turned that Linux needed other devices, like enumerated and have it's
+resources allocated, so yet another stage, ramstage, was called into
+life, but that's another story.
+### coreboot v2: romcc
+In 2003 AMD's K8 CPU's came out. Those CPU's featured an on die memory
+controller and also a new point to point BUS for CPU <-> CPU and
+CPU <-> IO controller (northbridge) devices. Writing support for
+these in assembly became difficult and a better solution was needed.
+This is where ROMCC came to the rescue in 2002. ROMCC is a 25K lines
+of code C compiler, written by Eric Biederman, that does not use a
+stack but uses CPU registers instead. Besides the 8 general purpose
+registers on x86 the MMX and SSE registers, respectively _%mm0_-_%mm7_
+and _%xmm0_-_%xmm7_, can be used to store data. ROMCC also converts
+all function calls to static inline functions. This has the
+disadvantage that a ROMCC compiled stage must be compiled in one go,
+linking is not possible. In practice you will see a lot of '#include
+_some_file.c_' which is quite unusual in C, because it makes the
+inclusion of files fragile. Given that limited amount
+'register-stack', it also imposes restrictions on how many levels of
+nested function can be achieved. One last issue with ROMCC is that
+no-one really dared touching that code, due to its size and complexity
+besides its original author.
+Coreboot used ROMCC in the following way: a bootblock written in both
+assembly to enter 32bit protected mode would use ROMCC compiled code
+to either jump to 'normal/romstage' or 'fallback/romstage' which in
+turn was also compiled with romstage.
+### Cache-as-Ram
+While ROMCC was eventually used for early initialization, prior to
+that development attempts were made to use the CPU's cache as RAM.
+This would allow to use regular GCC compiled and linked code to
+be used. The idea is to set up the CPU's cache to Write Back to a
+region, while disable cacheline filling or simply hope that no cache
+eviction or invalidation happens. Eric Biederman initially got CAR
+working, but with the advent of Intel Hyperthreading on the Pentium 4,
+on which CAR setup proved to be a little more difficult and he
+developed ROMCC instead.
+Around 2006 a lot was learned from coreboot v2 with its ROMCC usage
+and development on coreboot v3 was started. The decision was mode to
+not use ROMCC anymore and use Cache Memory for early stages. The
+results were good as init object code size is reduced by factor of
+four. So the code was smaller and faster at the same time.
+coreboot v4 sort of merged v2 and v3, so it was a mix of ROMCC compiled
+romstage and romstage with CAR.
+In 2014 Support for ROMCC compiled romstage was dropped.
+### Intel Apollolake, a new era: POSTCAR_STAGE and C_ENVIRONMENT_BOOTBLOCK
+In 2016 Intel released the ApolloLake architecture. This architecture
+is a weird duck in the X86 pool. It is the first Intel CPU to feature
+MMC as a boot medium and does not memory map it. It also features to
+the main CPU read only SRAM that is mapped right below 4G. The
+bootblock is copied to SRAM and executed. Given that the boot medium
+is not always memory mapped XIP is not an option. The solution is to
+set up CAR in the bootblock and copy the romstage in CAR. We call this
+C_ENVIRONMENT_BOOTBLOCK, because it runs GCC compiled code. Granted
+XIP is still possible on ApolloLake, but coreboot has to use Intel's
+blob, FSP-M, and it is linked to run in CAR, so a blob actually forced
+a nice feature in coreboot!
+Another issue arises with this setup. With romstage running from the
+read only boot medium, you can continue executing code (albeit without
+stack) to tear down the CAR and start executing code from 'real' RAM.
+On ApolloLake with romstage executing from CAR, tearing down CAR in
+there, would shooting in ones own foot, as our code would be gone
+in doing so. The solution was to tear down CAR in a separate stage,
+named 'postcar' stage. This provides a clean separation of programs
+and results in needing less linker scripts hacks in romstage. This
+solution was therefore adopted by many other platforms that still did
+### AMD Zen, 2019
+On AMD Zen the first CPU to come out of reset is the PSP and it
+initializes the dram before pulling out the main CPU out of reset.
+CAR won't be needed, nor used on this platform.
+### The future?
+For coreboot release 4.11, scheduled for october 2019, support for
+ROMCC in the bootblock will be dropped as will support for the messy
+tearing down of CAR in romstage as opposed to doing that in a separate
+## In depth analysis
+### CAR setup
+TODO: MTRR, IPI/hyperthreaded, ROM Caching, Non-evict
+### Linker symbols
+TODO: sharing things over multiple CAR stages: car.ld
+### CAR teardown
+## References
+[2007, The Open Source BIOS is Ten An interview with the coreboot developers](
+[A Framework for Using Processor Cache-as-Ram (CAR)](
+[CAR: Using Cache-as-Ram in LinuxBIOS](
diff --git a/Documentation/ b/Documentation/
index 1c04ad3..85666cb 100644
--- a/Documentation/
+++ b/Documentation/
@@ -177,6 +177,7 @@
* [Display panel-specific documentation](gfx/
* [Architecture-specific documentation](arch/
* [Platform independend drivers documentation](drivers/
+* [CPU-specific documentation](cpu/
* [Northbridge-specific documentation](northbridge/
* [System on Chip-specific documentation](soc/
* [Mainboard-specific documentation](mainboard/

To view, visit change 36126. To unsubscribe, or for help writing mail filters, visit settings.

Gerrit-Project: coreboot
Gerrit-Branch: master
Gerrit-Change-Id: I964d59a6bd210f3d4d06220d2ec7166110ab2cf3
Gerrit-Change-Number: 36126
Gerrit-PatchSet: 1
Gerrit-Owner: Arthur Heymans <>
Gerrit-MessageType: newchange