Where do we go from here?
As I said (and I'll repeat as many times as required; I do NOT care what all of INTEL's 13000+ managers think):
*I have another idea for INTEL SoCs/CPUs, as a HW architecture improvement. Why do your top-notch HW guys NOT implement the MRC as part of the MCU? Some HW thread inside the CPU/SoC should execute the MCU, shouldn't it? The MRC should be only a few K in size, so it can fit in there perfectly; thus the MRC should (my take on this) be part of the internal CPU architecture.*
*Today's INTEL COREs and ATOMs have at a minimum 100M gates, so why not add a couple of dozen K more? A lot of problems would be solved, wouldn't they? ;-)*
*[1] The BOOT stage becomes much shorter (nothing such as a CAR phase);*
*[2] The ROM stage no longer exists;*
*[3] The IP is preserved in HW, so the whole INTEL FSP is actually (imagine the Beauty) Open Source...*
With INTEL. Here we go! Where no one has gone before! ;-)
Zoran
On Tue, Feb 14, 2017 at 6:56 PM, ron minnich <rminnich@gmail.com> wrote:
Just a reminder about times past. This discussion has been ongoing since 2000. In my view the questions come down to how much the ramstage does, how that impacts code complexity and performance, and when the ramstage gets so much capability that it ought to be a kernel.
In the earliest iteration, there was no ramstage per se. What we now call the ramstage was a Linux kernel.
We had lots of discussions in the early days with LNXI and others about what would boot fastest, a dedicated boot loader like etherboot or a general-purpose kernel like Linux. In all the cases we measured at Los Alamos, Linux always won, easily: yes, it was slower to load than etherboot and had more startup overhead, but once started, Linux's support for concurrency and parallelism always won the day. Loaders like etherboot (and its descendant, iPXE) spend most of their time doing nothing (as measured at the time). It was fun to boot 1000 nodes in the time it took PXE on one node to find a connected NIC.
The arguments over payload ended when the FLASH sockets changed to QFP and maxed at 256K and Linux could no longer fit.
But if your goal is fast boot, in fact if your goal is 800 milliseconds, we know this can work on slow ARMs with Linux, as was shown in 2006.
The very first ramstage was created because Linux could not correctly configure a PCI bus in 2000. The core of the ramstage as we know it was the PCI config.
We wanted to have ramstage only do PCI setup. We initially put SMP startup in Linux, which worked on all but K7, at which point ramstage took on SMP startup too. And ramstage started to grow. The growth has never stopped.
At what point is ramstage a kernel? I think at the point we add file systems or preemptive scheduling. We're getting dangerously close. If we really start to cross that boundary, it's time to rethink the ramstage in my view. It's not a good foundation for a kernel.
I've experimented with kernel-as-ramstage with harvey on the riscv and it worked. In this case, I manually removed the ramstage from coreboot.rom and replaced it with a kernel. It would be interesting, to me at least, to have a Kconfig option whereby we can replace the ramstage with some other ELF file, to aid such exploration.
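For illustration only, here is a minimal sketch of what such a Kconfig hook might look like; the symbol name RAMSTAGE_ELF and its help text are hypothetical, not an existing coreboot option, and the build-system wiring that would consume the path is not shown:

    config RAMSTAGE_ELF
            string "Path to a replacement ramstage ELF (hypothetical)"
            default ""
            help
              If set to a non-empty path, the build would add this ELF file
              (for example a Harvey or Linux kernel) to coreboot.rom in place
              of the ramstage coreboot normally builds, to aid experiments
              like the kernel-as-ramstage one described above.

The Makefile side would then have to pick up this path when assembling coreboot.rom instead of the generated ramstage, which is the part that today has to be done by hand.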
I also wonder if we're not at a fork in the road in some ways. There are open systems, like RISCV, in which we have full control and can really get flexibility in how we boot. We can influence the RISCV vendors not to implement hardware designs that have negative impact on firmware and boot time performance. And then there are closed systems, like x86, in which many opportunities for optimization are lost, and we have little opportunity to impact hardware design. We also can't get very smart on x86 because the FSP boulder blocks the road.
Where do we go from here?
ron