On Tue, Feb 14, 2017 at 1:07 PM, Patrick Georgi <pgeorgi@google.com> wrote:
2017-02-14 17:12 GMT+01:00 Aaron Durbin via coreboot <coreboot@coreboot.org>:
For an optimized boot flow, all the pieces of work that need to be done pretty much have to be closely coupled. One needs to globally optimize the full sequence.
Like initializing slow hardware even before RAM init (as long as it's just an initial command)? How about using PIT/IRQ0 plus some tiny register-only interrupt routine to do trivial register wrangling (we do have register scripts)?
I don't think I properly understand your suggestion. For this particular eMMC case are you suggesting taking the PIT interrupt and doing the next piece of work in it?
that we seem to absolutely need to maintain boot speeds. Is Chrome OS going against the tide of coreboot wanting to solve those sorts of issues?
The problem is that two basic qualities collide here: speed and simplicity. The effect is that people ask to stop for a second and reconsider the options. MP init and parallelism have been the "go to" solution for all performance-related issues of the last 10 years, but they're not without cost. Questioning this approach doesn't mean that we shouldn't go there at all, just that the obvious answers might not lead to simple solutions.
As Andrey stated elsewhere, we're far from CPU bound.
Agreed. But our work is chunked very coarsely. I think the other-CPU path is an attempt to work around the coarseness of the work steps in the dependency chain.
For his concrete example: does eMMC init fail if you ping it more often than every 10ms? It had better not: you already stated that it's hard to guarantee those 10ms, so there needs to be some spare room. We could look at the largest chunk of the init process that could be restructured to implement cooperative multithreading on a single core for as many tasks as possible, to cut down on all those udelays (or even mdelays). Maybe we could even build a compiler plugin to ensure at compile time that the resulting code is well behaved (loops either have low bounds or yield, and yield()/sched()/... aren't called within critical sections)...
That's a possibility, but you have to solve the case for each combination of hardware present and/or per platform. Building up the dependency chain is the most important piece, and from there ensuring that execution context is not lost for longer than a set amount of time. We're miles away from that, since everything runs to completion serially right now.
Once we leave that scheduling to physics (i.e. enabling multicore operation), all bets are off (or we have to synchronize the execution to a degree that we could just as well do it manually). That's a lot of complexity just to have 8 times the CPU power for the same amount of IO-bound tasks.
Patrick