Re: [coreboot] How is our SPI API supposed to work? - coreboot

19 Apr 2018

      On Thu, Apr 19, 2018 at 4:06 PM, Julius Werner jwerner@chromium.org wrote:
...
...
...
My question is basically what happens when bytesout and bytesin are both
non-zero. My previous understanding was always that the SPI controller
...
It's up to the spi controller itself. They should support full duplex,
if possible.
And if not? The problem is that right now xfer() means two different things
depending on which controller implementation you talk to. Isn't that a
pretty bad situation? The point of having an API is that you can program
something without having to care (too much) what implements it on the other
side. The behavior of the same call shouldn't change based on the
implementation, other than aborting for requests that the implementation
cannot support.
How does it mean two different things? The pairing of a controller and
device where a full duplex operation is needed but the controller
doesn't support things should indeed fail. I do not undertand how
xfer() means two different things. Whenever pairing hardware together
one needs to validate that things work when married.
...
I think if we want to allow full-duplex, then any controllers that don't
support full-duplex should error out if their xfer() is called with two
buffers. They shouldn't just do something different than what the
controllers which support full-duplex would do if they were called the same
way. If the caller wants to transfer things one after the other, it needs
to use xfer_vector() with two vectors or do two separate xfer() calls.
Sure.
...
...
cmd *then* response behavior. As you noted you'd have to eat the cmd
bytes in the receive buffer. The issue is that we are dealing with 2
different types of spi *flash* controllers:

generic spi controller that can do any normal spi thing
spi flash controllers on x86 systems. The ones on x86 systems are

not a normal spi controller, and they can't really be treated like
one.
...
The spi flash drivers previously made assumptions around #2 in that
they would/could send din/dout in a full duplex manner assuming #2
semantics -- basically very not normal spi.
Right. So I think the way spi_flash.c is written right now looks good, if
we want to consider transfers full-duplex. do_spi_flash_cmd() correctly
builds a vector with two separate transactions, one out and one in. My
problem is just that the Intel controllers then implement xfer_vector()
with spi_xfer_two_vectors() which just merges it back together into a
single xfer() call, with different semantics than what the other
(full-duplex-supporting) controllers use for xfer(). That just seems
wrong... xfer() is an external part of the SPI controller API so it should
have the same semantics for all controllers. If the Intel flash controllers
want to do something special with what do_spi_flash_cmd() gives them, they
should have their own custom implementation of xfer_vector() that doesn't
call xfer() and instead calls something internal to them which implements
this behavior, while xfer() should conform to the standard API (as best as
it can).
...
mediatek does that for reasons I'm not aware of.
I think they just copied off the wrong implementation when writing that
driver and I didn't notice during review. Like I'm saying, I think the
current situation is very confusing, especially to those vendors who see it
for the first time. The MediaTek controller is definitely capable of
full-duplex AFAIK, the driver is written in a way that explicitly avoids
being full-duplex right now.
...
That's just to combine the two ops spi flash cmds follow. Those
helpers are there only for that purpose. If you look at the spi flash
API buffers are passed in for getting data back, etc. There's no where
to eat the setup cmd pieces w/o sending two ops down to the controller
aside from realloc'ing a buffer with the larger size then memcpy()'ing
back into the original buffer. Otherwise you are putting the burden on
the higher layer callers to somehow know they need to pad their
buffers. Also, the x86 spi flash controllers need all the info at once
as a fused op.
Right, I get that those x86 flash controllers need to treat this specially.
I just think their xfer_vector() implementation shouldn't go back and call
xfer() like that (in a way that xfer() is normally not meant to work).
Their xfer_vector() should do the operation directly, and their xfer()
should be a different implementation that follows the normal API for xfer()
as best as it can (or if they can't do that at all they just shouldn't
provide an xfer() callback).
OK. It's just moving code around.
...
...
One could never connect a generic spi device to an x86 spi *flash*
controller.  That's similarly why is the special function
flash_probe() in spi_ctrlr. It's to bypass the spi driver probing
because it's not possible to probe the way the drivers are written.
Okay. It's fair that some controllers can only support very special use
cases. But the spi_ctrlr API overall is still a generic SPI API that may be
used for flash and for other things, so I think xfer() and xfer_vector()
always need to have the same semantics for everything that implements the
API. If a controller cannot implement anything that's not an xfer_vector
with exactly two transfers (one out and one in), it needs to return an
error if it's invoked in any other way. It shouldn't just do something
that's not conforming to the way other controllers would treat that call
instead.
Again, moving code around.
...
...
Speaking with my flashrom hat on, it's a bit more complicated.
...
Duplex perspective:
There are two main types (and a few subtypes) of SPI controllers used
for flash:
1.) Full-duplex controllers: Capable of sending and receiving at the
same time
1.a) For each clock cycle, a bit is sent and a bit is received at the
same time (e.g. Raspberry Pi SPI)
1.b) For the first 8 clock cycles, a bit is sent. After that, for each
clock cycle, a bit is sent and a bit is received at the same time (e.g.
AMD SB600 and successors)
1.c) After a defined number of clock cycles, a constant number of bits
is ignored in both directions (e.g. 8 bits ignored for FAST READ
command, supported by some controllers)
2.) Half-duplex controllers: Capable of either sending or receiving at
any given time
2.a) Send-then-receive: A number of bits (>0) is sent, and after that a
number of bits (>=0) is received (e.g. Intel SPI)
2.b) After a defined number of clock cycles, a constant number of bits
is ignored in both directions (e.g. 8 bits ignored for FAST READ
command, supported by some controllers)
I think from this perspective, the coreboot API should always assume 1.a.
All other controllers need to interpret the API requests that way and try
to fulfill them with what they have. For example, that AMD controller that
can do full duplex for all but the first 8 cycles (1.b) would have to have
a custom xfer_vector() implementation that checks if it was given two
vectors, where the first one has 8 bytes out with no bytes in, and the
second one has N bytes out and N bytes in. It could fulfill that, and for
other requests that it can't fulfill it would have to return an error.
...
Programming perspective:

Single-transaction programming: Send+receive count as well as send

buffer and optionally receive buffer supplied in a single call, CS (chip
select) happens automatically at the beginning and end of the transaction
This model can't use the normal SPI API at all and instead has to implement
a custom flash_probe() function where it can install its own spi_flash_ops
object. We have some controllers like that already.
...

Separate CS programming, single-transaction buffer handling: Activate

CS, send+receive count as well as send buffer and optionally receive
buffer supplied in a single call, deactivate CS
This one can use the SPI API and only has to implement xfer().
spi-generic.c will automatically turn spi_xfer_vector() calls into multiple
invocations of xfer() for it.
...

Separate CS programming, two-transaction buffer handling: Activate

CS, send count+buffer in one call, optional receive count+buffer in one
call, deactivate CS
4. Separate CS programming, multi-transaction buffer handling: Activate
CS, send count+buffer in multiple calls, optional receive count+buffer
in multiple calls, deactivate CS
Both of these can use the SPI API but need to implement their own
xfer_vector(). In there they could check whether the vectors they were
given match the transaction model they support, and if not they would have
to return an error. They should not implement the normal xfer() callback at
all, unless they can support a transaction with a single buffer.
...
...
...
An SPI ROM read transaction packed into a single xfer() call (command
in
...
...
...
dout, data read into din) would work fine with the second
interpretation
...
...
...
but return garbage (first 4 bytes dropped and everything else shifted)
with
...
...
...
the first interpretation.
...
The other way round: It would return 4 bytes of garbage, then everything
but the last 4 bytes shifted, and the last 4 bytes woud be dropped.
Right, thanks, it's that way around.