[coreboot] How is our SPI API supposed to work?

Fri Apr 20 00:06:41 CEST 2018

> > My question is basically what happens when bytesout and bytesin are both
> > non-zero. My previous understanding was always that the SPI controller

> It's up to the spi controller itself. They should support full duplex,
> if possible.

And if not? The problem is that right now xfer() means two different things
depending on which controller implementation you talk to. Isn't that a
pretty bad situation? The point of having an API is that you can program
something without having to care (too much) what implements it on the other
side. The behavior of the same call shouldn't change based on the
implementation, other than aborting for requests that the implementation
cannot support.

I think if we want to allow full-duplex, then any controllers that don't
support full-duplex should error out if their xfer() is called with two
buffers. They shouldn't just do something different than what the
controllers which support full-duplex would do if they were called the same
way. If the caller wants to transfer things one after the other, it needs
to use xfer_vector() with two vectors or do two separate xfer() calls.

> cmd *then* response behavior. As you noted you'd have to eat the cmd
> bytes in the receive buffer. The issue is that we are dealing with 2
> different types of spi *flash* controllers:
> 1. generic spi controller that can do any normal spi thing
> 2. spi flash controllers on x86 systems. The ones on x86 systems are
> not a normal spi controller, and they can't really be treated like
> one.

> The spi flash drivers previously made assumptions around #2 in that
> they would/could send din/dout in a full duplex manner assuming #2
> semantics -- basically very not normal spi.

Right. So I think the way spi_flash.c is written right now looks good, if
we want to consider transfers full-duplex. do_spi_flash_cmd() correctly
builds a vector with two separate transactions, one out and one in. My
problem is just that the Intel controllers then implement xfer_vector()
with spi_xfer_two_vectors() which just merges it back together into a
single xfer() call, with different semantics than what the other
(full-duplex-supporting) controllers use for xfer(). That just seems
wrong... xfer() is an external part of the SPI controller API so it should
have the same semantics for all controllers. If the Intel flash controllers
want to do something special with what do_spi_flash_cmd() gives them, they
should have their own custom implementation of xfer_vector() that doesn't
call xfer() and instead calls something internal to them which implements
this behavior, while xfer() should conform to the standard API (as best as
it can).

> mediatek does that for reasons I'm not aware of.

I think they just copied off the wrong implementation when writing that
driver and I didn't notice during review. Like I'm saying, I think the
current situation is very confusing, especially to those vendors who see it
for the first time. The MediaTek controller is definitely capable of
full-duplex AFAIK, the driver is written in a way that explicitly avoids
being full-duplex right now.

> That's just to combine the two ops spi flash cmds follow. Those
> helpers are there only for that purpose. If you look at the spi flash
> API buffers are passed in for getting data back, etc. There's no where
> to eat the setup cmd pieces w/o sending two ops down to the controller
> aside from realloc'ing a buffer with the larger size then memcpy()'ing
> back into the original buffer. Otherwise you are putting the burden on
> the higher layer callers to somehow know they need to pad their
> buffers. Also, the x86 spi flash controllers need all the info at once
> as a fused op.

Right, I get that those x86 flash controllers need to treat this specially.
I just think their xfer_vector() implementation shouldn't go back and call
xfer() like that (in a way that xfer() is normally not meant to work).
Their xfer_vector() should do the operation directly, and their xfer()
should be a different implementation that follows the normal API for xfer()
as best as it can (or if they can't do that at all they just shouldn't
provide an xfer() callback).

> One could never connect a generic spi device to an x86 spi *flash*
> controller.  That's similarly why is the special function
> flash_probe() in spi_ctrlr. It's to bypass the spi driver probing
> because it's not possible to probe the way the drivers are written.

Okay. It's fair that some controllers can only support very special use
cases. But the spi_ctrlr API overall is still a generic SPI API that may be
used for flash and for other things, so I think xfer() and xfer_vector()
always need to have the same semantics for everything that implements the
API. If a controller cannot implement anything that's not an xfer_vector
with exactly two transfers (one out and one in), it needs to return an
error if it's invoked in any other way. It shouldn't just do something
that's not conforming to the way other controllers would treat that call
instead.

> Speaking with my flashrom hat on, it's a bit more complicated.

> Duplex perspective:
> There are two main types (and a few subtypes) of SPI controllers used
> for flash:
> 1.) Full-duplex controllers: Capable of sending and receiving at the
> same time
> 1.a) For each clock cycle, a bit is sent and a bit is received at the
> same time (e.g. Raspberry Pi SPI)
> 1.b) For the first 8 clock cycles, a bit is sent. After that, for each
> clock cycle, a bit is sent and a bit is received at the same time (e.g.
> AMD SB600 and successors)
> 1.c) After a defined number of clock cycles, a constant number of bits
> is ignored in both directions (e.g. 8 bits ignored for FAST READ
> command, supported by some controllers)
> 2.) Half-duplex controllers: Capable of either sending or receiving at
> any given time
> 2.a) Send-then-receive: A number of bits (>0) is sent, and after that a
> number of bits (>=0) is received (e.g. Intel SPI)
> 2.b) After a defined number of clock cycles, a constant number of bits
> is ignored in both directions (e.g. 8 bits ignored for FAST READ
> command, supported by some controllers)

I think from this perspective, the coreboot API should always assume 1.a.
All other controllers need to interpret the API requests that way and try
to fulfill them with what they have. For example, that AMD controller that
can do full duplex for all but the first 8 cycles (1.b) would have to have
a custom xfer_vector() implementation that checks if it was given two
vectors, where the first one has 8 bytes out with no bytes in, and the
second one has N bytes out and N bytes in. It could fulfill that, and for
other requests that it can't fulfill it would have to return an error.

> Programming perspective:
> 1. Single-transaction programming: Send+receive count as well as send
> buffer and optionally receive buffer supplied in a single call, CS (chip
> select) happens automatically at the beginning and end of the transaction

This model can't use the normal SPI API at all and instead has to implement
a custom flash_probe() function where it can install its own spi_flash_ops
object. We have some controllers like that already.

> 2. Separate CS programming, single-transaction buffer handling: Activate
> CS, send+receive count as well as send buffer and optionally receive
> buffer supplied in a single call, deactivate CS

This one can use the SPI API and only has to implement xfer().
spi-generic.c will automatically turn spi_xfer_vector() calls into multiple
invocations of xfer() for it.

> 3. Separate CS programming, two-transaction buffer handling: Activate
> CS, send count+buffer in one call, optional receive count+buffer in one
> call, deactivate CS
> 4. Separate CS programming, multi-transaction buffer handling: Activate
> CS, send count+buffer in multiple calls, optional receive count+buffer
> in multiple calls, deactivate CS

Both of these can use the SPI API but need to implement their own
xfer_vector(). In there they could check whether the vectors they were
given match the transaction model they support, and if not they would have
to return an error. They should not implement the normal xfer() callback at
all, unless they can support a transaction with a single buffer.

> >> An SPI ROM read transaction packed into a single xfer() call (command
in
> >> dout, data read into din) would work fine with the second
interpretation
> >> but return garbage (first 4 bytes dropped and everything else shifted)
with
> >> the first interpretation.

> The other way round: It would return 4 bytes of garbage, then everything
> but the last 4 bytes shifted, and the last 4 bytes woud be dropped.

Right, thanks, it's that way around.