On 08/06/15 14:35, Stefan Hajnoczi wrote:
On Thu, Aug 6, 2015 at 12:02 PM, Marc MarĂ markmb@redhat.com wrote:
- if (qemu_cfg_dma_enabled()) {
QemuCfgDmaAccess access;
access.address = (u64)(u32)buf;
access.length = len;
access.control = QEMU_CFG_DMA_CTL_READ;
/*
* The out is done before the write of the variables on memory. This
* causes misread on the QEMU side.
*/
barrier();
outl((u32)&access, PORT_QEMU_CFG_DMA_ADDR);
I thought PORT_QEMU_CFG_DMA_ADDR is a 64-bit register according to the spec you posted?
while(access.length != 0 && !(access.control & QEMU_CFG_DMA_CTL_ERROR));
Either the field accesses need to be marked volatile, or a barrier is needed to force the compiler to reload these register from memory each iteration of the loop.
Better yet, I would recommend declaring (and here, also defining) "access" as "volatile".
volatile QemuCfgDmaAccess access;
I think the problem is that there are no "sequence points" (according to the C language specification) in this loop, so the compiler may assume that access.length and access.control will not change. https://en.wikipedia.org/wiki/Sequence_point
The loop does contain sequence points; three in total.
Appendix C of the ISO C spec gives an informative summary on the sequence points (with references to the individual normative passages). Quoting from appendix C:
1 The following are the sequence points described in 5.1.2.3:
[...]
- The end of the first operand of the following operators: logical AND && (6.5.13); [...]
[...]
- The end of a full expression: [...] the expression in an expression statement (6.8.3); [...] the controlling expression of a while or do statement (6.8.5); [...]
- Here we have the controlling expression of a while statement. - In that expression, we also have a logical AND operator. - And the null statement in the loop body is an expression statement.
The lack of sequence points is not the problem (we do have sequence points). The problem is that without "volatile", the compiler might optimize out the accesses, because it deduces that the expressions produce no side effects.
5.1.2.3 Program execution
[...]
2 Accessing a volatile object, [...] are all side effects, which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. [...]
3 In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).
[...]
5 The least requirements on a conforming implementation are:
- At sequence points, volatile objects are stable in the sense that previous accesses are complete and subsequent accesses have not yet occurred. [...]
[...]
6.7.3 Type qualifiers
[...]
6 An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3. Furthermore, at every sequence point the value last stored in the object shall agree with that prescribed by the abstract machine, except as modified by the unknown factors mentioned previously. [footnote 116] What constitutes an access to an object that has volatile-qualified type is implementation- defined.
Footnote 116 (informative):
A volatile declaration may be used to describe an object corresponding to a memory-mapped input/output port or an object accessed by an asynchronously interrupting function. Actions on objects so declared shall not be "optimized out" by an implementation or reordered except as permitted by the rules for evaluating expressions.
So, this covers both the initial population of "access", and the later reads.
(If "access" is declared "volatile", then even the current barrier() can be removed, which is otherwise
src/types.h:#define barrier() __asm__ __volatile__("": : :"memory")
ie. a compiler barrier.)
Thanks Laszlo