Hi,
Again, sorry for taking so long, but I just don't send stuff out without looking through it. This is meant to go into Michael's PCI branch, if it gets merged.
Some of the changes include:
- some fixes (one thanks to David Gibson) and cleanups
- macro magic for exporting clones of the DMA interface (e.g. pci_memory_read()); I hope it isn't too much of a stretch
- we now use pci_memory_*() in most places where PCI devices are involved; luckily we don't need unaligned accesses anymore
- some attempt at signaling target aborts, though that doesn't seem to be completely implemented in the PCI layer / devices yet
- PCI ids are defined in hw/amd_iommu.c until they get merged into Linux
Also, I can't address every request to extend the API for doing this or that more comfortably. I understand there may be corner cases, but may I suggest merging it (maybe into a separate branch related to mst's PCI tree) so that everybody can deal with it? This is still labeled RFC, but if you think it's ready it can be merged.
I hope most of the important issues have been dealt with. I'll post the SeaBIOS patches soon (though I think you can give it a spin with the old ones, if you need). I'll also take care of submitting PCI ids to be merged into Linux.
In any case, let me know what you think. I hope I didn't forget to Cc someone.
Cheers, Eduard
Eduard - Gabriel Munteanu (13):
  Generic DMA memory access interface
  pci: add IOMMU support via the generic DMA layer
  AMD IOMMU emulation
  ide: use the DMA memory access interface for PCI IDE controllers
  rtl8139: use the DMA memory access interface
  eepro100: use the DMA memory access interface
  ac97: use the DMA memory access interface
  es1370: use the DMA memory access interface
  e1000: use the DMA memory access interface
  lsi53c895a: use the DMA memory access interface
  pcnet: use the DMA memory access interface
  usb-uhci: use the DMA memory access interface
  usb-ohci: use the DMA memory access interface
 Makefile.target    |    2 +-
 dma-helpers.c      |   23 ++-
 dma.h              |    4 +-
 hw/ac97.c          |    6 +-
 hw/amd_iommu.c     |  712 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/dma_rw.c        |  155 ++++++++++++
 hw/dma_rw.h        |  217 ++++++++++++++++
 hw/e1000.c         |   27 ++-
 hw/eepro100.c      |   95 ++++----
 hw/es1370.c        |    4 +-
 hw/ide/ahci.c      |    3 +-
 hw/ide/internal.h  |    1 +
 hw/ide/macio.c     |    4 +-
 hw/ide/pci.c       |   18 +-
 hw/lsi53c895a.c    |   24 +-
 hw/pc.c            |    2 +
 hw/pci.c           |    7 +
 hw/pci.h           |    9 +
 hw/pci_internals.h |    1 +
 hw/pcnet-pci.c     |    5 +-
 hw/rtl8139.c       |   98 ++++----
 hw/usb-ohci.c      |   46 +++--
 hw/usb-uhci.c      |   26 +-
 23 files changed, 1324 insertions(+), 165 deletions(-)
 create mode 100644 hw/amd_iommu.c
 create mode 100644 hw/dma_rw.c
 create mode 100644 hw/dma_rw.h
This introduces replacements for memory access functions like cpu_physical_memory_read(). The new interface can handle address translation and access checking through an IOMMU.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 Makefile.target |    2 +-
 hw/dma_rw.c     |  155 +++++++++++++++++++++++++++
 hw/dma_rw.h     |  217 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 373 insertions(+), 1 deletions(-)
 create mode 100644 hw/dma_rw.c
 create mode 100644 hw/dma_rw.h
diff --git a/Makefile.target b/Makefile.target
index 21f864a..ee0c80d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -224,7 +224,7 @@ obj-i386-y += cirrus_vga.o apic.o ioapic.o piix_pci.o
 obj-i386-y += vmport.o
 obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o
 obj-i386-y += debugcon.o multiboot.o
-obj-i386-y += pc_piix.o kvmclock.o
+obj-i386-y += pc_piix.o kvmclock.o dma_rw.o
 obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
 
 # shared objects
diff --git a/hw/dma_rw.c b/hw/dma_rw.c
new file mode 100644
index 0000000..824db83
--- /dev/null
+++ b/hw/dma_rw.c
@@ -0,0 +1,155 @@
+/*
+ * Generic DMA memory access interface.
+ *
+ * Copyright (c) 2011 Eduard - Gabriel Munteanu
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "dma_rw.h"
+#include "range.h"
+
+static void dma_register_memory_map(DMADevice *dev,
+                                    void *buffer,
+                                    dma_addr_t addr,
+                                    dma_addr_t len,
+                                    DMAInvalidateMapFunc *invalidate,
+                                    void *invalidate_opaque)
+{
+    DMAMemoryMap *map;
+
+    map = qemu_malloc(sizeof(DMAMemoryMap));
+    map->buffer = buffer;
+    map->addr = addr;
+    map->len = len;
+    map->invalidate = invalidate;
+    map->invalidate_opaque = invalidate_opaque;
+
+    QLIST_INSERT_HEAD(&dev->mmu->memory_maps, map, list);
+}
+
+static void dma_unregister_memory_map(DMADevice *dev,
+                                      void *buffer,
+                                      dma_addr_t len)
+{
+    DMAMemoryMap *map;
+
+    QLIST_FOREACH(map, &dev->mmu->memory_maps, list) {
+        if (map->buffer == buffer && map->len == len) {
+            QLIST_REMOVE(map, list);
+            free(map);
+        }
+    }
+}
+
+void dma_invalidate_memory_range(DMADevice *dev,
+                                 dma_addr_t addr,
+                                 dma_addr_t len)
+{
+    DMAMemoryMap *map;
+
+    QLIST_FOREACH(map, &dev->mmu->memory_maps, list) {
+        if (ranges_overlap(addr, len, map->addr, map->len)) {
+            map->invalidate(map->invalidate_opaque);
+            QLIST_REMOVE(map, list);
+            free(map);
+        }
+    }
+}
+
+void *dma_memory_map(DMADevice *dev,
+                     DMAInvalidateMapFunc *cb,
+                     void *opaque,
+                     dma_addr_t addr,
+                     dma_addr_t *len,
+                     int is_write)
+{
+    int err;
+    target_phys_addr_t paddr, plen;
+    void *buf;
+
+    if (!dev || !dev->mmu) {
+        return cpu_physical_memory_map(addr, len, is_write);
+    }
+
+    plen = *len;
+    err = dev->mmu->translate(dev, addr, &paddr, &plen, is_write);
+    if (err) {
+        return NULL;
+    }
+
+    /*
+     * If this is true, the virtual region is contiguous,
+     * but the translated physical region isn't. We just
+     * clamp *len, much like cpu_physical_memory_map() does.
+     */
+    if (plen < *len) {
+        *len = plen;
+    }
+
+    buf = cpu_physical_memory_map(paddr, len, is_write);
+
+    /* We treat maps as remote TLBs to cope with stuff like AIO. */
+    if (cb) {
+        dma_register_memory_map(dev, buf, addr, *len, cb, opaque);
+    }
+
+    return buf;
+}
+
+void dma_memory_unmap(DMADevice *dev,
+                      void *buffer,
+                      dma_addr_t len,
+                      int is_write,
+                      dma_addr_t access_len)
+{
+    cpu_physical_memory_unmap(buffer, len, is_write, access_len);
+    if (dev && dev->mmu) {
+        dma_unregister_memory_map(dev, buffer, len);
+    }
+}
+
+void dma_memory_rw_iommu(DMADevice *dev,
+                         dma_addr_t addr,
+                         void *buf,
+                         dma_addr_t len,
+                         int is_write)
+{
+    dma_addr_t paddr, plen;
+    int err;
+
+    while (len) {
+        err = dev->mmu->translate(dev, addr, &paddr, &plen, is_write);
+        if (err) {
+            return;
+        }
+
+        /* The translation might be valid for larger regions. */
+        if (plen > len) {
+            plen = len;
+        }
+
+        cpu_physical_memory_rw(paddr, buf, plen, is_write);
+
+        len -= plen;
+        addr += plen;
+        buf += plen;
+    }
+}
diff --git a/hw/dma_rw.h b/hw/dma_rw.h
new file mode 100644
index 0000000..39482cb
--- /dev/null
+++ b/hw/dma_rw.h
@@ -0,0 +1,217 @@
+#ifndef DMA_RW_H
+#define DMA_RW_H
+
+#include "qemu-common.h"
+
+typedef uint64_t dma_addr_t;
+
+typedef struct DMAMmu DMAMmu;
+typedef struct DMADevice DMADevice;
+typedef struct DMAMemoryMap DMAMemoryMap;
+
+typedef int DMATranslateFunc(DMADevice *dev,
+                             dma_addr_t addr,
+                             dma_addr_t *paddr,
+                             dma_addr_t *len,
+                             int is_write);
+
+typedef void DMAInvalidateMapFunc(void *);
+
+struct DMAMmu {
+    DeviceState *iommu;
+    DMATranslateFunc *translate;
+    QLIST_HEAD(memory_maps, DMAMemoryMap) memory_maps;
+};
+
+struct DMADevice {
+    DMAMmu *mmu;
+};
+
+struct DMAMemoryMap {
+    void *buffer;
+    dma_addr_t addr;
+    dma_addr_t len;
+    DMAInvalidateMapFunc *invalidate;
+    void *invalidate_opaque;
+
+    QLIST_ENTRY(DMAMemoryMap) list;
+};
+
+void dma_memory_rw_iommu(DMADevice *dev,
+                         dma_addr_t addr,
+                         void *buf,
+                         dma_addr_t len,
+                         int is_write);
+
+static inline void dma_memory_rw(DMADevice *dev,
+                                 dma_addr_t addr,
+                                 void *buf,
+                                 dma_addr_t len,
+                                 int is_write)
+{
+    /*
+     * Fast-path non-iommu.
+     * More importantly, makes it obvious what this function does.
+     */
+    if (!dev || !dev->mmu) {
+        cpu_physical_memory_rw(addr, buf, len, is_write);
+        return;
+    }
+
+    dma_memory_rw_iommu(dev, addr, buf, len, is_write);
+}
+
+static inline void dma_memory_read(DMADevice *dev,
+                                   dma_addr_t addr,
+                                   void *buf,
+                                   dma_addr_t len)
+{
+    dma_memory_rw(dev, addr, buf, len, 0);
+}
+
+static inline void dma_memory_write(DMADevice *dev,
+                                    dma_addr_t addr,
+                                    const void *buf,
+                                    dma_addr_t len)
+{
+    dma_memory_rw(dev, addr, (void *) buf, len, 1);
+}
+
+void *dma_memory_map(DMADevice *dev,
+                     DMAInvalidateMapFunc *cb,
+                     void *opaque,
+                     dma_addr_t addr,
+                     dma_addr_t *len,
+                     int is_write);
+void dma_memory_unmap(DMADevice *dev,
+                      void *buffer,
+                      dma_addr_t len,
+                      int is_write,
+                      dma_addr_t access_len);
+
+void dma_invalidate_memory_range(DMADevice *dev,
+                                 dma_addr_t addr,
+                                 dma_addr_t len);
+
+
+/*
+ * All the following macro magic tries is to
+ * achieve some type safety and avoid duplication.
+ */
+
+#define DEFINE_DMA_LD(prefix, suffix, devtype, dmafield, size)          \
+static inline uint##size##_t                                            \
+dma_ld##suffix(DMADevice *dev, dma_addr_t addr)                         \
+{                                                                       \
+    int err;                                                            \
+    dma_addr_t paddr, plen;                                             \
+                                                                        \
+    if (!dev || !dev->mmu) {                                            \
+        return ld##suffix##_phys(addr);                                 \
+    }                                                                   \
+                                                                        \
+    err = dev->mmu->translate(dev, addr, &paddr, &plen, 0);             \
+    if (err || (plen < size / 8)) {                                     \
+        return 0;                                                       \
+    }                                                                   \
+                                                                        \
+    return ld##suffix##_phys(paddr);                                    \
+}
+
+#define DEFINE_DMA_ST(prefix, suffix, devtype, dmafield, size)          \
+static inline void                                                      \
+dma_st##suffix(DMADevice *dev, dma_addr_t addr, uint##size##_t val)     \
+{                                                                       \
+    int err;                                                            \
+    target_phys_addr_t paddr, plen;                                     \
+                                                                        \
+    if (!dev || !dev->mmu) {                                            \
+        st##suffix##_phys(addr, val);                                   \
+        return;                                                         \
+    }                                                                   \
+    err = dev->mmu->translate(dev, addr, &paddr, &plen, 1);             \
+    if (err || (plen < size / 8)) {                                     \
+        return;                                                         \
+    }                                                                   \
+                                                                        \
+    st##suffix##_phys(paddr, val);                                      \
+}
+
+#define DEFINE_DMA_MEMORY_RW(prefix, devtype, dmafield)
+#define DEFINE_DMA_MEMORY_READ(prefix, devtype, dmafield)
+#define DEFINE_DMA_MEMORY_WRITE(prefix, devtype, dmafield)
+
+#define DEFINE_DMA_OPS(prefix, devtype, dmafield)                       \
+    /*                                                                  \
+     * FIXME: find a way to handle these:                               \
+     * DEFINE_DMA_LD(prefix, ub, devtype, dmafield, 8)                  \
+     * DEFINE_DMA_LD(prefix, uw, devtype, dmafield, 16)                 \
+     */                                                                 \
+    DEFINE_DMA_LD(prefix, l, devtype, dmafield, 32)                     \
+    DEFINE_DMA_LD(prefix, q, devtype, dmafield, 64)                     \
+                                                                        \
+    DEFINE_DMA_ST(prefix, b, devtype, dmafield, 8)                      \
+    DEFINE_DMA_ST(prefix, w, devtype, dmafield, 16)                     \
+    DEFINE_DMA_ST(prefix, l, devtype, dmafield, 32)                     \
+    DEFINE_DMA_ST(prefix, q, devtype, dmafield, 64)                     \
+                                                                        \
+    DEFINE_DMA_MEMORY_RW(prefix, devtype, dmafield)                     \
+    DEFINE_DMA_MEMORY_READ(prefix, devtype, dmafield)                   \
+    DEFINE_DMA_MEMORY_WRITE(prefix, devtype, dmafield)
+
+DEFINE_DMA_OPS(UNUSED, UNUSED, UNUSED)
+
+/*
+ * From here on, various bus interfaces can use DEFINE_DMA_OPS
+ * to summon their own personalized clone of the DMA interface.
+ */
+
+#undef DEFINE_DMA_LD
+#undef DEFINE_DMA_ST
+#undef DEFINE_DMA_MEMORY_RW
+#undef DEFINE_DMA_MEMORY_READ
+#undef DEFINE_DMA_MEMORY_WRITE
+
+#define DEFINE_DMA_LD(prefix, suffix, devtype, dma_field, size)         \
+static inline uint##size##_t                                            \
+prefix##_ld##suffix(devtype *dev, dma_addr_t addr)                      \
+{                                                                       \
+    return dma_ld##suffix(&dev->dma_field, addr);                       \
+}
+
+#define DEFINE_DMA_ST(prefix, suffix, devtype, dma_field, size)         \
+static inline void                                                      \
+prefix##_st##suffix(devtype *dev, dma_addr_t addr, uint##size##_t val)  \
+{                                                                       \
+    dma_st##suffix(&dev->dma_field, addr, val);                         \
+}
+
+#define DEFINE_DMA_MEMORY_RW(prefix, devtype, dmafield)                 \
+static inline void prefix##_memory_rw(devtype *dev,                     \
+                                      dma_addr_t addr,                  \
+                                      void *buf,                        \
+                                      dma_addr_t len,                   \
+                                      int is_write)                     \
+{                                                                       \
+    dma_memory_rw(&dev->dmafield, addr, buf, len, is_write);            \
+}
+
+#define DEFINE_DMA_MEMORY_READ(prefix, devtype, dmafield)               \
+static inline void prefix##_memory_read(devtype *dev,                   \
+                                        dma_addr_t addr,                \
+                                        void *buf,                      \
+                                        dma_addr_t len)                 \
+{                                                                       \
+    dma_memory_read(&dev->dmafield, addr, buf, len);                    \
+}
+
+#define DEFINE_DMA_MEMORY_WRITE(prefix, devtype, dmafield)              \
+static inline void prefix##_memory_write(devtype *dev,                  \
+                                         dma_addr_t addr,               \
+                                         const void *buf,               \
+                                         dma_addr_t len)                \
+{                                                                       \
+    dma_memory_write(&dev->dmafield, addr, buf, len);                   \
+}
+
+#endif
On 05/31/2011 06:38 PM, Eduard - Gabriel Munteanu wrote:
+static inline void dma_memory_rw(DMADevice *dev,
+                                 dma_addr_t addr,
+                                 void *buf,
+                                 dma_addr_t len,
+                                 int is_write)
I don't think this needs to be inline...
+{
+    /*
+     * Fast-path non-iommu.
+     * More importantly, makes it obvious what this function does.
+     */
+    if (!dev || !dev->mmu) {
+        cpu_physical_memory_rw(addr, buf, len, is_write);
+        return;
+    }
... because you'll never be able to eliminate the if or the calls. You might as well make the overall code smaller by taking the entire function out of line.
+#define DEFINE_DMA_LD(prefix, suffix, devtype, dmafield, size)  \
+static inline uint##size##_t                                    \
+dma_ld##suffix(DMADevice *dev, dma_addr_t addr)                 \
+{                                                               \
+    int err;                                                    \
+    dma_addr_t paddr, plen;                                     \
+                                                                \
+    if (!dev || !dev->mmu) {                                    \
+        return ld##suffix##_phys(addr);                         \
+    }                                                           \
Similarly for all the ld/st functions.
+#define DEFINE_DMA_MEMORY_RW(prefix, devtype, dmafield)
+#define DEFINE_DMA_MEMORY_READ(prefix, devtype, dmafield)
+#define DEFINE_DMA_MEMORY_WRITE(prefix, devtype, dmafield)
+#define DEFINE_DMA_OPS(prefix, devtype, dmafield) \
I think this is a bit over the top, really.
err = dev->mmu->translate(dev, addr, &paddr, &plen, is_write);
I see you didn't take my suggestion for using an opaque callback pointer. Really and truly, I won't be able to use this as-is for Alpha.
r~
On 06/01/2011 05:01 PM, Richard Henderson wrote:
err = dev->mmu->translate(dev, addr,&paddr,&plen, is_write);
I see you didn't take my suggestion for using an opaque callback pointer. Really and truly, I won't be able to use this as-is for Alpha.
Rather than opaques, please pass the DMA engine itself and use container_of().
We should be removing opaques, not adding them.
On 06/01/2011 07:29 AM, Avi Kivity wrote:
On 06/01/2011 05:01 PM, Richard Henderson wrote:
err = dev->mmu->translate(dev, addr,&paddr,&plen, is_write);
I see you didn't take my suggestion for using an opaque callback pointer. Really and truly, I won't be able to use this as-is for Alpha.
Rather than opaques, please pass the DMA engine itself and use container_of().
The dma engine object is currently sitting in the PCIBus structure. Which is private, and can't be extended by a host bridge implementation.
The entire code could be re-arranged, true, but please suggest something reasonable.
We should be removing opaques, not adding them.
See my followup elsewhere. Opaques *can* be cleaner than upcasting, particularly if there are too many hoops through which to jump.
r~
On Wed, Jun 01, 2011 at 08:16:44AM -0700, Richard Henderson wrote:
On 06/01/2011 07:29 AM, Avi Kivity wrote:
On 06/01/2011 05:01 PM, Richard Henderson wrote:
err = dev->mmu->translate(dev, addr,&paddr,&plen, is_write);
I see you didn't take my suggestion for using an opaque callback pointer. Really and truly, I won't be able to use this as-is for Alpha.
Rather than opaques, please pass the DMA engine itself and use container_of().
The dma engine object is currently sitting in the PCIBus structure. Which is private, and can't be extended by a host bridge implementation.
The entire code could be re-arranged, true, but please suggest something reasonable.
We should be removing opaques, not adding them.
See my followup elsewhere. Opaques *can* be cleaner than upcasting, particularly if there are too many hoops through which to jump.
So, in the meantime, I've also done a version of Eduard's earlier patches, with added support for the PAPR hypervisor managed IOMMU.
I have also significantly reworked how the structure lookup works, partly because in my case I'm looking at IOMMU translation for non-PCI devices, but I think it may also address your concerns. I'm still using upcasts, but there are fewer steps from the device to the IOMMU state.
I've been sick and haven't had a chance to merge my stuff with Eduard's changes. I'll post them anyway, as another discussion point.
On Wed, Jun 01, 2011 at 07:01:42AM -0700, Richard Henderson wrote:
On 05/31/2011 06:38 PM, Eduard - Gabriel Munteanu wrote:
+static inline void dma_memory_rw(DMADevice *dev,
+                                 dma_addr_t addr,
+                                 void *buf,
+                                 dma_addr_t len,
+                                 int is_write)
I don't think this needs to be inline...
+{
+    /*
+     * Fast-path non-iommu.
+     * More importantly, makes it obvious what this function does.
+     */
+    if (!dev || !dev->mmu) {
+        cpu_physical_memory_rw(addr, buf, len, is_write);
+        return;
+    }
... because you'll never be able to eliminate the if or the calls. You might as well make the overall code smaller by taking the entire function out of line.
+#define DEFINE_DMA_LD(prefix, suffix, devtype, dmafield, size)  \
+static inline uint##size##_t                                    \
+dma_ld##suffix(DMADevice *dev, dma_addr_t addr)                 \
+{                                                               \
+    int err;                                                    \
+    dma_addr_t paddr, plen;                                     \
+                                                                \
+    if (!dev || !dev->mmu) {                                    \
+        return ld##suffix##_phys(addr);                         \
+    }                                                           \
Similarly for all the ld/st functions.
The idea was to get to the fast path as soon as possible. I'm not really concerned about the case where there's an IOMMU present, since translation/checking does a lot more work. But other people might be worried about that additional function call when there's no IOMMU.
And these functions are quite small anyway.
Thoughts, anybody else?
+#define DEFINE_DMA_MEMORY_RW(prefix, devtype, dmafield)
+#define DEFINE_DMA_MEMORY_READ(prefix, devtype, dmafield)
+#define DEFINE_DMA_MEMORY_WRITE(prefix, devtype, dmafield)
+#define DEFINE_DMA_OPS(prefix, devtype, dmafield) \
I think this is a bit over the top, really.
Yeah, it's a bit unconventional, but why do you think that?
The main selling point is there are more chances to screw up if every bus layer implements these manually. And it's really convenient, especially if we get to add another ld/st.
I do have one concern about it, though: it might increase compile time due to additional preprocessing work. I haven't done any benchmarks on that. But apart from this, are there any other objections?
err = dev->mmu->translate(dev, addr, &paddr, &plen, is_write);
I see you didn't take my suggestion for using an opaque callback pointer. Really and truly, I won't be able to use this as-is for Alpha.
If I understand correctly you need some sort of shared state between IOMMUs or units residing on different buses. Then you should be able to get to it even with this API, just like I do with my AMD IOMMU state by upcasting. It doesn't seem to matter whether you've got an opaque, that opaque could very well be reachable by upcasting.
Did I get this wrong?
Eduard
r~
On 06/01/2011 07:52 AM, Eduard - Gabriel Munteanu wrote:
The main selling point is there are more chances to screw up if every bus layer implements these manually. And it's really convenient, especially if we get to add another ld/st.
If we drop the ld/st, we're talking about 5 lines for every bus layer.
If I recall, there was just the one driver that actually uses the ld/st interface; most used the read/write interface.
If I understand correctly you need some sort of shared state between IOMMUs or units residing on different buses. Then you should be able to get to it even with this API, just like I do with my AMD IOMMU state by upcasting. It doesn't seem to matter whether you've got an opaque, that opaque could very well be reachable by upcasting.
Did I get this wrong?
Can you honestly tell me that
+static int amd_iommu_translate(DMADevice *dev,
+                               dma_addr_t addr,
+                               dma_addr_t *paddr,
+                               dma_addr_t *len,
+                               int is_write)
+{
+    PCIDevice *pci_dev = container_of(dev, PCIDevice, dma);
+    PCIDevice *iommu_dev = DO_UPCAST(PCIDevice, qdev, dev->mmu->iommu);
+    AMDIOMMUState *s = DO_UPCAST(AMDIOMMUState, dev, iommu_dev);
THREE (3) upcasts is a sane way to write maintainable software? The margin for error here is absolutely enormous.
If you had just passed in that AMDIOMMUState* as the opaque value, it would be trivial to look at the initialization statement and the callback function to verify that the right value is being passed.
r~
On Wed, Jun 01, 2011 at 08:09:29AM -0700, Richard Henderson wrote:
On 06/01/2011 07:52 AM, Eduard - Gabriel Munteanu wrote:
The main selling point is there are more chances to screw up if every bus layer implements these manually. And it's really convenient, especially if we get to add another ld/st.
If we drop the ld/st, we're talking about 5 lines for every bus layer.
If I recall, there was just the one driver that actually uses the ld/st interface; most used the read/write interface.
Hm, indeed there seem to be far fewer uses of those now; actually, my patches don't seem to be using them at all.
What do you guys think? Will these go away completely?
If I understand correctly you need some sort of shared state between IOMMUs or units residing on different buses. Then you should be able to get to it even with this API, just like I do with my AMD IOMMU state by upcasting. It doesn't seem to matter whether you've got an opaque, that opaque could very well be reachable by upcasting.
Did I get this wrong?
Can you honestly tell me that
+static int amd_iommu_translate(DMADevice *dev,
+                               dma_addr_t addr,
+                               dma_addr_t *paddr,
+                               dma_addr_t *len,
+                               int is_write)
+{
+    PCIDevice *pci_dev = container_of(dev, PCIDevice, dma);
+    PCIDevice *iommu_dev = DO_UPCAST(PCIDevice, qdev, dev->mmu->iommu);
+    AMDIOMMUState *s = DO_UPCAST(AMDIOMMUState, dev, iommu_dev);
THREE (3) upcasts is a sane way to write maintainable software? The margin for error here is absolutely enormous.
If you had just passed in that AMDIOMMUState* as the opaque value, it would be trivial to look at the initialization statement and the callback function to verify that the right value is being passed.
Maybe it's not nice, but you're missing the fact that upcasting gives you some type safety. With opaques you have none. Plus you also get the PCI device that made the call while you're at it.
Eduard
r~
On 06/01/2011 08:35 AM, Eduard - Gabriel Munteanu wrote:
Maybe it's not nice, but you're missing the fact upcasting gives you some type safety. With opaques you have none.
Lol. Do you understand what container_of does? This is not dynamic_cast<> with RTTI.
You can put any type name in there that you like, so long as it has a field name to match. The type of the field you give doesn't even have to match the type of the pointer that you pass in.
Type safety this is not.
r~
On Wed, Jun 01, 2011 at 08:45:56AM -0700, Richard Henderson wrote:
On 06/01/2011 08:35 AM, Eduard - Gabriel Munteanu wrote:
Maybe it's not nice, but you're missing the fact upcasting gives you some type safety. With opaques you have none.
Lol. Do you understand what container_of does? This is not dynamic_cast<> with RTTI.
You can put any type name in there that you like, so long as it has a field name to match. The type of the field you give doesn't even have to match the type of the pointer that you pass in.
Uh, if that's true, that's a bug in the container_of implementation. The ccan container_of implementation, for example, certainly does check that the given field has type matching the pointer.
IOMMUs can now be hooked onto the PCI bus. This makes use of the generic DMA layer.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 hw/pci.c           |    7 +++++++
 hw/pci.h           |    9 +++++++++
 hw/pci_internals.h |    1 +
 3 files changed, 17 insertions(+), 0 deletions(-)
diff --git a/hw/pci.c b/hw/pci.c
index 0875654..7c8762c 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -745,6 +745,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
         return NULL;
     }
     pci_dev->bus = bus;
+    pci_dev->dma.mmu = &bus->mmu;
     pci_dev->devfn = devfn;
     pstrcpy(pci_dev->name, sizeof(pci_dev->name), name);
     pci_dev->irq_state = 0;
@@ -2161,3 +2162,9 @@ int pci_qdev_find_device(const char *id, PCIDevice **pdev)
 
     return rc;
 }
+
+void pci_register_iommu(PCIDevice *dev, DMATranslateFunc *translate)
+{
+    dev->bus->mmu.iommu = &dev->qdev;
+    dev->bus->mmu.translate = translate;
+}
diff --git a/hw/pci.h b/hw/pci.h
index c6a6eb6..f3b51ec 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -5,6 +5,7 @@
 #include "qobject.h"
 #include "qdev.h"
+#include "dma_rw.h"
 
 /* PCI includes legacy ISA access. */
 #include "isa.h"
@@ -129,6 +130,10 @@ enum {
 
 struct PCIDevice {
     DeviceState qdev;
+
+    /* For devices which do DMA. */
+    DMADevice dma;
+
     /* PCI config space */
     uint8_t *config;
 
@@ -271,6 +276,8 @@ void pci_bridge_update_mappings(PCIBus *b);
 
 void pci_device_deassert_intx(PCIDevice *dev);
 
+void pci_register_iommu(PCIDevice *dev, DMATranslateFunc *translate);
+
 static inline void pci_set_byte(uint8_t *config, uint8_t val)
 {
@@ -475,4 +482,6 @@ static inline uint32_t pci_config_size(const PCIDevice *d)
     return pci_is_express(d) ? PCIE_CONFIG_SPACE_SIZE : PCI_CONFIG_SPACE_SIZE;
 }
 
+DEFINE_DMA_OPS(pci, PCIDevice, dma)
+
 #endif
diff --git a/hw/pci_internals.h b/hw/pci_internals.h
index fbe1866..6452e8c 100644
--- a/hw/pci_internals.h
+++ b/hw/pci_internals.h
@@ -16,6 +16,7 @@ extern struct BusInfo pci_bus_info;
 
 struct PCIBus {
     BusState qbus;
+    DMAMmu mmu;
     uint8_t devfn_min;
     pci_set_irq_fn set_irq;
     pci_map_irq_fn map_irq;
This introduces emulation for the AMD IOMMU, described in "AMD I/O Virtualization Technology (IOMMU) Specification".
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 Makefile.target |    2 +-
 hw/amd_iommu.c  |  712 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/pc.c         |    2 +
 3 files changed, 715 insertions(+), 1 deletions(-)
 create mode 100644 hw/amd_iommu.c
diff --git a/Makefile.target b/Makefile.target index ee0c80d..5f9c868 100644 --- a/Makefile.target +++ b/Makefile.target @@ -224,7 +224,7 @@ obj-i386-y += cirrus_vga.o apic.o ioapic.o piix_pci.o obj-i386-y += vmport.o obj-i386-y += device-hotplug.o pci-hotplug.o smbios.o wdt_ib700.o obj-i386-y += debugcon.o multiboot.o -obj-i386-y += pc_piix.o kvmclock.o dma_rw.o +obj-i386-y += pc_piix.o kvmclock.o dma_rw.o amd_iommu.o obj-i386-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o
# shared objects diff --git a/hw/amd_iommu.c b/hw/amd_iommu.c new file mode 100644 index 0000000..650a8f4 --- /dev/null +++ b/hw/amd_iommu.c @@ -0,0 +1,712 @@ +/* + * AMD IOMMU emulation + * + * Copyright (c) 2011 Eduard - Gabriel Munteanu + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. 
+ */ + +#include "pc.h" +#include "hw.h" +#include "pci.h" +#include "qlist.h" +#include "dma_rw.h" + +/* Capability registers */ +#define CAPAB_HEADER 0x00 +#define CAPAB_REV_TYPE 0x02 +#define CAPAB_FLAGS 0x03 +#define CAPAB_BAR_LOW 0x04 +#define CAPAB_BAR_HIGH 0x08 +#define CAPAB_RANGE 0x0C +#define CAPAB_MISC 0x10 + +#define CAPAB_SIZE 0x14 +#define CAPAB_REG_SIZE 0x04 + +/* Capability header data */ +#define CAPAB_FLAG_IOTLBSUP (1 << 0) +#define CAPAB_FLAG_HTTUNNEL (1 << 1) +#define CAPAB_FLAG_NPCACHE (1 << 2) +#define CAPAB_INIT_REV (1 << 3) +#define CAPAB_INIT_TYPE 3 +#define CAPAB_INIT_REV_TYPE (CAPAB_REV | CAPAB_TYPE) +#define CAPAB_INIT_FLAGS (CAPAB_FLAG_NPCACHE | CAPAB_FLAG_HTTUNNEL) +#define CAPAB_INIT_MISC ((64 << 15) | (48 << 8)) +#define CAPAB_BAR_MASK (~((1UL << 14) - 1)) + +/* MMIO registers */ +#define MMIO_DEVICE_TABLE 0x0000 +#define MMIO_COMMAND_BASE 0x0008 +#define MMIO_EVENT_BASE 0x0010 +#define MMIO_CONTROL 0x0018 +#define MMIO_EXCL_BASE 0x0020 +#define MMIO_EXCL_LIMIT 0x0028 +#define MMIO_COMMAND_HEAD 0x2000 +#define MMIO_COMMAND_TAIL 0x2008 +#define MMIO_EVENT_HEAD 0x2010 +#define MMIO_EVENT_TAIL 0x2018 +#define MMIO_STATUS 0x2020 + +#define MMIO_SIZE 0x4000 + +#define MMIO_DEVTAB_SIZE_MASK ((1ULL << 12) - 1) +#define MMIO_DEVTAB_BASE_MASK (((1ULL << 52) - 1) & ~MMIO_DEVTAB_SIZE_MASK) +#define MMIO_DEVTAB_ENTRY_SIZE 32 +#define MMIO_DEVTAB_SIZE_UNIT 4096 + +#define MMIO_CMDBUF_SIZE_BYTE (MMIO_COMMAND_BASE + 7) +#define MMIO_CMDBUF_SIZE_MASK 0x0F +#define MMIO_CMDBUF_BASE_MASK MMIO_DEVTAB_BASE_MASK +#define MMIO_CMDBUF_DEFAULT_SIZE 8 +#define MMIO_CMDBUF_HEAD_MASK (((1ULL << 19) - 1) & ~0x0F) +#define MMIO_CMDBUF_TAIL_MASK MMIO_EVTLOG_HEAD_MASK + +#define MMIO_EVTLOG_SIZE_BYTE (MMIO_EVENT_BASE + 7) +#define MMIO_EVTLOG_SIZE_MASK MMIO_CMDBUF_SIZE_MASK +#define MMIO_EVTLOG_BASE_MASK MMIO_CMDBUF_BASE_MASK +#define MMIO_EVTLOG_DEFAULT_SIZE MMIO_CMDBUF_DEFAULT_SIZE +#define MMIO_EVTLOG_HEAD_MASK (((1ULL << 19) - 1) & ~0x0F) +#define 
MMIO_EVTLOG_TAIL_MASK MMIO_EVTLOG_HEAD_MASK + +#define MMIO_EXCL_BASE_MASK MMIO_DEVTAB_BASE_MASK +#define MMIO_EXCL_ENABLED_MASK (1ULL << 0) +#define MMIO_EXCL_ALLOW_MASK (1ULL << 1) +#define MMIO_EXCL_LIMIT_MASK MMIO_DEVTAB_BASE_MASK +#define MMIO_EXCL_LIMIT_LOW 0xFFF + +#define MMIO_CONTROL_IOMMUEN (1ULL << 0) +#define MMIO_CONTROL_HTTUNEN (1ULL << 1) +#define MMIO_CONTROL_EVENTLOGEN (1ULL << 2) +#define MMIO_CONTROL_EVENTINTEN (1ULL << 3) +#define MMIO_CONTROL_COMWAITINTEN (1ULL << 4) +#define MMIO_CONTROL_CMDBUFEN (1ULL << 12) + +#define MMIO_STATUS_EVTLOG_OF (1ULL << 0) +#define MMIO_STATUS_EVTLOG_INTR (1ULL << 1) +#define MMIO_STATUS_COMWAIT_INTR (1ULL << 2) +#define MMIO_STATUS_EVTLOG_RUN (1ULL << 3) +#define MMIO_STATUS_CMDBUF_RUN (1ULL << 4) + +#define CMDBUF_ID_BYTE 0x07 +#define CMDBUF_ID_RSHIFT 4 +#define CMDBUF_ENTRY_SIZE 0x10 + +#define CMD_COMPLETION_WAIT 0x01 +#define CMD_INVAL_DEVTAB_ENTRY 0x02 +#define CMD_INVAL_IOMMU_PAGES 0x03 +#define CMD_INVAL_IOTLB_PAGES 0x04 +#define CMD_INVAL_INTR_TABLE 0x05 + +#define DEVTAB_ENTRY_SIZE 32 + +/* Device table entry bits 0:63 */ +#define DEV_VALID (1ULL << 0) +#define DEV_TRANSLATION_VALID (1ULL << 1) +#define DEV_MODE_MASK 0x7 +#define DEV_MODE_RSHIFT 9 +#define DEV_PT_ROOT_MASK 0xFFFFFFFFFF000 +#define DEV_PT_ROOT_RSHIFT 12 +#define DEV_PERM_SHIFT 61 +#define DEV_PERM_READ (1ULL << 61) +#define DEV_PERM_WRITE (1ULL << 62) + +/* Device table entry bits 64:127 */ +#define DEV_DOMAIN_ID_MASK ((1ULL << 16) - 1) +#define DEV_IOTLB_SUPPORT (1ULL << 17) +#define DEV_SUPPRESS_PF (1ULL << 18) +#define DEV_SUPPRESS_ALL_PF (1ULL << 19) +#define DEV_IOCTL_MASK (~3) +#define DEV_IOCTL_RSHIFT 20 +#define DEV_IOCTL_DENY 0 +#define DEV_IOCTL_PASSTHROUGH 1 +#define DEV_IOCTL_TRANSLATE 2 +#define DEV_CACHE (1ULL << 37) +#define DEV_SNOOP_DISABLE (1ULL << 38) +#define DEV_EXCL (1ULL << 39) + +/* Event codes and flags, as stored in the info field */ +#define EVENT_ILLEGAL_DEVTAB_ENTRY (0x1U << 24) +#define EVENT_IOPF (0x2U << 
24) +#define EVENT_IOPF_I (1U << 3) +#define EVENT_IOPF_PR (1U << 4) +#define EVENT_IOPF_RW (1U << 5) +#define EVENT_IOPF_PE (1U << 6) +#define EVENT_IOPF_RZ (1U << 7) +#define EVENT_IOPF_TR (1U << 8) +#define EVENT_DEV_TAB_HW_ERROR (0x3U << 24) +#define EVENT_PAGE_TAB_HW_ERROR (0x4U << 24) +#define EVENT_ILLEGAL_COMMAND_ERROR (0x5U << 24) +#define EVENT_COMMAND_HW_ERROR (0x6U << 24) +#define EVENT_IOTLB_INV_TIMEOUT (0x7U << 24) +#define EVENT_INVALID_DEV_REQUEST (0x8U << 24) + +#define EVENT_LEN 16 + +#define IOMMU_PERM_READ (1 << 0) +#define IOMMU_PERM_WRITE (1 << 1) +#define IOMMU_PERM_RW (IOMMU_PERM_READ | IOMMU_PERM_WRITE) + +/* FIXME: Remove these once they go upstream. */ +#define PCI_CLASS_SYSTEM_IOMMU 0x0806 +#define PCI_DEVICE_ID_AMD_IOMMU 0x0000 /* FIXME */ +#define PCI_CAP_ID_SEC 0x0F + +typedef struct AMDIOMMUState { + PCIDevice dev; + + int capab_offset; + unsigned char *capab; + + int mmio_index; + target_phys_addr_t mmio_addr; + unsigned char *mmio_buf; + int mmio_enabled; + + int enabled; + int ats_enabled; + + target_phys_addr_t devtab; + size_t devtab_len; + + target_phys_addr_t cmdbuf; + int cmdbuf_enabled; + size_t cmdbuf_len; + size_t cmdbuf_head; + size_t cmdbuf_tail; + int completion_wait_intr; + + target_phys_addr_t evtlog; + int evtlog_enabled; + int evtlog_intr; + target_phys_addr_t evtlog_len; + target_phys_addr_t evtlog_head; + target_phys_addr_t evtlog_tail; + + target_phys_addr_t excl_base; + target_phys_addr_t excl_limit; + int excl_enabled; + int excl_allow; +} AMDIOMMUState; + +typedef struct AMDIOMMUEvent { + uint16_t devfn; + uint16_t reserved; + uint16_t domid; + uint16_t info; + uint64_t addr; +} __attribute__((packed)) AMDIOMMUEvent; + +static void amd_iommu_completion_wait(AMDIOMMUState *s, + uint8_t *cmd) +{ + uint64_t addr; + + if (cmd[0] & 1) { + addr = le64_to_cpu(*(uint64_t *) cmd) & 0xFFFFFFFFFFFF8; + cpu_physical_memory_write(addr, cmd + 8, 8); + } + + if (cmd[0] & 2) { + s->mmio_buf[MMIO_STATUS] |= 
MMIO_STATUS_COMWAIT_INTR;
+    }
+}
+
+static void amd_iommu_invalidate_iotlb(AMDIOMMUState *s,
+                                       uint8_t *cmd)
+{
+    PCIDevice *dev;
+    PCIBus *bus = s->dev.bus;
+    int bus_num = pci_bus_num(bus);
+    int devfn = le16_to_cpu(*(uint16_t *) cmd);
+
+    dev = pci_find_device(bus, bus_num, devfn);
+    if (dev) {
+        dma_invalidate_memory_range(&dev->dma, 0, -1);
+    }
+}
+
+static void amd_iommu_cmdbuf_exec(AMDIOMMUState *s)
+{
+    uint8_t cmd[16];
+    int type;
+
+    cpu_physical_memory_read(s->cmdbuf + s->cmdbuf_head, cmd, 16);
+    type = cmd[CMDBUF_ID_BYTE] >> CMDBUF_ID_RSHIFT;
+    switch (type) {
+    case CMD_COMPLETION_WAIT:
+        amd_iommu_completion_wait(s, cmd);
+        break;
+    case CMD_INVAL_DEVTAB_ENTRY:
+        break;
+    case CMD_INVAL_IOMMU_PAGES:
+        break;
+    case CMD_INVAL_IOTLB_PAGES:
+        amd_iommu_invalidate_iotlb(s, cmd);
+        break;
+    case CMD_INVAL_INTR_TABLE:
+        break;
+    default:
+        break;
+    }
+}
+
+static void amd_iommu_cmdbuf_run(AMDIOMMUState *s)
+{
+    uint64_t *mmio_cmdbuf_head = (uint64_t *) (s->mmio_buf + MMIO_COMMAND_HEAD);
+
+    if (!s->cmdbuf_enabled) {
+        return;
+    }
+
+    /* Check if there's work to do. */
+    while (s->cmdbuf_head != s->cmdbuf_tail) {
+        /* Wrap head pointer. */
+        if (s->cmdbuf_head >= s->cmdbuf_len * CMDBUF_ENTRY_SIZE) {
+            s->cmdbuf_head = 0;
+        }
+
+        amd_iommu_cmdbuf_exec(s);
+
+        /* Increment head pointer.
*/ + s->cmdbuf_head += CMDBUF_ENTRY_SIZE; + } + + *mmio_cmdbuf_head = cpu_to_le64(s->cmdbuf_head); +} + +static uint32_t amd_iommu_mmio_buf_read(AMDIOMMUState *s, + size_t offset, + size_t size) +{ + ssize_t i; + uint32_t ret; + + if (!size) { + return 0; + } + + ret = s->mmio_buf[offset + size - 1]; + for (i = size - 2; i >= 0; i--) { + ret <<= 8; + ret |= s->mmio_buf[offset + i]; + } + + return ret; +} + +static void amd_iommu_mmio_buf_write(AMDIOMMUState *s, + size_t offset, + size_t size, + uint32_t val) +{ + size_t i; + + for (i = 0; i < size; i++) { + s->mmio_buf[offset + i] = val & 0xFF; + val >>= 8; + } +} + +static void amd_iommu_update_mmio(AMDIOMMUState *s, + target_phys_addr_t addr) +{ + size_t reg = addr & ~0x07; + uint64_t *base = (uint64_t *) &s->mmio_buf[reg]; + uint64_t val = le64_to_cpu(*base); + + switch (reg) { + case MMIO_CONTROL: + s->enabled = !!(val & MMIO_CONTROL_IOMMUEN); + s->ats_enabled = !!(val & MMIO_CONTROL_HTTUNEN); + s->evtlog_enabled = s->enabled && + !!(val & MMIO_CONTROL_EVENTLOGEN); + s->evtlog_intr = !!(val & MMIO_CONTROL_EVENTINTEN); + s->completion_wait_intr = !!(val & MMIO_CONTROL_COMWAITINTEN); + s->cmdbuf_enabled = s->enabled && + !!(val & MMIO_CONTROL_CMDBUFEN); + + /* Update status flags depending on the control register. 
*/ + if (s->cmdbuf_enabled) { + s->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_CMDBUF_RUN; + } else { + s->mmio_buf[MMIO_STATUS] &= ~MMIO_STATUS_CMDBUF_RUN; + } + if (s->evtlog_enabled) { + s->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_RUN; + } else { + s->mmio_buf[MMIO_STATUS] &= ~MMIO_STATUS_EVTLOG_RUN; + } + + amd_iommu_cmdbuf_run(s); + break; + case MMIO_DEVICE_TABLE: + s->devtab = (target_phys_addr_t) (val & MMIO_DEVTAB_BASE_MASK); + s->devtab_len = ((val & MMIO_DEVTAB_SIZE_MASK) + 1) * + (MMIO_DEVTAB_SIZE_UNIT / MMIO_DEVTAB_ENTRY_SIZE); + break; + case MMIO_COMMAND_BASE: + s->cmdbuf = (target_phys_addr_t) (val & MMIO_CMDBUF_BASE_MASK); + s->cmdbuf_len = 1UL << (s->mmio_buf[MMIO_CMDBUF_SIZE_BYTE] & + MMIO_CMDBUF_SIZE_MASK); + + /* We must reset the head and tail pointers. */ + s->cmdbuf_head = s->cmdbuf_tail = 0; + memset(s->mmio_buf + MMIO_COMMAND_HEAD, 0, 8); + memset(s->mmio_buf + MMIO_COMMAND_TAIL, 0, 8); + break; + case MMIO_COMMAND_HEAD: + s->cmdbuf_head = val & MMIO_CMDBUF_HEAD_MASK; + amd_iommu_cmdbuf_run(s); + break; + case MMIO_COMMAND_TAIL: + s->cmdbuf_tail = val & MMIO_CMDBUF_TAIL_MASK; + amd_iommu_cmdbuf_run(s); + break; + case MMIO_EVENT_BASE: + s->evtlog = (target_phys_addr_t) (val & MMIO_EVTLOG_BASE_MASK); + s->evtlog_len = 1UL << (s->mmio_buf[MMIO_EVTLOG_SIZE_BYTE] & + MMIO_EVTLOG_SIZE_MASK); + break; + case MMIO_EVENT_HEAD: + s->evtlog_head = val & MMIO_EVTLOG_HEAD_MASK; + break; + case MMIO_EVENT_TAIL: + s->evtlog_tail = val & MMIO_EVTLOG_TAIL_MASK; + break; + case MMIO_EXCL_BASE: + s->excl_base = (target_phys_addr_t) (val & MMIO_EXCL_BASE_MASK); + s->excl_enabled = val & MMIO_EXCL_ENABLED_MASK; + s->excl_allow = val & MMIO_EXCL_ALLOW_MASK; + break; + case MMIO_EXCL_LIMIT: + s->excl_limit = (target_phys_addr_t) ((val & MMIO_EXCL_LIMIT_MASK) | + MMIO_EXCL_LIMIT_LOW); + break; + default: + break; + } +} + +static uint32_t amd_iommu_mmio_readb(void *opaque, target_phys_addr_t addr) +{ + AMDIOMMUState *s = opaque; + + return amd_iommu_mmio_buf_read(s, 
addr, 1); +} + +static uint32_t amd_iommu_mmio_readw(void *opaque, target_phys_addr_t addr) +{ + AMDIOMMUState *s = opaque; + + return amd_iommu_mmio_buf_read(s, addr, 2); +} + +static uint32_t amd_iommu_mmio_readl(void *opaque, target_phys_addr_t addr) +{ + AMDIOMMUState *s = opaque; + + return amd_iommu_mmio_buf_read(s, addr, 4); +} + +static void amd_iommu_mmio_writeb(void *opaque, + target_phys_addr_t addr, + uint32_t val) +{ + AMDIOMMUState *s = opaque; + + amd_iommu_mmio_buf_write(s, addr, 1, val); + amd_iommu_update_mmio(s, addr); +} + +static void amd_iommu_mmio_writew(void *opaque, + target_phys_addr_t addr, + uint32_t val) +{ + AMDIOMMUState *s = opaque; + + amd_iommu_mmio_buf_write(s, addr, 2, val); + amd_iommu_update_mmio(s, addr); +} + +static void amd_iommu_mmio_writel(void *opaque, + target_phys_addr_t addr, + uint32_t val) +{ + AMDIOMMUState *s = opaque; + + amd_iommu_mmio_buf_write(s, addr, 4, val); + amd_iommu_update_mmio(s, addr); +} + +static CPUReadMemoryFunc * const amd_iommu_mmio_read[] = { + amd_iommu_mmio_readb, + amd_iommu_mmio_readw, + amd_iommu_mmio_readl, +}; + +static CPUWriteMemoryFunc * const amd_iommu_mmio_write[] = { + amd_iommu_mmio_writeb, + amd_iommu_mmio_writew, + amd_iommu_mmio_writel, +}; + +static void amd_iommu_enable_mmio(AMDIOMMUState *s) +{ + target_phys_addr_t addr; + uint8_t *capab_wmask = s->dev.wmask + s->capab_offset; + + s->mmio_index = cpu_register_io_memory(amd_iommu_mmio_read, + amd_iommu_mmio_write, + s, DEVICE_LITTLE_ENDIAN); + if (s->mmio_index < 0) { + return; + } + + addr = le64_to_cpu(*(uint64_t *) &s->capab[CAPAB_BAR_LOW]) & CAPAB_BAR_MASK; + cpu_register_physical_memory(addr, MMIO_SIZE, s->mmio_index); + + s->mmio_addr = addr; + s->mmio_enabled = 1; + + /* Further changes to the capability are prohibited. 
*/
+    memset(capab_wmask + CAPAB_BAR_LOW, 0x00, CAPAB_REG_SIZE);
+    memset(capab_wmask + CAPAB_BAR_HIGH, 0x00, CAPAB_REG_SIZE);
+}
+
+static void amd_iommu_write_capab(PCIDevice *dev,
+                                  uint32_t addr, uint32_t val, int len)
+{
+    AMDIOMMUState *s = DO_UPCAST(AMDIOMMUState, dev, dev);
+
+    pci_default_write_config(dev, addr, val, len);
+
+    if (!s->mmio_enabled && s->capab[CAPAB_BAR_LOW] & 0x1) {
+        amd_iommu_enable_mmio(s);
+    }
+}
+
+static void amd_iommu_reset(DeviceState *dev)
+{
+    AMDIOMMUState *s = DO_UPCAST(AMDIOMMUState, dev.qdev, dev);
+    unsigned char *capab = s->capab;
+    uint8_t *capab_wmask = s->dev.wmask + s->capab_offset;
+
+    s->enabled = 0;
+    s->ats_enabled = 0;
+    s->mmio_enabled = 0;
+
+    capab[CAPAB_REV_TYPE] = CAPAB_REV_TYPE;
+    capab[CAPAB_FLAGS] = CAPAB_FLAGS;
+    capab[CAPAB_BAR_LOW] = 0;
+    capab[CAPAB_BAR_HIGH] = 0;
+    capab[CAPAB_RANGE] = 0;
+    *((uint32_t *) &capab[CAPAB_MISC]) = cpu_to_le32(CAPAB_INIT_MISC);
+
+    /* Changes to the capability are allowed after system reset. */
+    memset(capab_wmask + CAPAB_BAR_LOW, 0xFF, CAPAB_REG_SIZE);
+    memset(capab_wmask + CAPAB_BAR_HIGH, 0xFF, CAPAB_REG_SIZE);
+
+    memset(s->mmio_buf, 0, MMIO_SIZE);
+    s->mmio_buf[MMIO_CMDBUF_SIZE_BYTE] = MMIO_CMDBUF_DEFAULT_SIZE;
+    s->mmio_buf[MMIO_EVTLOG_SIZE_BYTE] = MMIO_EVTLOG_DEFAULT_SIZE;
+}
+
+static void amd_iommu_log_event(AMDIOMMUState *s, AMDIOMMUEvent *evt)
+{
+    if (!s->evtlog_enabled ||
+        (s->mmio_buf[MMIO_STATUS] & MMIO_STATUS_EVTLOG_OF)) {
+        return;
+    }
+
+    if (s->evtlog_tail >= s->evtlog_len * EVENT_LEN) {
+        s->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_OF;
+    }
+
+    cpu_physical_memory_write(s->evtlog + s->evtlog_tail,
+                              (uint8_t *) evt, EVENT_LEN);
+
+    s->evtlog_tail += EVENT_LEN;
+    s->mmio_buf[MMIO_STATUS] |= MMIO_STATUS_EVTLOG_INTR;
+}
+
+static void amd_iommu_page_fault(AMDIOMMUState *s,
+                                 int devfn,
+                                 unsigned domid,
+                                 target_phys_addr_t addr,
+                                 int present,
+                                 int is_write)
+{
+    AMDIOMMUEvent evt;
+    unsigned info;
+    uint16_t status;
+
+    evt.devfn = cpu_to_le16(devfn);
+
evt.reserved = 0; + evt.domid = cpu_to_le16(domid); + evt.addr = cpu_to_le64(addr); + + info = EVENT_IOPF; + if (present) { + info |= EVENT_IOPF_PR; + } + if (is_write) { + info |= EVENT_IOPF_RW; + } + evt.info = cpu_to_le16(info); + + amd_iommu_log_event(s, &evt); + + /* + * Signal a target abort. + * + * FIXME: There should be a way to turn this off when acked. + */ + status = pci_get_word(s->dev.config + PCI_STATUS); + pci_set_word(s->dev.config + PCI_STATUS, + status | PCI_STATUS_SIG_TARGET_ABORT); +} + +static inline uint64_t amd_iommu_get_perms(uint64_t entry) +{ + return (entry & (DEV_PERM_READ | DEV_PERM_WRITE)) >> DEV_PERM_SHIFT; +} + +static inline AMDIOMMUState *amd_iommu_dma_to_state(DMADevice *dev) +{ + PCIDevice *pci_dev = DO_UPCAST(PCIDevice, qdev, dev->mmu->iommu); + + return DO_UPCAST(AMDIOMMUState, dev, pci_dev); +} + +static int amd_iommu_translate(DMADevice *dev, + dma_addr_t addr, + dma_addr_t *paddr, + dma_addr_t *len, + int is_write) +{ + PCIDevice *pci_dev = container_of(dev, PCIDevice, dma); + PCIDevice *iommu_dev = DO_UPCAST(PCIDevice, qdev, dev->mmu->iommu); + AMDIOMMUState *s = DO_UPCAST(AMDIOMMUState, dev, iommu_dev); + int devfn, present; + target_phys_addr_t entry_addr, pte_addr; + uint64_t entry[4], pte, page_offset, pte_perms; + unsigned level, domid; + unsigned perms; + + if (!s->enabled) { + goto no_translation; + } + + /* + * It's okay to check for either read or write permissions + * even for memory maps, since we don't support R/W maps. + */ + perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ; + + /* Get device table entry. 
*/ + devfn = pci_dev->devfn; + entry_addr = s->devtab + devfn * DEVTAB_ENTRY_SIZE; + cpu_physical_memory_read(entry_addr, (uint8_t *) entry, 32); + + pte = entry[0]; + if (!(pte & DEV_VALID) || !(pte & DEV_TRANSLATION_VALID)) { + goto no_translation; + } + domid = entry[1] & DEV_DOMAIN_ID_MASK; + level = (pte >> DEV_MODE_RSHIFT) & DEV_MODE_MASK; + while (level > 0) { + /* + * Check permissions: the bitwise + * implication perms -> entry_perms must be true. + */ + pte_perms = amd_iommu_get_perms(pte); + present = pte & 1; + if (!present || perms != (perms & pte_perms)) { + amd_iommu_page_fault(s, devfn, domid, addr, + present, !!(perms & IOMMU_PERM_WRITE)); + return -EPERM; + } + + /* Go to the next lower level. */ + pte_addr = pte & DEV_PT_ROOT_MASK; + pte_addr += ((addr >> (3 + 9 * level)) & 0x1FF) << 3; + pte = ldq_phys(pte_addr); + level = (pte >> DEV_MODE_RSHIFT) & DEV_MODE_MASK; + } + page_offset = addr & 4095; + *paddr = (pte & DEV_PT_ROOT_MASK) + page_offset; + *len = 4096 - page_offset; + + return 0; + +no_translation: + *paddr = addr; + *len = -1; + return 0; +} + +static int amd_iommu_pci_initfn(PCIDevice *dev) +{ + AMDIOMMUState *s = DO_UPCAST(AMDIOMMUState, dev, dev); + + pci_config_set_vendor_id(s->dev.config, PCI_VENDOR_ID_AMD); + pci_config_set_device_id(s->dev.config, PCI_DEVICE_ID_AMD_IOMMU); + pci_config_set_class(s->dev.config, PCI_CLASS_SYSTEM_IOMMU); + + /* Secure Device capability */ + s->capab_offset = pci_add_capability(&s->dev, + PCI_CAP_ID_SEC, 0, CAPAB_SIZE); + s->capab = s->dev.config + s->capab_offset; + dev->config_write = amd_iommu_write_capab; + + /* Allocate backing space for the MMIO registers. 
*/ + s->mmio_buf = qemu_malloc(MMIO_SIZE); + + pci_register_iommu(dev, amd_iommu_translate); + + return 0; +} + +static const VMStateDescription vmstate_amd_iommu = { + .name = "amd-iommu", + .version_id = 1, + .minimum_version_id = 1, + .minimum_version_id_old = 1, + .fields = (VMStateField []) { + VMSTATE_PCI_DEVICE(dev, AMDIOMMUState), + VMSTATE_END_OF_LIST() + } +}; + +static PCIDeviceInfo amd_iommu_pci_info = { + .qdev.name = "amd-iommu", + .qdev.desc = "AMD IOMMU", + .qdev.size = sizeof(AMDIOMMUState), + .qdev.reset = amd_iommu_reset, + .qdev.vmsd = &vmstate_amd_iommu, + .init = amd_iommu_pci_initfn, +}; + +static void amd_iommu_register(void) +{ + pci_qdev_register(&amd_iommu_pci_info); +} + +device_init(amd_iommu_register); diff --git a/hw/pc.c b/hw/pc.c index 6939c04..b93949c 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -1161,6 +1161,8 @@ void pc_pci_device_init(PCIBus *pci_bus) int max_bus; int bus;
+ pci_create_simple(pci_bus, -1, "amd-iommu"); + max_bus = drive_get_max_bus(IF_SCSI); for (bus = 0; bus <= max_bus; bus++) { pci_create_simple(pci_bus, -1, "lsi53c895a");
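As an aside, the page-table walk in amd_iommu_translate() keeps its address arithmetic inline; the self-contained sketch below (hypothetical helper names, not part of the patch) restates it: each level consumes 9 bits of the DMA address starting at bit 12, and the low 12 bits pass through untranslated as the page offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical standalone restatement of the index arithmetic used by
 * amd_iommu_translate(): the emulation computes the PTE byte offset as
 * ((addr >> (3 + 9 * level)) & 0x1FF) << 3, i.e. a 9-bit table index
 * times the 8-byte PTE size. Level 1 covers address bits 12..20,
 * level 2 bits 21..29, and so on. */
static uint64_t pte_index(uint64_t addr, unsigned level)
{
    /* 3 + 9 * level is the same shift as 12 + 9 * (level - 1) */
    return (addr >> (3 + 9 * level)) & 0x1FF;
}

static uint64_t page_offset(uint64_t addr)
{
    return addr & 4095;     /* bits 0..11 are never translated */
}
```

This matches the 4 KiB granularity of the translate function's result, where the returned length is `4096 - page_offset`.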
Emulated PCI IDE controllers now use the generic DMA memory access interface. This also allows an emulated IOMMU to translate and check their accesses.
Map invalidation cancels the affected in-flight DMA transfers. Since the guest OS cannot reliably recover DMA results once a mapping has changed under it, cancellation is a fair approximation of real hardware behaviour.
Note this doesn't handle AHCI emulation yet!
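The cancellation contract can be illustrated with a small self-contained model. All names below are hypothetical stand-ins; only the idea mirrors what dma_memory_map()/dma_invalidate_memory_range() and dma_bdrv_cancel() do in the patch: a live mapping carries a cancel callback, and invalidating any overlapping range fires it exactly once.

```c
#include <assert.h>
#include <stdint.h>

typedef void (*map_cancel_fn)(void *opaque);

/* One in-flight mapping with its cancellation hook (hypothetical). */
struct dma_map {
    uint64_t addr, len;
    map_cancel_fn cancel;
    void *opaque;
    int live;
};

static int ranges_overlap(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

static void dma_map_invalidate_range(struct dma_map *m, uint64_t addr, uint64_t len)
{
    if (m->live && ranges_overlap(m->addr, m->len, addr, len)) {
        m->live = 0;            /* never fire the callback twice */
        m->cancel(m->opaque);
    }
}

static void bump(void *opaque)
{
    ++*(int *)opaque;
}

static int invalidate_selftest(void)
{
    int fired = 0;
    struct dma_map m = { 0x1000, 0x200, bump, &fired, 1 };

    dma_map_invalidate_range(&m, 0x0, 0x1000);    /* disjoint: no cancel */
    dma_map_invalidate_range(&m, 0x11ff, 1);      /* overlaps last byte: cancel */
    dma_map_invalidate_range(&m, 0x1000, 0x200);  /* already dead: no-op */
    return fired;
}
```

The IOMMU's CMD_INVAL_IOTLB_PAGES handler uses the degenerate range (0, -1) to cancel everything for a device.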
Signed-off-by: Eduard - Gabriel Munteanu eduard.munteanu@linux360.ro --- dma-helpers.c | 23 ++++++++++++++++++----- dma.h | 4 +++- hw/ide/ahci.c | 3 ++- hw/ide/internal.h | 1 + hw/ide/macio.c | 4 ++-- hw/ide/pci.c | 18 +++++++++++------- 6 files changed, 37 insertions(+), 16 deletions(-)
diff --git a/dma-helpers.c b/dma-helpers.c index 712ed89..29a74a4 100644 --- a/dma-helpers.c +++ b/dma-helpers.c @@ -10,12 +10,13 @@ #include "dma.h" #include "block_int.h"
-void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint) +void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, DMADevice *dma) { qsg->sg = qemu_malloc(alloc_hint * sizeof(ScatterGatherEntry)); qsg->nsg = 0; qsg->nalloc = alloc_hint; qsg->size = 0; + qsg->dma = dma; }
void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base, @@ -73,12 +74,23 @@ static void dma_bdrv_unmap(DMAAIOCB *dbs) int i;
for (i = 0; i < dbs->iov.niov; ++i) { - cpu_physical_memory_unmap(dbs->iov.iov[i].iov_base, - dbs->iov.iov[i].iov_len, !dbs->is_write, - dbs->iov.iov[i].iov_len); + dma_memory_unmap(dbs->sg->dma, + dbs->iov.iov[i].iov_base, + dbs->iov.iov[i].iov_len, !dbs->is_write, + dbs->iov.iov[i].iov_len); } }
+static void dma_bdrv_cancel(void *opaque) +{ + DMAAIOCB *dbs = opaque; + + bdrv_aio_cancel(dbs->acb); + dma_bdrv_unmap(dbs); + qemu_iovec_destroy(&dbs->iov); + qemu_aio_release(dbs); +} + static void dma_bdrv_cb(void *opaque, int ret) { DMAAIOCB *dbs = (DMAAIOCB *)opaque; @@ -100,7 +112,8 @@ static void dma_bdrv_cb(void *opaque, int ret) while (dbs->sg_cur_index < dbs->sg->nsg) { cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte; cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte; - mem = cpu_physical_memory_map(cur_addr, &cur_len, !dbs->is_write); + mem = dma_memory_map(dbs->sg->dma, dma_bdrv_cancel, dbs, + cur_addr, &cur_len, !dbs->is_write); if (!mem) break; qemu_iovec_add(&dbs->iov, mem, cur_len); diff --git a/dma.h b/dma.h index f3bb275..2417b32 100644 --- a/dma.h +++ b/dma.h @@ -14,6 +14,7 @@ //#include "cpu.h" #include "hw/hw.h" #include "block.h" +#include "hw/dma_rw.h"
typedef struct { target_phys_addr_t base; @@ -25,9 +26,10 @@ typedef struct { int nsg; int nalloc; target_phys_addr_t size; + DMADevice *dma; } QEMUSGList;
-void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint); +void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, DMADevice *dma); void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base, target_phys_addr_t len); void qemu_sglist_destroy(QEMUSGList *qsg); diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index c6e0c77..68b87d2 100644 --- a/hw/ide/ahci.c +++ b/hw/ide/ahci.c @@ -680,7 +680,8 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist) if (sglist_alloc_hint > 0) { AHCI_SG *tbl = (AHCI_SG *)prdt;
- qemu_sglist_init(sglist, sglist_alloc_hint); + /* FIXME: pass a proper DMADevice. */ + qemu_sglist_init(sglist, sglist_alloc_hint, NULL); for (i = 0; i < sglist_alloc_hint; i++) { /* flags_size is zero-based */ qemu_sglist_add(sglist, le64_to_cpu(tbl[i].addr), diff --git a/hw/ide/internal.h b/hw/ide/internal.h index aa198b6..b830d67 100644 --- a/hw/ide/internal.h +++ b/hw/ide/internal.h @@ -474,6 +474,7 @@ struct IDEDMA { struct iovec iov; QEMUIOVector qiov; BlockDriverAIOCB *aiocb; + DMADevice *dev; };
struct IDEBus { diff --git a/hw/ide/macio.c b/hw/ide/macio.c index 7107f6b..a111481 100644 --- a/hw/ide/macio.c +++ b/hw/ide/macio.c @@ -78,7 +78,7 @@ static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
s->io_buffer_size = io->len;
- qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1); + qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL); qemu_sglist_add(&s->sg, io->addr, io->len); io->addr += io->len; io->len = 0; @@ -140,7 +140,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret) s->io_buffer_index = 0; s->io_buffer_size = io->len;
- qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1); + qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL); qemu_sglist_add(&s->sg, io->addr, io->len); io->addr += io->len; io->len = 0; diff --git a/hw/ide/pci.c b/hw/ide/pci.c index 65cb56c..a14f2ae 100644 --- a/hw/ide/pci.c +++ b/hw/ide/pci.c @@ -63,7 +63,8 @@ static int bmdma_prepare_buf(IDEDMA *dma, int is_write) } prd; int l, len;
- qemu_sglist_init(&s->sg, s->nsector / (BMDMA_PAGE_SIZE / 512) + 1); + qemu_sglist_init(&s->sg, + s->nsector / (BMDMA_PAGE_SIZE / 512) + 1, dma->dev); s->io_buffer_size = 0; for(;;) { if (bm->cur_prd_len == 0) { @@ -71,7 +72,7 @@ static int bmdma_prepare_buf(IDEDMA *dma, int is_write) if (bm->cur_prd_last || (bm->cur_addr - bm->addr) >= BMDMA_PAGE_SIZE) return s->io_buffer_size != 0; - cpu_physical_memory_read(bm->cur_addr, (uint8_t *)&prd, 8); + dma_memory_read(dma->dev, bm->cur_addr, (uint8_t *)&prd, 8); bm->cur_addr += 8; prd.addr = le32_to_cpu(prd.addr); prd.size = le32_to_cpu(prd.size); @@ -113,7 +114,7 @@ static int bmdma_rw_buf(IDEDMA *dma, int is_write) if (bm->cur_prd_last || (bm->cur_addr - bm->addr) >= BMDMA_PAGE_SIZE) return 0; - cpu_physical_memory_read(bm->cur_addr, (uint8_t *)&prd, 8); + dma_memory_read(dma->dev, bm->cur_addr, (uint8_t *)&prd, 8); bm->cur_addr += 8; prd.addr = le32_to_cpu(prd.addr); prd.size = le32_to_cpu(prd.size); @@ -128,11 +129,11 @@ static int bmdma_rw_buf(IDEDMA *dma, int is_write) l = bm->cur_prd_len; if (l > 0) { if (is_write) { - cpu_physical_memory_write(bm->cur_prd_addr, - s->io_buffer + s->io_buffer_index, l); + dma_memory_write(dma->dev, bm->cur_prd_addr, + s->io_buffer + s->io_buffer_index, l); } else { - cpu_physical_memory_read(bm->cur_prd_addr, - s->io_buffer + s->io_buffer_index, l); + dma_memory_read(dma->dev, bm->cur_prd_addr, + s->io_buffer + s->io_buffer_index, l); } bm->cur_prd_addr += l; bm->cur_prd_len -= l; @@ -436,6 +437,9 @@ void pci_ide_create_devs(PCIDevice *dev, DriveInfo **hd_table) continue; ide_create_drive(d->bus+bus[i], unit[i], hd_table[i]); } + + d->bmdma[0].dma.dev = &dev->dma; + d->bmdma[1].dma.dev = &dev->dma; }
static const struct IDEDMAOps bmdma_ops = {
This allows the device to work properly with an emulated IOMMU.
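The conversion relies on the pci_memory_*() clones mentioned in the cover letter. Below is a toy model of how such a clone behaves; PCIDevice, DMADevice and dma_memory_read() here are simplified stand-ins, not the patch's real definitions — only the pci_memory_read spelling matches.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t dma_addr_t;

typedef struct DMADevice {
    uint8_t *mem;               /* stand-in for translated guest memory */
} DMADevice;

typedef struct PCIDevice {
    DMADevice dma;
} PCIDevice;

/* Stand-in for the generic DMA layer; the real one goes through the IOMMU. */
static void dma_memory_read(DMADevice *dma, dma_addr_t addr,
                            void *buf, dma_addr_t len)
{
    memcpy(buf, dma->mem + addr, len);
}

/* The "macro magic": stamp out a PCI-flavoured clone of a DMA op. */
#define DEFINE_PCI_MEMORY_OP(op)                                        \
    static inline void pci_memory_##op(PCIDevice *dev, dma_addr_t addr, \
                                       void *buf, dma_addr_t len)       \
    {                                                                   \
        dma_memory_##op(&dev->dma, addr, buf, len);                     \
    }

DEFINE_PCI_MEMORY_OP(read)

static int pci_memory_selftest(void)
{
    uint8_t backing[4] = { 1, 2, 3, 4 };
    PCIDevice dev = { .dma = { backing } };
    uint8_t out[2] = { 0, 0 };

    pci_memory_read(&dev, 1, out, 2);
    return out[0] == 2 && out[1] == 3;
}
```

A device model then passes `&s->dev` everywhere it previously called cpu_physical_memory_*(), which is exactly the mechanical change this patch makes.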
Signed-off-by: Eduard - Gabriel Munteanu eduard.munteanu@linux360.ro --- hw/rtl8139.c | 98 ++++++++++++++++++++++++++++++--------------------------- 1 files changed, 52 insertions(+), 46 deletions(-)
diff --git a/hw/rtl8139.c b/hw/rtl8139.c index c7c7a3c..5b70391 100644 --- a/hw/rtl8139.c +++ b/hw/rtl8139.c @@ -424,12 +424,6 @@ typedef struct RTL8139TallyCounters uint16_t TxUndrn; } RTL8139TallyCounters;
-/* Clears all tally counters */ -static void RTL8139TallyCounters_clear(RTL8139TallyCounters* counters); - -/* Writes tally counters to specified physical memory address */ -static void RTL8139TallyCounters_physical_memory_write(target_phys_addr_t tc_addr, RTL8139TallyCounters* counters); - typedef struct RTL8139State { PCIDevice dev; uint8_t phys[8]; /* mac address */ @@ -510,6 +504,14 @@ typedef struct RTL8139State { int rtl8139_mmio_io_addr_dummy; } RTL8139State;
+/* Clears all tally counters */ +static void RTL8139TallyCounters_clear(RTL8139TallyCounters* counters); + +/* Writes tally counters to specified physical memory address */ +static void +RTL8139TallyCounters_physical_memory_write(RTL8139State *s, + target_phys_addr_t tc_addr); + static void rtl8139_set_next_tctr_time(RTL8139State *s, int64_t current_time);
static void prom9346_decode_command(EEprom9346 *eeprom, uint8_t command) @@ -771,15 +773,15 @@ static void rtl8139_write_buffer(RTL8139State *s, const void *buf, int size)
if (size > wrapped) { - cpu_physical_memory_write( s->RxBuf + s->RxBufAddr, - buf, size-wrapped ); + pci_memory_write(&s->dev, s->RxBuf + s->RxBufAddr, + buf, size-wrapped); }
/* reset buffer pointer */ s->RxBufAddr = 0;
- cpu_physical_memory_write( s->RxBuf + s->RxBufAddr, - buf + (size-wrapped), wrapped ); + pci_memory_write(&s->dev, s->RxBuf + s->RxBufAddr, + buf + (size-wrapped), wrapped);
s->RxBufAddr = wrapped;
@@ -788,7 +790,7 @@ static void rtl8139_write_buffer(RTL8139State *s, const void *buf, int size) }
/* non-wrapping path or overwrapping enabled */ - cpu_physical_memory_write( s->RxBuf + s->RxBufAddr, buf, size ); + pci_memory_write(&s->dev, s->RxBuf + s->RxBufAddr, buf, size);
s->RxBufAddr += size; } @@ -828,6 +830,7 @@ static int rtl8139_can_receive(VLANClientState *nc) static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_t size_, int do_interrupt) { RTL8139State *s = DO_UPCAST(NICState, nc, nc)->opaque; + /* size is the length of the buffer passed to the driver */ int size = size_; const uint8_t *dot1q_buf = NULL; @@ -988,13 +991,13 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_
uint32_t val, rxdw0,rxdw1,rxbufLO,rxbufHI;
- cpu_physical_memory_read(cplus_rx_ring_desc, (uint8_t *)&val, 4); + pci_memory_read(&s->dev, cplus_rx_ring_desc, (uint8_t *)&val, 4); rxdw0 = le32_to_cpu(val); - cpu_physical_memory_read(cplus_rx_ring_desc+4, (uint8_t *)&val, 4); + pci_memory_read(&s->dev, cplus_rx_ring_desc+4, (uint8_t *)&val, 4); rxdw1 = le32_to_cpu(val); - cpu_physical_memory_read(cplus_rx_ring_desc+8, (uint8_t *)&val, 4); + pci_memory_read(&s->dev, cplus_rx_ring_desc+8, (uint8_t *)&val, 4); rxbufLO = le32_to_cpu(val); - cpu_physical_memory_read(cplus_rx_ring_desc+12, (uint8_t *)&val, 4); + pci_memory_read(&s->dev, cplus_rx_ring_desc+12, (uint8_t *)&val, 4); rxbufHI = le32_to_cpu(val);
DPRINTF("+++ C+ mode RX descriptor %d %08x %08x %08x %08x\n", @@ -1062,12 +1065,12 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_
/* receive/copy to target memory */ if (dot1q_buf) { - cpu_physical_memory_write(rx_addr, buf, 2 * ETHER_ADDR_LEN); - cpu_physical_memory_write(rx_addr + 2 * ETHER_ADDR_LEN, - buf + 2 * ETHER_ADDR_LEN + VLAN_HLEN, - size - 2 * ETHER_ADDR_LEN); + pci_memory_write(&s->dev, rx_addr, buf, 2 * ETHER_ADDR_LEN); + pci_memory_write(&s->dev, rx_addr + 2 * ETHER_ADDR_LEN, + buf + 2 * ETHER_ADDR_LEN + VLAN_HLEN, + size - 2 * ETHER_ADDR_LEN); } else { - cpu_physical_memory_write(rx_addr, buf, size); + pci_memory_write(&s->dev, rx_addr, buf, size); }
if (s->CpCmd & CPlusRxChkSum) @@ -1077,7 +1080,7 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_
/* write checksum */ val = cpu_to_le32(crc32(0, buf, size_)); - cpu_physical_memory_write( rx_addr+size, (uint8_t *)&val, 4); + pci_memory_write(&s->dev, rx_addr + size, (uint8_t *)&val, 4);
/* first segment of received packet flag */ #define CP_RX_STATUS_FS (1<<29) @@ -1123,9 +1126,9 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_
/* update ring data */ val = cpu_to_le32(rxdw0); - cpu_physical_memory_write(cplus_rx_ring_desc, (uint8_t *)&val, 4); + pci_memory_write(&s->dev, cplus_rx_ring_desc, (uint8_t *)&val, 4); val = cpu_to_le32(rxdw1); - cpu_physical_memory_write(cplus_rx_ring_desc+4, (uint8_t *)&val, 4); + pci_memory_write(&s->dev, cplus_rx_ring_desc+4, (uint8_t *)&val, 4);
/* update tally counter */ ++s->tally_counters.RxOk; @@ -1305,50 +1308,53 @@ static void RTL8139TallyCounters_clear(RTL8139TallyCounters* counters) counters->TxUndrn = 0; }
-static void RTL8139TallyCounters_physical_memory_write(target_phys_addr_t tc_addr, RTL8139TallyCounters* tally_counters) +static void +RTL8139TallyCounters_physical_memory_write(RTL8139State *s, + target_phys_addr_t tc_addr) { + RTL8139TallyCounters *tally_counters = &s->tally_counters; uint16_t val16; uint32_t val32; uint64_t val64;
val64 = cpu_to_le64(tally_counters->TxOk); - cpu_physical_memory_write(tc_addr + 0, (uint8_t *)&val64, 8); + pci_memory_write(&s->dev, tc_addr + 0, (uint8_t *)&val64, 8);
val64 = cpu_to_le64(tally_counters->RxOk); - cpu_physical_memory_write(tc_addr + 8, (uint8_t *)&val64, 8); + pci_memory_write(&s->dev, tc_addr + 8, (uint8_t *)&val64, 8);
val64 = cpu_to_le64(tally_counters->TxERR); - cpu_physical_memory_write(tc_addr + 16, (uint8_t *)&val64, 8); + pci_memory_write(&s->dev, tc_addr + 16, (uint8_t *)&val64, 8);
val32 = cpu_to_le32(tally_counters->RxERR); - cpu_physical_memory_write(tc_addr + 24, (uint8_t *)&val32, 4); + pci_memory_write(&s->dev, tc_addr + 24, (uint8_t *)&val32, 4);
val16 = cpu_to_le16(tally_counters->MissPkt); - cpu_physical_memory_write(tc_addr + 28, (uint8_t *)&val16, 2); + pci_memory_write(&s->dev, tc_addr + 28, (uint8_t *)&val16, 2);
val16 = cpu_to_le16(tally_counters->FAE); - cpu_physical_memory_write(tc_addr + 30, (uint8_t *)&val16, 2); + pci_memory_write(&s->dev, tc_addr + 30, (uint8_t *)&val16, 2);
val32 = cpu_to_le32(tally_counters->Tx1Col); - cpu_physical_memory_write(tc_addr + 32, (uint8_t *)&val32, 4); + pci_memory_write(&s->dev, tc_addr + 32, (uint8_t *)&val32, 4);
val32 = cpu_to_le32(tally_counters->TxMCol); - cpu_physical_memory_write(tc_addr + 36, (uint8_t *)&val32, 4); + pci_memory_write(&s->dev, tc_addr + 36, (uint8_t *)&val32, 4);
val64 = cpu_to_le64(tally_counters->RxOkPhy); - cpu_physical_memory_write(tc_addr + 40, (uint8_t *)&val64, 8); + pci_memory_write(&s->dev, tc_addr + 40, (uint8_t *)&val64, 8);
val64 = cpu_to_le64(tally_counters->RxOkBrd); - cpu_physical_memory_write(tc_addr + 48, (uint8_t *)&val64, 8); + pci_memory_write(&s->dev, tc_addr + 48, (uint8_t *)&val64, 8);
val32 = cpu_to_le32(tally_counters->RxOkMul); - cpu_physical_memory_write(tc_addr + 56, (uint8_t *)&val32, 4); + pci_memory_write(&s->dev, tc_addr + 56, (uint8_t *)&val32, 4);
val16 = cpu_to_le16(tally_counters->TxAbt); - cpu_physical_memory_write(tc_addr + 60, (uint8_t *)&val16, 2); + pci_memory_write(&s->dev, tc_addr + 60, (uint8_t *)&val16, 2);
val16 = cpu_to_le16(tally_counters->TxUndrn); - cpu_physical_memory_write(tc_addr + 62, (uint8_t *)&val16, 2); + pci_memory_write(&s->dev, tc_addr + 62, (uint8_t *)&val16, 2); }
/* Loads values of tally counters from VM state file */ @@ -1836,7 +1842,7 @@ static int rtl8139_transmit_one(RTL8139State *s, int descriptor) DPRINTF("+++ transmit reading %d bytes from host memory at 0x%08x\n", txsize, s->TxAddr[descriptor]);
- cpu_physical_memory_read(s->TxAddr[descriptor], txbuffer, txsize); + pci_memory_read(&s->dev, s->TxAddr[descriptor], txbuffer, txsize);
/* Mark descriptor as transferred */ s->TxStatus[descriptor] |= TxHostOwns; @@ -1969,13 +1975,13 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
uint32_t val, txdw0,txdw1,txbufLO,txbufHI;
- cpu_physical_memory_read(cplus_tx_ring_desc, (uint8_t *)&val, 4); + pci_memory_read(&s->dev, cplus_tx_ring_desc, (uint8_t *)&val, 4); txdw0 = le32_to_cpu(val); - cpu_physical_memory_read(cplus_tx_ring_desc+4, (uint8_t *)&val, 4); + pci_memory_read(&s->dev, cplus_tx_ring_desc+4, (uint8_t *)&val, 4); txdw1 = le32_to_cpu(val); - cpu_physical_memory_read(cplus_tx_ring_desc+8, (uint8_t *)&val, 4); + pci_memory_read(&s->dev, cplus_tx_ring_desc+8, (uint8_t *)&val, 4); txbufLO = le32_to_cpu(val); - cpu_physical_memory_read(cplus_tx_ring_desc+12, (uint8_t *)&val, 4); + pci_memory_read(&s->dev, cplus_tx_ring_desc+12, (uint8_t *)&val, 4); txbufHI = le32_to_cpu(val);
DPRINTF("+++ C+ mode TX descriptor %d %08x %08x %08x %08x\n", descriptor, @@ -2083,7 +2089,8 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s) TARGET_FMT_plx" to offset %d\n", txsize, tx_addr, s->cplus_txbuffer_offset);
- cpu_physical_memory_read(tx_addr, s->cplus_txbuffer + s->cplus_txbuffer_offset, txsize); + pci_memory_read(&s->dev, tx_addr, + s->cplus_txbuffer + s->cplus_txbuffer_offset, txsize); s->cplus_txbuffer_offset += txsize;
/* seek to next Rx descriptor */ @@ -2110,7 +2117,7 @@ static int rtl8139_cplus_transmit_one(RTL8139State *s)
/* update ring data */ val = cpu_to_le32(txdw0); - cpu_physical_memory_write(cplus_tx_ring_desc, (uint8_t *)&val, 4); + pci_memory_write(&s->dev, cplus_tx_ring_desc, (uint8_t *)&val, 4);
/* Now decide if descriptor being processed is holding the last segment of packet */ if (txdw0 & CP_TX_LS) @@ -2451,7 +2458,6 @@ static void rtl8139_transmit(RTL8139State *s)
static void rtl8139_TxStatus_write(RTL8139State *s, uint32_t txRegOffset, uint32_t val) { - int descriptor = txRegOffset/4;
/* handle C+ transmit mode register configuration */ @@ -2469,7 +2475,7 @@ static void rtl8139_TxStatus_write(RTL8139State *s, uint32_t txRegOffset, uint32 target_phys_addr_t tc_addr = rtl8139_addr64(s->TxStatus[0] & ~0x3f, s->TxStatus[1]);
/* dump tally counters to specified memory location */ - RTL8139TallyCounters_physical_memory_write( tc_addr, &s->tally_counters); + RTL8139TallyCounters_physical_memory_write(s, tc_addr);
/* mark dump completed */ s->TxStatus[0] &= ~0x8;
This allows the device to work properly with an emulated IOMMU.
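The e100_ldw_le_phys()/e100_stw_le_phys() helpers this patch touches encode a simple contract: values cross the bus little-endian regardless of host endianness, which the cpu_to_le*()/le*_to_cpu() pairing guarantees. A hypothetical standalone version of that contract, for reference:

```c
#include <assert.h>
#include <stdint.h>

/* Byte-wise little-endian load/store, host-endianness independent
 * (illustrative names; not the patch's helpers). */
static uint16_t ldw_le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static void stw_le(uint8_t *p, uint16_t v)
{
    p[0] = v & 0xFF;
    p[1] = v >> 8;
}

static int le_roundtrip_ok(void)
{
    uint8_t b[2];

    stw_le(b, 0xBEEF);
    return b[0] == 0xEF && b[1] == 0xBE && ldw_le(b) == 0xBEEF;
}
```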
Signed-off-by: Eduard - Gabriel Munteanu eduard.munteanu@linux360.ro --- hw/eepro100.c | 95 ++++++++++++++++++++++++++++++-------------------------- 1 files changed, 51 insertions(+), 44 deletions(-)
diff --git a/hw/eepro100.c b/hw/eepro100.c index 05450e8..cd83da3 100644 --- a/hw/eepro100.c +++ b/hw/eepro100.c @@ -317,35 +317,39 @@ static const uint16_t eepro100_mdi_mask[] = { };
 /* Read a 16 bit little endian value from physical memory. */
-static uint16_t e100_ldw_le_phys(target_phys_addr_t addr)
+static uint16_t e100_ldw_le_phys(EEPRO100State *s, target_phys_addr_t addr)
 {
     /* Load 16 bit (little endian) word from emulated hardware. */
     uint16_t val;
-    cpu_physical_memory_read(addr, &val, sizeof(val));
+    pci_memory_read(&s->dev, addr, &val, sizeof(val));
     return le16_to_cpu(val);
 }

 /* Read a 32 bit little endian value from physical memory. */
-static uint32_t e100_ldl_le_phys(target_phys_addr_t addr)
+static uint32_t e100_ldl_le_phys(EEPRO100State *s, target_phys_addr_t addr)
 {
     /* Load 32 bit (little endian) word from emulated hardware. */
     uint32_t val;
-    cpu_physical_memory_read(addr, &val, sizeof(val));
+    pci_memory_read(&s->dev, addr, &val, sizeof(val));
     return le32_to_cpu(val);
 }

 /* Write a 16 bit little endian value to physical memory. */
-static void e100_stw_le_phys(target_phys_addr_t addr, uint16_t val)
+static void e100_stw_le_phys(EEPRO100State *s,
+                             target_phys_addr_t addr,
+                             uint16_t val)
 {
     val = cpu_to_le16(val);
-    cpu_physical_memory_write(addr, &val, sizeof(val));
+    pci_memory_write(&s->dev, addr, &val, sizeof(val));
 }

 /* Write a 32 bit little endian value to physical memory. */
-static void e100_stl_le_phys(target_phys_addr_t addr, uint32_t val)
+static void e100_stl_le_phys(EEPRO100State *s,
+                             target_phys_addr_t addr,
+                             uint32_t val)
 {
     val = cpu_to_le32(val);
-    cpu_physical_memory_write(addr, &val, sizeof(val));
+    pci_memory_write(&s->dev, addr, &val, sizeof(val));
 }
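The converted helpers keep two concerns separate: the device-aware accessor moves raw bytes, and the endian conversion happens in a second step. Below is a host-independent model of that pattern; it uses a plain buffer instead of pci_memory_read() and open-codes le16_to_cpu(), so all names are illustrative rather than the eepro100 code itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint8_t phys[64];   /* stand-in for guest physical memory */

/* Explicit byte-order fixup that works on any host; models le16_to_cpu().
 * The first byte in memory is the low byte of a little-endian value. */
static uint16_t le16_to_cpu_model(uint16_t raw)
{
    const uint8_t *p = (const uint8_t *)&raw;
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Models e100_ldw_le_phys(): raw bytes first, conversion second. */
static uint16_t ldw_le_phys(uint32_t addr)
{
    uint16_t val;
    memcpy(&val, &phys[addr], sizeof(val));   /* pci_memory_read() stand-in */
    return le16_to_cpu_model(val);
}
```

Doing the conversion after the copy is what lets the series drop the unaligned-access helpers mentioned in the cover letter: the DMA layer only ever sees opaque byte buffers.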
 #define POLYNOMIAL 0x04c11db6
@@ -757,11 +761,11 @@ static void dump_statistics(EEPRO100State * s)
      * values which really matter.
      * Number of data should check configuration!!! */
-    cpu_physical_memory_write(s->statsaddr, &s->statistics, s->stats_size);
-    e100_stl_le_phys(s->statsaddr + 0, s->statistics.tx_good_frames);
-    e100_stl_le_phys(s->statsaddr + 36, s->statistics.rx_good_frames);
-    e100_stl_le_phys(s->statsaddr + 48, s->statistics.rx_resource_errors);
-    e100_stl_le_phys(s->statsaddr + 60, s->statistics.rx_short_frame_errors);
+    pci_memory_write(&s->dev, s->statsaddr, &s->statistics, s->stats_size);
+    e100_stl_le_phys(s, s->statsaddr + 0, s->statistics.tx_good_frames);
+    e100_stl_le_phys(s, s->statsaddr + 36, s->statistics.rx_good_frames);
+    e100_stl_le_phys(s, s->statsaddr + 48, s->statistics.rx_resource_errors);
+    e100_stl_le_phys(s, s->statsaddr + 60, s->statistics.rx_short_frame_errors);
 #if 0
     e100_stw_le_phys(s->statsaddr + 76, s->statistics.xmt_tco_frames);
     e100_stw_le_phys(s->statsaddr + 78, s->statistics.rcv_tco_frames);
@@ -771,7 +775,7 @@ static void dump_statistics(EEPRO100State * s)
 static void read_cb(EEPRO100State *s)
 {
-    cpu_physical_memory_read(s->cb_address, &s->tx, sizeof(s->tx));
+    pci_memory_read(&s->dev, s->cb_address, &s->tx, sizeof(s->tx));
     s->tx.status = le16_to_cpu(s->tx.status);
     s->tx.command = le16_to_cpu(s->tx.command);
     s->tx.link = le32_to_cpu(s->tx.link);
@@ -801,18 +805,18 @@ static void tx_command(EEPRO100State *s)
     }
     assert(tcb_bytes <= sizeof(buf));
     while (size < tcb_bytes) {
-        uint32_t tx_buffer_address = e100_ldl_le_phys(tbd_address);
-        uint16_t tx_buffer_size = e100_ldw_le_phys(tbd_address + 4);
+        uint32_t tx_buffer_address = e100_ldl_le_phys(s, tbd_address);
+        uint16_t tx_buffer_size = e100_ldw_le_phys(s, tbd_address + 4);
 #if 0
-        uint16_t tx_buffer_el = e100_ldw_le_phys(tbd_address + 6);
+        uint16_t tx_buffer_el = e100_ldw_le_phys(s, tbd_address + 6);
 #endif
         tbd_address += 8;
         TRACE(RXTX, logout
             ("TBD (simplified mode): buffer address 0x%08x, size 0x%04x\n",
              tx_buffer_address, tx_buffer_size));
         tx_buffer_size = MIN(tx_buffer_size, sizeof(buf) - size);
-        cpu_physical_memory_read(tx_buffer_address, &buf[size],
-                                 tx_buffer_size);
+        pci_memory_read(&s->dev,
+                        tx_buffer_address, &buf[size], tx_buffer_size);
         size += tx_buffer_size;
     }
     if (tbd_array == 0xffffffff) {
@@ -823,16 +827,16 @@ static void tx_command(EEPRO100State *s)
     if (s->has_extended_tcb_support && !(s->configuration[6] & BIT(4))) {
         /* Extended Flexible TCB.
          */
         for (; tbd_count < 2; tbd_count++) {
-            uint32_t tx_buffer_address = e100_ldl_le_phys(tbd_address);
-            uint16_t tx_buffer_size = e100_ldw_le_phys(tbd_address + 4);
-            uint16_t tx_buffer_el = e100_ldw_le_phys(tbd_address + 6);
+            uint32_t tx_buffer_address = e100_ldl_le_phys(s, tbd_address);
+            uint16_t tx_buffer_size = e100_ldw_le_phys(s, tbd_address + 4);
+            uint16_t tx_buffer_el = e100_ldw_le_phys(s, tbd_address + 6);
             tbd_address += 8;
             TRACE(RXTX, logout
                 ("TBD (extended flexible mode): buffer address 0x%08x, size 0x%04x\n",
                  tx_buffer_address, tx_buffer_size));
             tx_buffer_size = MIN(tx_buffer_size, sizeof(buf) - size);
-            cpu_physical_memory_read(tx_buffer_address, &buf[size],
-                                     tx_buffer_size);
+            pci_memory_read(&s->dev,
+                            tx_buffer_address, &buf[size], tx_buffer_size);
             size += tx_buffer_size;
             if (tx_buffer_el & 1) {
                 break;
@@ -841,16 +845,16 @@ static void tx_command(EEPRO100State *s)
     }
     tbd_address = tbd_array;
     for (; tbd_count < s->tx.tbd_count; tbd_count++) {
-        uint32_t tx_buffer_address = e100_ldl_le_phys(tbd_address);
-        uint16_t tx_buffer_size = e100_ldw_le_phys(tbd_address + 4);
-        uint16_t tx_buffer_el = e100_ldw_le_phys(tbd_address + 6);
+        uint32_t tx_buffer_address = e100_ldl_le_phys(s, tbd_address);
+        uint16_t tx_buffer_size = e100_ldw_le_phys(s, tbd_address + 4);
+        uint16_t tx_buffer_el = e100_ldw_le_phys(s, tbd_address + 6);
         tbd_address += 8;
         TRACE(RXTX, logout
             ("TBD (flexible mode): buffer address 0x%08x, size 0x%04x\n",
              tx_buffer_address, tx_buffer_size));
         tx_buffer_size = MIN(tx_buffer_size, sizeof(buf) - size);
-        cpu_physical_memory_read(tx_buffer_address, &buf[size],
-                                 tx_buffer_size);
+        pci_memory_read(&s->dev,
+                        tx_buffer_address, &buf[size], tx_buffer_size);
         size += tx_buffer_size;
         if (tx_buffer_el & 1) {
             break;
@@ -875,7 +879,7 @@ static void set_multicast_list(EEPRO100State *s)
     TRACE(OTHER, logout("multicast list, multicast count = %u\n", multicast_count));
     for (i = 0; i < multicast_count; i += 6) {
         uint8_t multicast_addr[6];
-        cpu_physical_memory_read(s->cb_address + 10 + i, multicast_addr, 6);
+        pci_memory_read(&s->dev, s->cb_address + 10 + i, multicast_addr, 6);
         TRACE(OTHER, logout("multicast entry %s\n", nic_dump(multicast_addr, 6)));
         unsigned mcast_idx = compute_mcast_idx(multicast_addr);
         assert(mcast_idx < 64);
@@ -909,12 +913,14 @@ static void action_command(EEPRO100State *s)
         /* Do nothing. */
         break;
     case CmdIASetup:
-        cpu_physical_memory_read(s->cb_address + 8, &s->conf.macaddr.a[0], 6);
+        pci_memory_read(&s->dev,
+                        s->cb_address + 8, &s->conf.macaddr.a[0], 6);
         TRACE(OTHER, logout("macaddr: %s\n", nic_dump(&s->conf.macaddr.a[0], 6)));
         break;
     case CmdConfigure:
-        cpu_physical_memory_read(s->cb_address + 8, &s->configuration[0],
-                                 sizeof(s->configuration));
+        pci_memory_read(&s->dev,
+                        s->cb_address + 8,
+                        &s->configuration[0], sizeof(s->configuration));
         TRACE(OTHER, logout("configuration: %s\n", nic_dump(&s->configuration[0], 16)));
         TRACE(OTHER, logout("configuration: %s\n",
@@ -951,7 +957,8 @@ static void action_command(EEPRO100State *s)
         break;
     }
     /* Write new status. */
-    e100_stw_le_phys(s->cb_address, s->tx.status | ok_status | STATUS_C);
+    e100_stw_le_phys(s, s->cb_address,
+                     s->tx.status | ok_status | STATUS_C);
     if (bit_i) {
         /* CU completed action. */
         eepro100_cx_interrupt(s);
@@ -1018,7 +1025,7 @@ static void eepro100_cu_command(EEPRO100State * s, uint8_t val)
         /* Dump statistical counters. */
         TRACE(OTHER, logout("val=0x%02x (dump stats)\n", val));
         dump_statistics(s);
-        e100_stl_le_phys(s->statsaddr + s->stats_size, 0xa005);
+        e100_stl_le_phys(s, s->statsaddr + s->stats_size, 0xa005);
         break;
     case CU_CMD_BASE:
         /* Load CU base. */
@@ -1029,7 +1036,7 @@ static void eepro100_cu_command(EEPRO100State * s, uint8_t val)
         /* Dump and reset statistical counters.
          */
         TRACE(OTHER, logout("val=0x%02x (dump stats and reset)\n", val));
         dump_statistics(s);
-        e100_stl_le_phys(s->statsaddr + s->stats_size, 0xa007);
+        e100_stl_le_phys(s, s->statsaddr + s->stats_size, 0xa007);
         memset(&s->statistics, 0, sizeof(s->statistics));
         break;
     case CU_SRESUME:
@@ -1323,10 +1330,10 @@ static void eepro100_write_port(EEPRO100State *s)
     case PORT_SELFTEST:
         TRACE(OTHER, logout("selftest address=0x%08x\n", address));
         eepro100_selftest_t data;
-        cpu_physical_memory_read(address, &data, sizeof(data));
+        pci_memory_read(&s->dev, address, &data, sizeof(data));
         data.st_sign = 0xffffffff;
         data.st_result = 0;
-        cpu_physical_memory_write(address, &data, sizeof(data));
+        pci_memory_write(&s->dev, address, &data, sizeof(data));
         break;
     case PORT_SELECTIVE_RESET:
         TRACE(OTHER, logout("selective reset, selftest address=0x%08x\n", address));
@@ -1853,8 +1860,8 @@ static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, size_t size
     }
     /* !!! */
     eepro100_rx_t rx;
-    cpu_physical_memory_read(s->ru_base + s->ru_offset, &rx,
-                             sizeof(eepro100_rx_t));
+    pci_memory_read(&s->dev,
+                    s->ru_base + s->ru_offset, &rx, sizeof(eepro100_rx_t));
     uint16_t rfd_command = le16_to_cpu(rx.command);
     uint16_t rfd_size = le16_to_cpu(rx.size);

@@ -1870,9 +1877,9 @@ static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, size_t size
 #endif
     TRACE(OTHER, logout("command 0x%04x, link 0x%08x, addr 0x%08x, size %u\n",
           rfd_command, rx.link, rx.rx_buf_addr, rfd_size));
-    e100_stw_le_phys(s->ru_base + s->ru_offset +
+    e100_stw_le_phys(s, s->ru_base + s->ru_offset +
                      offsetof(eepro100_rx_t, status), rfd_status);
-    e100_stw_le_phys(s->ru_base + s->ru_offset +
+    e100_stw_le_phys(s, s->ru_base + s->ru_offset +
                      offsetof(eepro100_rx_t, count), size);
     /* Early receive interrupt not supported. */
 #if 0
@@ -1887,8 +1894,8 @@ static ssize_t nic_receive(VLANClientState *nc, const uint8_t * buf, size_t size
 #if 0
     assert(!(s->configuration[17] & BIT(0)));
 #endif
-    cpu_physical_memory_write(s->ru_base + s->ru_offset +
-                              sizeof(eepro100_rx_t), buf, size);
+    pci_memory_write(&s->dev, s->ru_base + s->ru_offset +
+                     sizeof(eepro100_rx_t), buf, size);
     s->statistics.rx_good_frames++;
     eepro100_fr_interrupt(s);
     s->ru_offset = le32_to_cpu(rx.link);
This allows the device to work properly with an emulated IOMMU.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 hw/ac97.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/hw/ac97.c b/hw/ac97.c
index d71072d..bad38fb 100644
--- a/hw/ac97.c
+++ b/hw/ac97.c
@@ -223,7 +223,7 @@ static void fetch_bd (AC97LinkState *s, AC97BusMasterRegs *r)
 {
     uint8_t b[8];

-    cpu_physical_memory_read (r->bdbar + r->civ * 8, b, 8);
+    pci_memory_read (&s->dev, r->bdbar + r->civ * 8, b, 8);
     r->bd_valid = 1;
     r->bd.addr = le32_to_cpu (*(uint32_t *) &b[0]) & ~3;
     r->bd.ctl_len = le32_to_cpu (*(uint32_t *) &b[4]);
@@ -972,7 +972,7 @@ static int write_audio (AC97LinkState *s, AC97BusMasterRegs *r,
     while (temp) {
         int copied;
         to_copy = audio_MIN (temp, sizeof (tmpbuf));
-        cpu_physical_memory_read (addr, tmpbuf, to_copy);
+        pci_memory_read (&s->dev, addr, tmpbuf, to_copy);
         copied = AUD_write (s->voice_po, tmpbuf, to_copy);
         dolog ("write_audio max=%x to_copy=%x copied=%x\n",
                max, to_copy, copied);
@@ -1056,7 +1056,7 @@ static int read_audio (AC97LinkState *s, AC97BusMasterRegs *r,
             *stop = 1;
             break;
         }
-        cpu_physical_memory_write (addr, tmpbuf, acquired);
+        pci_memory_write (&s->dev, addr, tmpbuf, acquired);
         temp -= acquired;
         addr += acquired;
         nread += acquired;
This allows the device to work properly with an emulated IOMMU.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 hw/es1370.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/hw/es1370.c b/hw/es1370.c
index 40cb48c..1645dbd 100644
--- a/hw/es1370.c
+++ b/hw/es1370.c
@@ -802,7 +802,7 @@ static void es1370_transfer_audio (ES1370State *s, struct chan *d, int loop_sel,
             if (!acquired)
                 break;

-            cpu_physical_memory_write (addr, tmpbuf, acquired);
+            pci_memory_write (&s->dev, addr, tmpbuf, acquired);

             temp -= acquired;
             addr += acquired;
@@ -816,7 +816,7 @@ static void es1370_transfer_audio (ES1370State *s, struct chan *d, int loop_sel,
             int copied, to_copy;

             to_copy = audio_MIN ((size_t) temp, sizeof (tmpbuf));
-            cpu_physical_memory_read (addr, tmpbuf, to_copy);
+            pci_memory_read (&s->dev, addr, tmpbuf, to_copy);
             copied = AUD_write (voice, tmpbuf, to_copy);
             if (!copied)
                 break;
This allows the device to work properly with an emulated IOMMU.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 hw/e1000.c |   27 +++++++++++++++------------
 1 files changed, 15 insertions(+), 12 deletions(-)
diff --git a/hw/e1000.c b/hw/e1000.c
index f160bfc..acfd329 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -472,7 +472,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
             bytes = split_size;
             if (tp->size + bytes > msh)
                 bytes = msh - tp->size;
-            cpu_physical_memory_read(addr, tp->data + tp->size, bytes);
+            pci_memory_read(&s->dev, addr, tp->data + tp->size, bytes);
             if ((sz = tp->size + bytes) >= hdr && tp->size < hdr)
                 memmove(tp->header, tp->data, hdr);
             tp->size = sz;
@@ -487,7 +487,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
         // context descriptor TSE is not set, while data descriptor TSE is set
         DBGOUT(TXERR, "TCP segmentaion Error\n");
     } else {
-        cpu_physical_memory_read(addr, tp->data + tp->size, split_size);
+        pci_memory_read(&s->dev, addr, tp->data + tp->size, split_size);
         tp->size += split_size;
     }

@@ -503,7 +503,9 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
 }
 static uint32_t
-txdesc_writeback(target_phys_addr_t base, struct e1000_tx_desc *dp)
+txdesc_writeback(E1000State *s,
+                 target_phys_addr_t base,
+                 struct e1000_tx_desc *dp)
 {
     uint32_t txd_upper, txd_lower = le32_to_cpu(dp->lower.data);

@@ -512,8 +514,9 @@ txdesc_writeback(target_phys_addr_t base, struct e1000_tx_desc *dp)
     txd_upper = (le32_to_cpu(dp->upper.data) | E1000_TXD_STAT_DD) &
                 ~(E1000_TXD_STAT_EC | E1000_TXD_STAT_LC | E1000_TXD_STAT_TU);
     dp->upper.data = cpu_to_le32(txd_upper);
-    cpu_physical_memory_write(base + ((char *)&dp->upper - (char *)dp),
-                              (void *)&dp->upper, sizeof(dp->upper));
+    pci_memory_write(&s->dev,
+                     base + ((char *)&dp->upper - (char *)dp),
+                     (void *)&dp->upper, sizeof(dp->upper));
     return E1000_ICR_TXDW;
 }
@@ -540,14 +543,14 @@ start_xmit(E1000State *s)
     while (s->mac_reg[TDH] != s->mac_reg[TDT]) {
         base = tx_desc_base(s) +
                sizeof(struct e1000_tx_desc) * s->mac_reg[TDH];
-        cpu_physical_memory_read(base, (void *)&desc, sizeof(desc));
+        pci_memory_read(&s->dev, base, (void *)&desc, sizeof(desc));

         DBGOUT(TX, "index %d: %p : %x %x\n", s->mac_reg[TDH],
                (void *)(intptr_t)desc.buffer_addr,
                desc.lower.data, desc.upper.data);

         process_tx_desc(s, &desc);
-        cause |= txdesc_writeback(base, &desc);
+        cause |= txdesc_writeback(s, base, &desc);
         if (++s->mac_reg[TDH] * sizeof(desc) >= s->mac_reg[TDLEN])
             s->mac_reg[TDH] = 0;
@@ -717,7 +720,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size)
             desc_size = s->rxbuf_size;
         }
         base = rx_desc_base(s) + sizeof(desc) * s->mac_reg[RDH];
-        cpu_physical_memory_read(base, (void *)&desc, sizeof(desc));
+        pci_memory_read(&s->dev, base, (void *)&desc, sizeof(desc));
         desc.special = vlan_special;
         desc.status |= (vlan_status | E1000_RXD_STAT_DD);
         if (desc.buffer_addr) {
@@ -726,9 +729,9 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size)
             if (copy_size > s->rxbuf_size) {
                 copy_size = s->rxbuf_size;
             }
-            cpu_physical_memory_write(le64_to_cpu(desc.buffer_addr),
-                                      (void *)(buf + desc_offset + vlan_offset),
-                                      copy_size);
+            pci_memory_write(&s->dev, le64_to_cpu(desc.buffer_addr),
+                             (void *)(buf + desc_offset + vlan_offset),
+                             copy_size);
         }
         desc_offset += desc_size;
         desc.length = cpu_to_le16(desc_size);
@@ -742,7 +745,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size)
         } else { // as per intel docs; skip descriptors with null buf addr
             DBGOUT(RX, "Null RX descriptor!!\n");
         }
-        cpu_physical_memory_write(base, (void *)&desc, sizeof(desc));
+        pci_memory_write(&s->dev, base, (void *)&desc, sizeof(desc));

         if (++s->mac_reg[RDH] * sizeof(desc) >= s->mac_reg[RDLEN])
             s->mac_reg[RDH] = 0;
This allows the device to work properly with an emulated IOMMU.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 hw/lsi53c895a.c |   24 ++++++++++++------------
 1 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index be4df58..7bfc604 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -394,7 +394,7 @@ static inline uint32_t read_dword(LSIState *s, uint32_t addr)
     if ((addr & 0xffffe000) == s->script_ram_base) {
         return s->script_ram[(addr & 0x1fff) >> 2];
     }
-    cpu_physical_memory_read(addr, (uint8_t *)&buf, 4);
+    pci_memory_read(&s->dev, addr, (uint8_t *)&buf, 4);
     return cpu_to_le32(buf);
 }
@@ -574,9 +574,9 @@ static void lsi_do_dma(LSIState *s, int out)

     /* ??? Set SFBR to first data byte.  */
     if (out) {
-        cpu_physical_memory_read(addr, s->current->dma_buf, count);
+        pci_memory_read(&s->dev, addr, s->current->dma_buf, count);
     } else {
-        cpu_physical_memory_write(addr, s->current->dma_buf, count);
+        pci_memory_write(&s->dev, addr, s->current->dma_buf, count);
     }
     s->current->dma_len -= count;
     if (s->current->dma_len == 0) {
@@ -741,7 +741,7 @@ static void lsi_do_command(LSIState *s)
     DPRINTF("Send command len=%d\n", s->dbc);
     if (s->dbc > 16)
         s->dbc = 16;
-    cpu_physical_memory_read(s->dnad, buf, s->dbc);
+    pci_memory_read(&s->dev, s->dnad, buf, s->dbc);
     s->sfbr = buf[0];
     s->command_complete = 0;
@@ -790,7 +790,7 @@ static void lsi_do_status(LSIState *s)
     s->dbc = 1;
     sense = s->sense;
     s->sfbr = sense;
-    cpu_physical_memory_write(s->dnad, &sense, 1);
+    pci_memory_write(&s->dev, s->dnad, &sense, 1);
     lsi_set_phase(s, PHASE_MI);
     s->msg_action = 1;
     lsi_add_msg_byte(s, 0); /* COMMAND COMPLETE */
@@ -804,7 +804,7 @@ static void lsi_do_msgin(LSIState *s)
     len = s->msg_len;
     if (len > s->dbc)
         len = s->dbc;
-    cpu_physical_memory_write(s->dnad, s->msg, len);
+    pci_memory_write(&s->dev, s->dnad, s->msg, len);
     /* Linux drivers rely on the last byte being in the SIDL.  */
     s->sidl = s->msg[len - 1];
     s->msg_len -= len;
@@ -836,7 +836,7 @@ static void lsi_do_msgin(LSIState *s)
 static uint8_t lsi_get_msgbyte(LSIState *s)
 {
     uint8_t data;
-    cpu_physical_memory_read(s->dnad, &data, 1);
+    pci_memory_read(&s->dev, s->dnad, &data, 1);
     s->dnad++;
     s->dbc--;
     return data;
@@ -988,8 +988,8 @@ static void lsi_memcpy(LSIState *s, uint32_t dest, uint32_t src, int count)
     DPRINTF("memcpy dest 0x%08x src 0x%08x count %d\n", dest, src, count);
     while (count) {
         n = (count > LSI_BUF_SIZE) ? LSI_BUF_SIZE : count;
-        cpu_physical_memory_read(src, buf, n);
-        cpu_physical_memory_write(dest, buf, n);
+        pci_memory_read(&s->dev, src, buf, n);
+        pci_memory_write(&s->dev, dest, buf, n);
         src += n;
         dest += n;
         count -= n;
@@ -1057,7 +1057,7 @@ again:

     /* 32-bit Table indirect */
     offset = sxt24(addr);
-    cpu_physical_memory_read(s->dsa + offset, (uint8_t *)buf, 8);
+    pci_memory_read(&s->dev, s->dsa + offset, (uint8_t *)buf, 8);
     /* byte count is stored in bits 0:23 only */
     s->dbc = cpu_to_le32(buf[0]) & 0xffffff;
     s->rbc = s->dbc;
@@ -1416,7 +1416,7 @@ again:
     n = (insn & 7);
     reg = (insn >> 16) & 0xff;
     if (insn & (1 << 24)) {
-        cpu_physical_memory_read(addr, data, n);
+        pci_memory_read(&s->dev, addr, data, n);
         DPRINTF("Load reg 0x%x size %d addr 0x%08x = %08x\n", reg, n, addr,
                 *(int *)data);
         for (i = 0; i < n; i++) {
@@ -1427,7 +1427,7 @@ again:
         for (i = 0; i < n; i++) {
             data[i] = lsi_reg_readb(s, reg + i);
         }
-        cpu_physical_memory_write(addr, data, n);
+        pci_memory_write(&s->dev, addr, data, n);
         }
     }
 }
This allows the device to work properly with an emulated IOMMU.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 hw/pcnet-pci.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/hw/pcnet-pci.c b/hw/pcnet-pci.c
index 9415a1e..6c2186e 100644
--- a/hw/pcnet-pci.c
+++ b/hw/pcnet-pci.c
@@ -217,13 +217,13 @@ static CPUReadMemoryFunc * const pcnet_mmio_read[] = {
 static void pci_physical_memory_write(void *dma_opaque, target_phys_addr_t addr,
                                       uint8_t *buf, int len, int do_bswap)
 {
-    cpu_physical_memory_write(addr, buf, len);
+    pci_memory_write(dma_opaque, addr, buf, len);
 }

 static void pci_physical_memory_read(void *dma_opaque, target_phys_addr_t addr,
                                      uint8_t *buf, int len, int do_bswap)
 {
-    cpu_physical_memory_read(addr, buf, len);
+    pci_memory_read(dma_opaque, addr, buf, len);
 }

 static void pci_pcnet_cleanup(VLANClientState *nc)
@@ -292,6 +292,7 @@ static int pci_pcnet_init(PCIDevice *pci_dev)
     s->irq = pci_dev->irq[0];
     s->phys_mem_read = pci_physical_memory_read;
     s->phys_mem_write = pci_physical_memory_write;
+    s->dma_opaque = pci_dev;

     if (!pci_dev->qdev.hotplugged) {
         static int loaded = 0;
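pcnet is shared between PCI and non-PCI boards, so its core reaches memory only through the phys_mem_read/write callbacks; the new `s->dma_opaque = pci_dev` line is what lets the PCI callback recover the PCIDevice. A minimal model of that indirection follows; the struct and function names are simplified stand-ins, not the actual pcnet code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bus-agnostic core state: a callback plus an opaque cookie that the
 * front end fills in at init time. */
typedef struct PCNetState {
    void (*phys_mem_read)(void *dma_opaque, uint64_t addr,
                          uint8_t *buf, int len);
    void *dma_opaque;
} PCNetState;

static uint8_t fake_ram[32];       /* stand-in for guest memory */

/* PCI front end's callback; the real one forwards dma_opaque (the
 * PCIDevice) into pci_memory_read(). */
static void pci_phys_mem_read(void *dma_opaque, uint64_t addr,
                              uint8_t *buf, int len)
{
    (void)dma_opaque;              /* would be the PCIDevice here */
    memcpy(buf, &fake_ram[addr], (size_t)len);
}

/* Core code never names a bus type: callback + opaque only. */
static void pcnet_fetch(PCNetState *s, uint64_t addr, uint8_t *buf, int len)
{
    s->phys_mem_read(s->dma_opaque, addr, buf, len);
}
```

The lance (non-PCI) front end can keep registering a plain cpu_physical_memory_read-style callback with a different opaque, which is why only the PCI init function needed a change.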
This allows the device to work properly with an emulated IOMMU.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 hw/usb-uhci.c |   26 ++++++++++++++------------
 1 files changed, 14 insertions(+), 12 deletions(-)
diff --git a/hw/usb-uhci.c b/hw/usb-uhci.c
index 346db3e..1661e54 100644
--- a/hw/usb-uhci.c
+++ b/hw/usb-uhci.c
@@ -713,7 +713,7 @@ static int uhci_complete_td(UHCIState *s, UHCI_TD *td, UHCIAsync *async, uint32_

     if (len > 0) {
         /* write the data back */
-        cpu_physical_memory_write(td->buffer, async->buffer, len);
+        dma_memory_write(&s->dev.dma, td->buffer, async->buffer, len);
     }

     if ((td->ctrl & TD_CTRL_SPD) && len < max_len) {
@@ -831,7 +831,7 @@ static int uhci_handle_td(UHCIState *s, uint32_t addr, UHCI_TD *td, uint32_t *in
     switch(pid) {
     case USB_TOKEN_OUT:
     case USB_TOKEN_SETUP:
-        cpu_physical_memory_read(td->buffer, async->buffer, max_len);
+        dma_memory_read(&s->dev.dma, td->buffer, async->buffer, max_len);
         len = uhci_broadcast_packet(s, &async->packet);
         if (len >= 0)
             len = max_len;
@@ -874,7 +874,7 @@ static void uhci_async_complete(USBPacket *packet, void *opaque)
     uint32_t link = async->td;
     uint32_t int_mask = 0, val;
-    cpu_physical_memory_read(link & ~0xf, (uint8_t *) &td, sizeof(td));
+    dma_memory_read(&s->dev.dma, link & ~0xf, (uint8_t *) &td, sizeof(td));
     le32_to_cpus(&td.link);
     le32_to_cpus(&td.ctrl);
     le32_to_cpus(&td.token);
@@ -886,8 +886,8 @@ static void uhci_async_complete(USBPacket *packet, void *opaque)

         /* update the status bits of the TD */
         val = cpu_to_le32(td.ctrl);
-        cpu_physical_memory_write((link & ~0xf) + 4,
-                                  (const uint8_t *)&val, sizeof(val));
+        dma_memory_write(&s->dev.dma, (link & ~0xf) + 4,
+                         (const uint8_t *)&val, sizeof(val));
         uhci_async_free(s, async);
     } else {
         async->done = 1;
@@ -950,7 +950,7 @@ static void uhci_process_frame(UHCIState *s)
DPRINTF("uhci: processing frame %d addr 0x%x\n" , s->frnum, frame_addr);
-    cpu_physical_memory_read(frame_addr, (uint8_t *)&link, 4);
+    dma_memory_read(&s->dev.dma, frame_addr, (uint8_t *)&link, 4);
     le32_to_cpus(&link);
     int_mask = 0;
@@ -974,7 +974,8 @@ static void uhci_process_frame(UHCIState *s)
             break;
         }

-        cpu_physical_memory_read(link & ~0xf, (uint8_t *) &qh, sizeof(qh));
+        dma_memory_read(&s->dev.dma,
+                        link & ~0xf, (uint8_t *) &qh, sizeof(qh));
         le32_to_cpus(&qh.link);
         le32_to_cpus(&qh.el_link);
@@ -994,7 +995,8 @@ static void uhci_process_frame(UHCIState *s)
         }

         /* TD */
-        cpu_physical_memory_read(link & ~0xf, (uint8_t *) &td, sizeof(td));
+        dma_memory_read(&s->dev.dma,
+                        link & ~0xf, (uint8_t *) &td, sizeof(td));
         le32_to_cpus(&td.link);
         le32_to_cpus(&td.ctrl);
         le32_to_cpus(&td.token);
@@ -1008,8 +1010,8 @@ static void uhci_process_frame(UHCIState *s)
         if (old_td_ctrl != td.ctrl) {
             /* update the status bits of the TD */
             val = cpu_to_le32(td.ctrl);
-            cpu_physical_memory_write((link & ~0xf) + 4,
-                                      (const uint8_t *)&val, sizeof(val));
+            dma_memory_write(&s->dev.dma, (link & ~0xf) + 4,
+                             (const uint8_t *)&val, sizeof(val));
         }

         if (ret < 0) {
@@ -1037,8 +1039,8 @@ static void uhci_process_frame(UHCIState *s)
             /* update QH element link */
             qh.el_link = link;
             val = cpu_to_le32(qh.el_link);
-            cpu_physical_memory_write((curr_qh & ~0xf) + 4,
-                                      (const uint8_t *)&val, sizeof(val));
+            dma_memory_write(&s->dev.dma, (curr_qh & ~0xf) + 4,
+                             (const uint8_t *)&val, sizeof(val));

             if (!depth_first(link)) {
                 /* done with this QH */
This allows the device to work properly with an emulated IOMMU.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 hw/usb-ohci.c |   46 +++++++++++++++++++++++++++++++---------------
 1 files changed, 31 insertions(+), 15 deletions(-)
diff --git a/hw/usb-ohci.c b/hw/usb-ohci.c
index d21c820..9d1cf91 100644
--- a/hw/usb-ohci.c
+++ b/hw/usb-ohci.c
@@ -116,6 +116,11 @@ typedef struct {

 } OHCIState;

+typedef struct {
+    PCIDevice pci_dev;
+    OHCIState state;
+} OHCIPCIState;
+
 /* Host Controller Communications Area */
 struct ohci_hcca {
     uint32_t intr[32];
@@ -422,12 +427,13 @@ static void ohci_reset(void *opaque)
 static inline int get_dwords(OHCIState *ohci,
                              uint32_t addr, uint32_t *buf, int num)
 {
+    OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
     int i;

     addr += ohci->localmem_base;

     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
-        cpu_physical_memory_read(addr, buf, sizeof(*buf));
+        pci_memory_read(&s->pci_dev, addr, buf, sizeof(*buf));
         *buf = le32_to_cpu(*buf);
     }
@@ -438,13 +444,14 @@ static inline int get_dwords(OHCIState *ohci,
 static inline int put_dwords(OHCIState *ohci,
                              uint32_t addr, uint32_t *buf, int num)
 {
+    OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
     int i;

     addr += ohci->localmem_base;

     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
         uint32_t tmp = cpu_to_le32(*buf);
-        cpu_physical_memory_write(addr, &tmp, sizeof(tmp));
+        pci_memory_write(&s->pci_dev, addr, &tmp, sizeof(tmp));
     }
     return 1;
@@ -454,12 +461,13 @@ static inline int put_dwords(OHCIState *ohci,
 static inline int get_words(OHCIState *ohci,
                             uint32_t addr, uint16_t *buf, int num)
 {
+    OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
     int i;

     addr += ohci->localmem_base;

     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
-        cpu_physical_memory_read(addr, buf, sizeof(*buf));
+        pci_memory_read(&s->pci_dev, addr, buf, sizeof(*buf));
         *buf = le16_to_cpu(*buf);
     }
@@ -470,13 +478,14 @@ static inline int get_words(OHCIState *ohci,
 static inline int put_words(OHCIState *ohci,
                             uint32_t addr, uint16_t *buf, int num)
 {
+    OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
     int i;

     addr += ohci->localmem_base;

     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
         uint16_t tmp = cpu_to_le16(*buf);
-        cpu_physical_memory_write(addr, &tmp, sizeof(tmp));
+        pci_memory_write(&s->pci_dev, addr, &tmp, sizeof(tmp));
     }
     return 1;
@@ -504,7 +513,11 @@ static inline int ohci_read_iso_td(OHCIState *ohci,
 static inline int ohci_read_hcca(OHCIState *ohci,
                                  uint32_t addr, struct ohci_hcca *hcca)
 {
-    cpu_physical_memory_read(addr + ohci->localmem_base, hcca, sizeof(*hcca));
+    OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
+
+    pci_memory_read(&s->pci_dev,
+                    addr + ohci->localmem_base, hcca, sizeof(*hcca));
+
     return 1;
 }

@@ -530,7 +543,11 @@ static inline int ohci_put_iso_td(OHCIState *ohci,
 static inline int ohci_put_hcca(OHCIState *ohci,
                                 uint32_t addr, struct ohci_hcca *hcca)
 {
-    cpu_physical_memory_write(addr + ohci->localmem_base, hcca, sizeof(*hcca));
+    OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
+
+    pci_memory_write(&s->pci_dev,
+                     addr + ohci->localmem_base, hcca, sizeof(*hcca));
+
     return 1;
 }
@@ -538,6 +555,8 @@ static inline int ohci_put_hcca(OHCIState *ohci,
 static void ohci_copy_td(OHCIState *ohci, struct ohci_td *td,
                          uint8_t *buf, int len, int write)
 {
+    OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
+    DMADevice *dma = &s->pci_dev.dma;
     uint32_t ptr;
     uint32_t n;

@@ -545,12 +564,12 @@ static void ohci_copy_td(OHCIState *ohci, struct ohci_td *td,
     n = 0x1000 - (ptr & 0xfff);
     if (n > len)
         n = len;
-    cpu_physical_memory_rw(ptr + ohci->localmem_base, buf, n, write);
+    dma_memory_rw(dma, ptr + ohci->localmem_base, buf, n, write);
     if (n == len)
         return;
     ptr = td->be & ~0xfffu;
     buf += n;
-    cpu_physical_memory_rw(ptr + ohci->localmem_base, buf, len - n, write);
+    dma_memory_rw(dma, ptr + ohci->localmem_base, buf, len - n, write);
 }

 /* Read/Write the contents of an ISO TD from/to main memory.  */
@@ -558,6 +577,8 @@ static void ohci_copy_iso_td(OHCIState *ohci,
                              uint32_t start_addr, uint32_t end_addr,
                              uint8_t *buf, int len, int write)
 {
+    OHCIPCIState *s = container_of(ohci, OHCIPCIState, state);
+    DMADevice *dma = &s->pci_dev.dma;
     uint32_t ptr;
     uint32_t n;

@@ -565,12 +586,12 @@ static void ohci_copy_iso_td(OHCIState *ohci,
     n = 0x1000 - (ptr & 0xfff);
     if (n > len)
         n = len;
-    cpu_physical_memory_rw(ptr + ohci->localmem_base, buf, n, write);
+    dma_memory_rw(dma, ptr + ohci->localmem_base, buf, n, write);
     if (n == len)
         return;
     ptr = end_addr & ~0xfffu;
     buf += n;
-    cpu_physical_memory_rw(ptr + ohci->localmem_base, buf, len - n, write);
+    dma_memory_rw(dma, ptr + ohci->localmem_base, buf, len - n, write);
 }
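Both copy routines above split each transfer at the 4 KiB page boundary before going through the DMA layer, so a (possibly IOMMU-translated) access never straddles two pages. The split arithmetic in isolation, as a sketch rather than the driver code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes that can be copied starting at ptr before crossing
 * the 4 KiB page it points into; models n = 0x1000 - (ptr & 0xfff)
 * clamped to the requested length. */
static size_t first_chunk(uint32_t ptr, size_t len)
{
    size_t n = 0x1000 - (ptr & 0xfff);   /* bytes left in this page */
    return n > len ? len : n;
}
```

If the first chunk does not cover the whole buffer, the remainder continues at the page named by the TD's end pointer (td->be or end_addr), which is why the caller recomputes ptr with `& ~0xfffu` before the second copy.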
 static void ohci_process_lists(OHCIState *ohci, int completion);
@@ -1706,11 +1727,6 @@ static void usb_ohci_init(OHCIState *ohci, DeviceState *dev,
     qemu_register_reset(ohci_reset, ohci);
 }

-typedef struct {
-    PCIDevice pci_dev;
-    OHCIState state;
-} OHCIPCIState;
-
 static int usb_ohci_initfn_pci(struct PCIDevice *dev)
 {
     OHCIPCIState *ohci = DO_UPCAST(OHCIPCIState, pci_dev, dev);
On 05/31/2011 06:38 PM, Eduard - Gabriel Munteanu wrote:
In order to move the discussion along productively, please have a look at
git://repo.or.cz/qemu/rth.git axp-iommu-1
which is based on your previous patch set.
There's stuff in there that's not 100% relevant to the discussion, but these two commits:
0652a74 target-alpha: Implement iommu translation for Typhoon.
db50b11 DMA: Use an void* opaque value, rather than upcasting from qdev.
are exactly what I'm interested in discussing.
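For reference, the trade-off the db50b11 commit title alludes to can be shown in miniature: recovering the owning device with container_of() (as the usb-ohci patch above does) versus carrying an explicit opaque pointer chosen at registration time. All types below are illustrative stand-ins, not QEMU's.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DMADevice {
    void *opaque;                  /* owner, set when the device registers */
} DMADevice;

typedef struct MyDevice {
    int id;
    DMADevice dma;
} MyDevice;

/* Upcast style: valid only if the DMADevice is known to be embedded in
 * a MyDevice; the layout assumption is baked into the call site. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static int owner_id_upcast(DMADevice *d)
{
    return container_of(d, MyDevice, dma)->id;
}

/* Opaque style: the owner registered itself, so the DMA layer makes no
 * assumption about what structure surrounds it. */
static int owner_id_opaque(DMADevice *d)
{
    return ((MyDevice *)d->opaque)->id;
}
```

The opaque form costs one pointer per device but lets non-qdev callers (such as a chipset model) use the same DMA interface without embedding the struct at a fixed offset.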
r~