[SeaBIOS] [Qemu-devel] [RFC PATCH v3 05/19] Implement dimm device abstraction

Vasilis Liaskovitis vasilis.liaskovitis at profitbricks.com
Wed Oct 24 19:16:36 CEST 2012


Hi,

On Wed, Oct 24, 2012 at 12:15:17PM +0200, Stefan Hajnoczi wrote:
> On Wed, Oct 24, 2012 at 10:06 AM, liu ping fan <qemulist at gmail.com> wrote:
> > On Tue, Oct 23, 2012 at 8:25 PM, Stefan Hajnoczi <stefanha at gmail.com> wrote:
> >> On Fri, Sep 21, 2012 at 01:17:21PM +0200, Vasilis Liaskovitis wrote:
> >>> +static void dimm_populate(DimmDevice *s)
> >>> +{
> >>> +    DeviceState *dev = (DeviceState *)s;
> >>> +    MemoryRegion *new = NULL;
> >>> +
> >>> +    new = g_malloc(sizeof(MemoryRegion));
> >>> +    memory_region_init_ram(new, dev->id, s->size);
> >>> +    vmstate_register_ram_global(new);
> >>> +    memory_region_add_subregion(get_system_memory(), s->start, new);
> >>> +    s->mr = new;
> >>> +}
> >>> +
> >>> +static void dimm_depopulate(DimmDevice *s)
> >>> +{
> >>> +    assert(s);
> >>> +    vmstate_unregister_ram(s->mr, NULL);
> >>> +    memory_region_del_subregion(get_system_memory(), s->mr);
> >>> +    memory_region_destroy(s->mr);
> >>> +    s->mr = NULL;
> >>> +}
> >>
> >> How is dimm hot unplug protected against callers who currently have RAM
> >> mapped (from cpu_physical_memory_map())?
> >>
> >> Emulated devices call cpu_physical_memory_map() directly or indirectly
> >> through DMA emulation code.  The RAM pointer may be held for arbitrary
> >> lengths of time, across main loop iterations, etc.
> >>
> >> It's not clear to me that it is safe to unplug a DIMM that has network
> >> or disk I/O buffers, for example.  We also need to be robust against
> >> malicious guests who abuse the hotplug lifecycle.  QEMU should never be
> >> left with dangling pointers.
> >>
> > Not sure about the block layer. But I think those threads are already
> > out of the big lock, so there should be a MemoryListener to catch the
> > RAM-unplug event and, if needed, bdrv_flush.

Do we want bdrv_flush, or some kind of cancel request, e.g. bdrv_aio_cancel?

> 
> Here is the detailed scenario:
> 
> 1. Emulated device does cpu_physical_memory_map() and gets a pointer
> to guest RAM.
> 2. Return to vcpu or iothread, continue processing...
> 3. Hot unplug of RAM causes the guest RAM to disappear.
> 4. Pending I/O completes and overwrites memory from dangling guest RAM pointer.
> 
> Any I/O device that does zero-copy I/O in QEMU faces this problem:
>  * The block layer is affected.
>  * The net layer is unaffected because it doesn't do zero-copy tx/rx
> across returns to the main loop (#2 above).
>  * Not sure about other devices classes (e.g. USB).
> 
> How should the MemoryListener callback work?  For block I/O it may not
> be possible to cancel pending I/O asynchronously - if you try to
> cancel then your thread may block until the I/O completes.

E.g. paio_cancel does this?
Is there already an API to asynchronously cancel all in-flight operations on a
BlockDriverState? As far as I can tell, block_job_cancel refers to streaming
jobs only and doesn't help here.
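
To make this concrete, here is a rough, untested sketch of what per-request
cancellation would mean for a device model that tracks its in-flight requests.
MyDevState, MyReq and my_dev_stop_io are invented names and the include paths
may not match the current tree, so please read it as pseudo-code rather than a
patch:

/*
 * Sketch only: "MyDevState"/"MyReq"/"my_dev_stop_io" are invented names,
 * and the header paths may differ from the current tree.
 */
#include "block.h"
#include "qemu-queue.h"

typedef struct MyReq {
    BlockDriverAIOCB *acb;          /* handle returned by bdrv_aio_readv() */
    QLIST_ENTRY(MyReq) next;
} MyReq;

typedef struct MyDevState {
    BlockDriverState *bs;
    QLIST_HEAD(, MyReq) reqs;       /* requests still in flight */
} MyDevState;

static void my_dev_stop_io(MyDevState *s)
{
    MyReq *req, *tmp;

    /* _SAFE because cancellation may run the completion callback, which
     * normally removes the request from the list. */
    QLIST_FOREACH_SAFE(req, &s->reqs, next, tmp) {
        /*
         * Cancels a single request.  Today this may block until the request
         * actually completes (e.g. paio_cancel), which is exactly the
         * synchronous-cancel behaviour Stefan objects to below.
         */
        bdrv_aio_cancel(req->acb);
    }

    /*
     * The blunt alternative: don't cancel at all, just wait for everything
     * in flight to complete before letting the DIMM memory go away.
     */
    bdrv_drain_all();
}

If bdrv_aio_cancel can block, neither option gives us asynchronous behaviour,
which is why I'm asking about an async cancel API.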

Can we make the RAM unplug initiate async I/O cancellations, prevent further I/Os,
and only free the memory in a callback, after all DMA I/O to the associated memory
region has been cancelled or completed?
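
A completely hypothetical sketch of that flow, reusing the teardown from
dimm_depopulate above; unplug_pending and pending_dma are invented DimmDevice
fields and dimm_begin_depopulate/dimm_unplug_done invented functions, just to
show the shape I have in mind:

/* Runs only after the last user of the region has finished or cancelled its DMA. */
static void dimm_unplug_done(DimmDevice *s)
{
    /* Same teardown as dimm_depopulate() in this patch, just deferred. */
    vmstate_unregister_ram(s->mr, NULL);
    memory_region_del_subregion(get_system_memory(), s->mr);
    memory_region_destroy(s->mr);
    g_free(s->mr);                  /* also releases the g_malloc() from dimm_populate() */
    s->mr = NULL;
}

static void dimm_begin_depopulate(DimmDevice *s)
{
    s->unplug_pending = true;       /* invented flag: refuse new mappings */

    /*
     * Notify users of the region (block DMA, virtio rings, ...) so they can
     * cancel or complete their outstanding requests; how to do that is the
     * open question, a MemoryListener region_del hook being one option.
     */
    if (s->pending_dma == 0) {      /* invented count of live mappings */
        dimm_unplug_done(s);
    }
    /* otherwise dimm_unplug_done() runs when pending_dma drops to zero */
}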

Also, if I understand correctly, the MemoryListener should be registered by the
users of cpu_physical_memory_map, e.g. hw/virtio.c.
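
Something along these lines, purely as illustration; the region_del prototype
is what I see in memory.h, but the MemoryListener fields and the
memory_listener_register() signature have been changing, so this may well not
compile as-is:

static void virtio_mem_region_del(MemoryListener *listener,
                                  MemoryRegionSection *section)
{
    /*
     * Called when a region (e.g. the DIMM RAM added by dimm_populate()) is
     * removed from the address space.  A device still holding a pointer from
     * cpu_physical_memory_map() into section->mr would have to cancel or
     * complete its outstanding requests and unmap here.
     */
}

static MemoryListener virtio_mem_listener = {
    .region_del = virtio_mem_region_del,
};

/* Somewhere in the device init path (filter argument depends on the
 * memory API version): */
/* memory_listener_register(&virtio_mem_listener, ...); */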

By the way, dimm_depopulate only frees the QEMU memory on an ACPI _EJ request,
which means a well-behaved guest will already have offlined the memory and is
no longer using it. If the guest still uses the memory, e.g. as a DMA buffer,
the logical memory offlining will fail and the _EJ/QEMU memory freeing will
never happen.

But in theory a malicious guest ACPI driver could trigger _EJ requests to force
step 3 above.

Or perhaps the backing block driver can finish an I/O request for a zero-copy
block device whose memory the guest no longer cares about? I'll think about
this a bit more.

> Synchronous cancel behavior is not workable since it can lead to poor
> latency or hangs in the guest.

ok

thanks,

- Vasilis



