On Thu, Mar 28, 2013 at 12:35:42AM -0700, Nicholas A. Bellinger wrote:
On Wed, 2013-03-27 at 23:45 -0700, Nicholas A. Bellinger wrote:
On Wed, 2013-03-27 at 15:33 -0700, Nicholas A. Bellinger wrote:
On Wed, 2013-03-27 at 23:56 +0200, Michael S. Tsirkin wrote:
On Wed, Mar 27, 2013 at 02:31:27PM -0700, Nicholas A. Bellinger wrote:
<SNIP>
Adding some more detailed seabios PCI setup info, and make_bios_readonly_intel() debug output, for Kevin & Co to review.
I still do not understand how this happened. Somehow a memory region was deleted after vhost_dev_start but before vhost_virtqueue_start was called?
Not sure..
To clarify, this is only happening during seabios setup+scan of virtio-scsi, and not during normal virtio_scsi LLD operation.
Can you set a breakpoint there and see please?
A bit more context here..
After debugging seabios this evening, I've isolated the spot where things begin to go south for the vhost_verify_ring_mappings() check.
Below are logs from qemu and seabios serial output, mixed together to (attempt to) demonstrate what's going on..
root@tifa:/usr/src# qemu-system-x86_64 -enable-kvm -smp 4 -m 2048 -serial file:/tmp/vhost-serial.txt -hda /usr/src/qemu-paolo.git/debian_squeeze_amd64_standard.qcow2 -device vhost-scsi-pci,wwpn=naa.600140579ad21088
qemu-system-x86_64: pci_add_option_rom: failed to find romfile "efi-e1000.rom"
Calling ->region_add: section.size: 655360
vhost_set_memory: section: 0x7fff962c2580 section->size: 655360 add: 1
Calling ->region_add: section.size: 131072
Calling ->region_add: section.size: 131072
vhost_set_memory: section: 0x7fff962c2580 section->size: 131072 add: 1
Calling ->region_add: section.size: 131072
vhost_set_memory: section: 0x7fff962c2580 section->size: 131072 add: 1
Calling ->region_add: section.size: 2146435072
vhost_set_memory: section: 0x7fff962c2580 section->size: 2146435072 add: 1
Calling ->region_add: section.size: 4096
Calling ->region_add: section.size: 1024
Calling ->region_add: section.size: 1048576
Calling ->region_add: section.size: 262144
vhost_set_memory: section: 0x7fff962c2580 section->size: 262144 add: 1
vhost_scsi_init_pci
Before virtio_init_pci
virtio_init_pci: size: 60
virtio_init_pci: new size: 64
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 131072 add: 0
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 32768 add: 1
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 98304 add: 1
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 32768 add: 0
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 98304 add: 0
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 65536 add: 1
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 65536 add: 1
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 65536 add: 0
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 65536 add: 0
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 98304 add: 1
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 32768 add: 1
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 98304 add: 0
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 32768 add: 0
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 131072 add: 1
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 131072 add: 0
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 131072 add: 0
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 163840 add: 1
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 98304 add: 1
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 163840 add: 0
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 98304 add: 0
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 196608 add: 1
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 65536 add: 1
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 196608 add: 0
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 65536 add: 0
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 2146435072 add: 0
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 2146697216 add: 1
vhost_set_memory: section: 0x7fe2801f29f0 section->size: 65536 add: 1
vhost_set_memory: section: 0x7fe2801f2ab0 section->size: 65536 add: 0
vhost_set_memory: section: 0x7fe2801f2a70 section->size: 8388608 add: 1
Entering vhost_dev_start >>>>>>>>>>>>>>>>>>>>>.
Before vhost_virtqueue_start >>>>>>>>>>>>>>>>>>>>>>>>>>.
After vhost_virtqueue_start >>>>>>>>>>>>>>>>>>>>>>>>>>.
<Start seabios serial output from init virtio-scsi code>
=== PCI device probing ===
PCI probe
PCI device 00:00.0 (vd=8086:1237 c=0600)
PCI device 00:01.0 (vd=8086:7000 c=0601)
PCI device 00:01.1 (vd=8086:7010 c=0101)
PCI device 00:01.3 (vd=8086:7113 c=0680)
PCI device 00:02.0 (vd=1013:00b8 c=0300)
PCI device 00:03.0 (vd=8086:100e c=0200)
PCI device 00:04.0 (vd=1af4:1004 c=0100)
Found 7 PCI devices (max PCI bus is 00)
=== PCI new allocation pass #1 ===
PCI: check devices
=== PCI new allocation pass #2 ===
PCI: map device bdf=00:03.0 bar 1, addr 0000c000, size 00000040 [io]
PCI: map device bdf=00:04.0 bar 0, addr 0000c040, size 00000040 [io]
PCI: map device bdf=00:01.1 bar 4, addr 0000c080, size 00000010 [io]
PCI: map device bdf=00:03.0 bar 0, addr febc0000, size 00020000 [mem]
PCI: map device bdf=00:02.0 bar 6, addr febe0000, size 00010000 [mem]
PCI: map device bdf=00:02.0 bar 1, addr febf0000, size 00001000 [mem]
PCI: map device bdf=00:04.0 bar 1, addr febf1000, size 00001000 [mem]
PCI: map device bdf=00:02.0 bar 0, addr fc000000, size 02000000 [prefmem]
PCI: init bdf=00:00.0 id=8086:1237
PCI: init bdf=00:01.0 id=8086:7000
PIIX3/PIIX4 init: elcr=00 0c
PCI: init bdf=00:01.1 id=8086:7010
PCI: init bdf=00:01.3 id=8086:7113
Using pmtimer, ioport 0xb008, freq 3579 kHz
PCI: init bdf=00:02.0 id=1013:00b8
PCI: init bdf=00:03.0 id=8086:100e
PCI: init bdf=00:04.0 id=1af4:1004
init virtio-scsi
found virtio-scsi at 0:4
init_virtio_scsi: ioaddr: 0xc040
Searching bootorder for: /pci@i0cf8/*@4/*@0/*@0,0
Entering virtio_scsi_cmd: 0x12
|7ffdc000| ata0-0: QEMU HARDDISK ATA-7 Hard-Disk (16384 MiBytes)
|7ffdc000| Searching bootorder for: /pci@i0cf8/*@1,1/drive@0/disk@0
|7ffdc000| Registering bootable: ata0-0: QEMU HARDDISK ATA-7 Hard-Disk (16384 MiBytes) (type:2 prio:101 data:f4ab0)
\7ffdc000/ End thread
|7ffdb000| DVD/CD [ata1-0: QEMU DVD-ROM ATAPI-4 DVD/CD]
|7ffdb000| Searching bootorder for: /pci@i0cf8/*@1,1/drive@1/disk@0
|7ffdb000| Registering bootable: DVD/CD [ata1-0: QEMU DVD-ROM ATAPI-4 DVD/CD] (type:3 prio:103 data:f4a80)
\7ffdb000/ End thread
Searching bootorder for: /pci@i0cf8/*@4/*@0/*@1,0
Entering virtio_scsi_cmd: 0x12
virtio-scsi vendor='LIO-ORG' product='RAMDISK-MCP' rev='4.0' type=0 removable=0
Entering virtio_scsi_cmd: 0x00
Entering virtio_scsi_cmd: 0x25
virtio-scsi blksize=512 sectors=524288
Registering bootable: virtio-scsi Drive LIO-ORG RAMDISK-MCP 4.0 (type:2 prio:101 data:f4ae0)
Searching bootorder for: /pci@i0cf8/*@4/*@0/*@2,0
Entering virtio_scsi_cmd: 0x12
<SNIP> target IDs up to 256...
init lsi53c895a
init esp
init megasas
Scan for option roms
Attempting to init PCI bdf 00:00.0 (vd 8086:1237)
Attempting to init PCI bdf 00:01.0 (vd 8086:7000)
Attempting to init PCI bdf 00:01.3 (vd 8086:7113)
Attempting to init PCI bdf 00:03.0 (vd 8086:100e)
Attempting to init PCI bdf 00:04.0 (vd 1af4:1004)
Searching bootorder for: /rom@genroms/kvmvapic.bin
Registering bootable: Legacy option rom (type:129 prio:101 data:c9000003)
Before prepareboot >>>>>>>>>>>>>>>.
Searching bootorder for: HALT
Mapping hd drive 0x000f4ab0 to 0
drive 0x000f4ab0: PCHS=16383/16/63 translation=lba LCHS=1024/255/63 s=33554432
Mapping hd drive 0x000f4ae0 to 1
drive 0x000f4ae0: PCHS=0/0/0 translation=lba LCHS=520/16/63 s=524288
Running option rom at c900:0003
Mapping floppy drive 0x000f4b40
Mapping cd drive 0x000f4a80
finalize PMM
malloc finalize
Space available for UMB: cb800-ec000, f4690-f4a50
Returned 57344 bytes of ZoneHigh
e820 map has 7 items:
  0: 0000000000000000 - 000000000009fc00 = 1 RAM
  1: 000000000009fc00 - 00000000000a0000 = 2 RESERVED
  2: 00000000000f0000 - 0000000000100000 = 2 RESERVED
  3: 0000000000100000 - 000000007fffe000 = 1 RAM
  4: 000000007fffe000 - 0000000080000000 = 2 RESERVED
  5: 00000000feffc000 - 00000000ff000000 = 2 RESERVED
  6: 00000000fffc0000 - 0000000100000000 = 2 RESERVED
Before make_bios_readonly >>>>>>>
locking shadow ram
Calling pci_config_writeb(0x11): bdf: 0x0000 pam: 0x0000005a
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
locking shadow ram
romend: 0x000cb800 romtop: 0x000ec000
mem: 0x000c0000, pam: 0x0000005a
Calling pci_config_writeb(0x11): bdf: 0x0000 pam: 0x0000005a
<No QEMU output after pci_config_writeb(0x11) in make_bios_readonly..>
Calling pci_config_writeb(0x31): bdf: 0x0000 pam: 0x0000005b
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
mem: 0x000c8000, pam: 0x0000005b
romend: 0x000cb800 mem + 16*1024: 0x000cc000
romtop: 0x000ec000 mem + 32*1024: 0x000d0000
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
romend: 0x000cb800, mem: 0x000c8000, romtop: 0x000ec000, mem + 16*1024: 0x000cc000
Calling pci_config_writeb(0x31): bdf: 0x0000 pam: 0x0000005b
<QEMU output after pci_config_writeb(0x31) in make_bios_readonly..>
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 2146697216 add: 0
Before vhost_verify_ring_mappings: start_addr: c0000 size: 2146697216
Checking vq: 0 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
This is also a bug. -net always initializes VQs 0..N, so this is what vhost assumed. Please teach vhost to skip uninitialized VQs. There are more places to fix: basically, look for if (!virtio_queue_get_num(vdev, queue_no)); all of them need to be updated to skip uninitialized vqs. Probably switch to a new API that checks the PA too. See the patch below.
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124 >>>>>>>>>>>>>>>>>>.
Calling l: 5124 for start_addr: c0000 for vq 2
Unable to map ring buffer for ring 2 l: 4096 ring_size: 5124
Okay, so the ring address is within ROM; unsurprisingly, it fails. The BIOS should stop the device before write-protecting that memory.
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 32768 add: 1
Before vhost_verify_ring_mappings: start_addr: c0000 size: 32768
Checking vq: 0 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 2 ring_phys: ed000 ring_size: 5124
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 2146664448 add: 1
Before vhost_verify_ring_mappings: start_addr: c8000 size: 2146664448
Checking vq: 0 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124 >>>>>>>>>>>>>>>>>>.
Calling l: 5124 for start_addr: c8000 for vq 2
<Seabios output>
Calling pci_config_writeb(0x10): bdf: 0x0000 pam0: 0x00000059
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
<QEMU output for the pci_config_writeb(0x10) in make_bios_readonly..>
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 32768 add: 0
Before vhost_verify_ring_mappings: start_addr: c0000 size: 32768
Checking vq: 0 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 2 ring_phys: ed000 ring_size: 5124
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 2146664448 add: 0
Before vhost_verify_ring_mappings: start_addr: c8000 size: 2146664448
Checking vq: 0 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124 >>>>>>>>>>>>>>>>>>.
Calling l: 5124 for start_addr: c8000 for vq 2
Unable to map ring buffer for ring 2 l: 0 ring_size: 5124
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 36864 add: 1
Before vhost_verify_ring_mappings: start_addr: c0000 size: 36864
Checking vq: 0 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 2 ring_phys: ed000 ring_size: 5124
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 2146660352 add: 1
Before vhost_verify_ring_mappings: start_addr: c9000 size: 2146660352
Checking vq: 0 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124 >>>>>>>>>>>>>>>>>>.
Calling l: 5124 for start_addr: c9000 for vq 2
<Seabios calls into StartBoot, and the last bit of QEMU output>
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 2146660352 add: 0
Before vhost_verify_ring_mappings: start_addr: c9000 size: 2146660352
Checking vq: 0 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124 >>>>>>>>>>>>>>>>>>.
Calling l: 5124 for start_addr: c9000 for vq 2
Unable to map ring buffer for ring 2 l: 0 ring_size: 5124
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 159744 add: 1
Before vhost_verify_ring_mappings: start_addr: c9000 size: 159744
Checking vq: 0 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124 >>>>>>>>>>>>>>>>>>.
Calling l: 5124 for start_addr: c9000 for vq 2
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 65536 add: 1
Before vhost_verify_ring_mappings: start_addr: f0000 size: 65536
Checking vq: 0 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 2 ring_phys: ed000 ring_size: 5124
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 2146435072 add: 1
Before vhost_verify_ring_mappings: start_addr: 100000 size: 2146435072
Checking vq: 0 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 2 ring_phys: ed000 ring_size: 5124
So, not being a seabios expert, this is as far as I've gotten.. One change that does appear to avoid this behavior is calling vp_reset() right after virtio_scsi_scan_target() occurs. (See below.)
Obviously this prevents virtio-scsi root device operation, but it seems to be a hint that some left-over PCI config space is triggering the above in vhost_verify_ring_mappings().
WDYT..?
diff --git a/src/virtio-scsi.c b/src/virtio-scsi.c
index 4de1255..cafefff 100644
--- a/src/virtio-scsi.c
+++ b/src/virtio-scsi.c
@@ -153,7 +155,10 @@ init_virtio_scsi(struct pci_device *pci)
     int i, tot;
     for (tot = 0, i = 0; i < 256; i++)
         tot += virtio_scsi_scan_target(pci, ioaddr, vq, i);
+#if 1
+    /* vhost-scsi-pci needs a vp_reset here, otherwise bad things happen */
+    vp_reset(ioaddr);
+#endif
     if (!tot)
         goto fail;
I think it's the right thing to do, but maybe not the right place to do it; the reset needs to happen after all I/O is done, but before ring memory is write-protected.
-- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
----
virtio: add API to check that ring is setup
virtio-scsi makes it legal to set up only a subset of rings. The only way to detect that a ring is set up seems to be to check whether its PA has been written. Add an API to do this, and teach code to use it instead of checking the hardware queue size.
Signed-off-by: Michael S. Tsirkin mst@redhat.com
---
diff --git a/hw/virtio.c b/hw/virtio.c
index 26fbc79..ac12c01 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -651,6 +651,11 @@ int virtio_queue_get_num(VirtIODevice *vdev, int n)
     return vdev->vq[n].vring.num;
 }
 
+bool virtio_queue_valid(VirtIODevice *vdev, int n)
+{
+    return vdev->vq[n].vring.num && vdev->vq[n].vring.pa;
+}
+
 int virtio_queue_get_id(VirtQueue *vq)
 {
     VirtIODevice *vdev = vq->vdev;
diff --git a/hw/virtio.h b/hw/virtio.h
index ca43fd7..d3f23c2 100644
--- a/hw/virtio.h
+++ b/hw/virtio.h
@@ -227,6 +227,7 @@
 void virtio_config_writel(VirtIODevice *vdev, uint32_t addr, uint32_t data);
 void virtio_queue_set_addr(VirtIODevice *vdev, int n, hwaddr addr);
 hwaddr virtio_queue_get_addr(VirtIODevice *vdev, int n);
 int virtio_queue_get_num(VirtIODevice *vdev, int n);
+bool virtio_queue_valid(VirtIODevice *vdev, int n);
 void virtio_queue_notify(VirtIODevice *vdev, int n);
 uint16_t virtio_queue_vector(VirtIODevice *vdev, int n);
 void virtio_queue_set_vector(VirtIODevice *vdev, int n, uint16_t vector);
Il 28/03/2013 10:04, Michael S. Tsirkin ha scritto:
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124
>>>>>>>>>>>>>>>>>>. Calling l: 5124 for start_addr: c0000 for vq 2
Unable to map ring buffer for ring 2 l: 4096 ring_size: 5124
Okay, so the ring address is within ROM; unsurprisingly it fails. The BIOS should stop the device before write-protecting memory.
The above log is very early, when everything is RAM:
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 2146697216 add: 0
Before vhost_verify_ring_mappings: start_addr: c0000 size: 2146697216
The rings are not within ROM. ROM is at 0xc0000-0xcc000 according to the PAM registers.
The way I read the debug output, "Got ranges_overlap" actually means "bailing out because the ranges do not overlap". In particular, here all three virtqueues fail the test, because this is the ROM area 0xc0000..0xc7fff:
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 32768 add: 1
Before vhost_verify_ring_mappings: start_addr: c0000 size: 32768
Checking vq: 0 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 2 ring_phys: ed000 ring_size: 5124
Just below, vhost looks at the large RAM area starting at 0xc8000 (it's large because 0xf0000..0xfffff is still RAM):
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 2146664448 add: 1
Before vhost_verify_ring_mappings: start_addr: c8000 size: 2146664448
Checking vq: 0 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124
>>>>>>>>>>>>>>>>>>. Calling l: 5124 for start_addr: c8000 for vq 2
Here vq 0 and 1 fail the test because they are in low RAM, vq 2 passes.
After 0xf0000..0xfffff is marked readonly, vhost looks at the RAM between 0xc9000 and 0xf0000:
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 159744 add: 1
Before vhost_verify_ring_mappings: start_addr: c9000 size: 159744
Checking vq: 0 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124
>>>>>>>>>>>>>>>>>>. Calling l: 5124 for start_addr: c9000 for vq 2
and the ROM between 0xf0000 and 0xfffff, which no ring overlaps with:
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 65536 add: 1
Before vhost_verify_ring_mappings: start_addr: f0000 size: 65536
Checking vq: 0 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 2 ring_phys: ed000 ring_size: 5124
SeaBIOS is indeed not initializing vqs 0/1 (the control and event queues), so their ring_phys is 0. But the one that is failing is vq 2, the first request queue.
Your patch seems good, but shouldn't fix this problem.
Paolo
On Thu, 2013-03-28 at 11:03 +0100, Paolo Bonzini wrote:
Il 28/03/2013 10:04, Michael S. Tsirkin ha scritto:
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124
>>>>>>>>>>>>>>>>>>. Calling l: 5124 for start_addr: c0000 for vq 2
Unable to map ring buffer for ring 2 l: 4096 ring_size: 5124
Okay, so the ring address is within ROM; unsurprisingly it fails. The BIOS should stop the device before write-protecting memory.
The above log is very early, when everything is RAM:
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 2146697216 add: 0
Before vhost_verify_ring_mappings: start_addr: c0000 size: 2146697216
The rings are not within ROM. ROM is at 0xc0000-0xcc000 according to the PAM registers.
The way I read the debug output, "Got ranges_overlap" actually means "bailing out because the ranges do not overlap".
Yes, this is when !ranges_overlap() is hit in vhost_verify_ring_mappings(), so the offending cpu_physical_memory_map() is skipped..
In particular, here all three virtqueues fail the test, because this is the ROM area 0xc0000..0xc7fff:
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 32768 add: 1
Before vhost_verify_ring_mappings: start_addr: c0000 size: 32768
Checking vq: 0 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 2 ring_phys: ed000 ring_size: 5124
Just below, vhost looks at the large RAM area starting at 0xc8000 (it's large because 0xf0000..0xfffff is still RAM):
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 2146664448 add: 1
Before vhost_verify_ring_mappings: start_addr: c8000 size: 2146664448
Checking vq: 0 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124
>>>>>>>>>>>>>>>>>>. Calling l: 5124 for start_addr: c8000 for vq 2
Here vq 0 and 1 fail the test because they are in low RAM, vq 2 passes.
After 0xf0000..0xfffff is marked readonly,
Btw, the first vhost_set_memory() and failing vhost_verify_ring_mappings() calls do not occur until the pci_config_writeb(..., 0x31) marked below is executed in src/shadow.c:make_bios_readonly_intel():
static void
make_bios_readonly_intel(u16 bdf, u32 pam0)
{
    // Flush any pending writes before locking memory.
    wbinvd();

    // Write protect roms from 0xc0000-0xf0000
    u32 romend = rom_get_last(), romtop = rom_get_max();
    int i;
    for (i=0; i<6; i++) {
        u32 mem = BUILD_ROM_START + i * 32*1024;
        u32 pam = pam0 + 1 + i;
        if (romend <= mem + 16*1024 || romtop <= mem + 32*1024) {
            if (romend > mem && romtop > mem + 16*1024)
                pci_config_writeb(bdf, pam, 0x31);
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
            break;
        }
        pci_config_writeb(bdf, pam, 0x11);
    }

    // Write protect 0xf0000-0x100000
    pci_config_writeb(bdf, pam0, 0x10);
}
Up until this point, vhost_verify_ring_mappings() is not called by vhost_set_memory(), as vhost_dev_start() has not yet been invoked to set dev->started.
vhost looks at the RAM between 0xc9000 and 0xf0000:
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 159744 add: 1
Before vhost_verify_ring_mappings: start_addr: c9000 size: 159744
Checking vq: 0 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124
>>>>>>>>>>>>>>>>>>. Calling l: 5124 for start_addr: c9000 for vq 2
and the ROM between 0xf0000 and 0xfffff, which no ring overlaps with:
vhost_set_memory: section: 0x7fe2801f2aa0 section->size: 65536 add: 1
Before vhost_verify_ring_mappings: start_addr: f0000 size: 65536
Checking vq: 0 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124
>>>>>>>>>>>>>>>>>>. Got ranges_overlap for vq: 2 ring_phys: ed000 ring_size: 5124
SeaBIOS is indeed not initializing vqs 0/1 (the control and event queues), so their ring_phys is 0. But the one that is failing is vq 2, the first request queue.
Your patch seems good, but shouldn't fix this problem.
Paolo
I think it's the right thing to do, but maybe not the right place to do it; the reset needs to happen after all I/O is done, but before ring memory is write-protected.
Our emails are crossing each other unfortunately, but I want to reinforce this: ring memory is not write protected. Remember that SeaBIOS can even provide virtio-scsi access to DOS, so you must not reset the device. It must remain functional all the time, and the OS's own driver will reset it when it's started.
Paolo
On Thu, 2013-03-28 at 06:13 -0400, Paolo Bonzini wrote:
I think it's the right thing to do, but maybe not the right place to do this, need to reset after all IO is done, before ring memory is write protected.
Our emails are crossing each other unfortunately, but I want to reinforce this: ring memory is not write protected.
Understood. However, AFAICT the act of write-protecting these ranges for ROM generates the offending callbacks to vhost_set_memory().
The part that I'm missing is: if ring memory is not being write-protected by make_bios_readonly_intel(), why are the vhost_set_memory() calls being invoked at all?
Remember that SeaBIOS can even provide virtio-scsi access to DOS, so you must not reset the device. It must remain functional all the time, and the OS's own driver will reset it when it's started.
Mmmm, so a vp_reset() is out of the question then..
--nab
Il 29/03/2013 03:53, Nicholas A. Bellinger ha scritto:
On Thu, 2013-03-28 at 06:13 -0400, Paolo Bonzini wrote:
I think it's the right thing to do, but maybe not the right place to do this, need to reset after all IO is done, before ring memory is write protected.
Our emails are crossing each other unfortunately, but I want to reinforce this: ring memory is not write protected.
Understood. However, AFAICT the act of write protecting these ranges for ROM generates the offending callbacks to vhost_set_memory().
The part that I'm missing is if ring memory is not being write protected by make_bios_readonly_intel(), why are the vhost_set_memory() calls being invoked..?
Because mappings change for the region that contains the ring. vhost doesn't know yet that the changes do not affect ring memory, vhost_set_memory() is called exactly to ascertain that.
Paolo
On Fri, 2013-03-29 at 09:14 +0100, Paolo Bonzini wrote:
Il 29/03/2013 03:53, Nicholas A. Bellinger ha scritto:
<SNIP>
Because mappings change for the region that contains the ring. vhost doesn't know yet that the changes do not affect ring memory, vhost_set_memory() is called exactly to ascertain that.
Hi Paolo & Co,
Here's a bit more information on what is going on with the same cpu_physical_memory_map() failure in vhost_verify_ring_mappings()..
So as before, at the point that seabios is marking memory as readonly for ROM in src/shadow.c:make_bios_readonly_intel() with the following call:
Calling pci_config_writeb(0x31): bdf: 0x0000 pam: 0x0000005b
the memory API update hook triggers back into the vhost_region_del() code, and the following occurs:
Entering vhost_region_del section: 0x7fd30a213b60 offset_within_region: 0xc0000 size: 2146697216 readonly: 0
vhost_region_del: is_rom: 0, rom_device: 0
vhost_region_del: readable: 1
vhost_region_del: ram_addr 0x0, addr: 0x0 size: 2147483648
vhost_region_del: name: pc.ram
Entering vhost_set_memory, section: 0x7fd30a213b60 add: 0, dev->started: 1
Entering verify_ring_mappings: start_addr 0x00000000000c0000 size: 2146697216
verify_ring_mappings: ring_phys 0x0 ring_size: 0
verify_ring_mappings: ring_phys 0x0 ring_size: 0
verify_ring_mappings: ring_phys 0xed000 ring_size: 5124
verify_ring_mappings: calling cpu_physical_memory_map ring_phys: 0xed000 l: 5124
address_space_map: addr: 0xed000, plen: 5124
address_space_map: l: 4096, len: 5124
phys_page_find got PHYS_MAP_NODE_NIL >>>>>>>>>>>>>>>>>>>>>>..
address_space_map: section: 0x7fd30fabaed0 memory_region_is_ram: 0 readonly: 0
address_space_map: section: 0x7fd30fabaed0 offset_within_region: 0x0 section size: 18446744073709551615
Unable to map ring buffer for ring 2, l: 4096
So the interesting part is that phys_page_find() is unable to locate the corresponding page for vq->ring_phys: 0xed000 from the vhost_region_del() callback with section->offset_within_region: 0xc0000.
Is there any case where this would not be considered a bug?
register_multipage : d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
register_multipage : d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
register_multipage : d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
Entering vhost_region_add section: 0x7fd30a213aa0 offset_within_region: 0xc0000 size: 32768 readonly: 1
vhost_region_add: is_rom: 0, rom_device: 0
vhost_region_add: readable: 1
vhost_region_add: ram_addr 0x0000000000000000, addr: 0x 0 size: 2147483648
vhost_region_add: name: pc.ram
Entering vhost_set_memory, section: 0x7fd30a213aa0 add: 1, dev->started: 1
Entering verify_ring_mappings: start_addr 0x00000000000c0000 size: 32768
verify_ring_mappings: ring_phys 0x0 ring_size: 0
verify_ring_mappings: ring_phys 0x0 ring_size: 0
verify_ring_mappings: ring_phys 0xed000 ring_size: 5124
verify_ring_mappings: Got !ranges_overlap, skipping
register_multipage : d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
Entering vhost_region_add section: 0x7fd30a213aa0 offset_within_region: 0xc8000 size: 2146664448 readonly: 0
vhost_region_add: is_rom: 0, rom_device: 0
vhost_region_add: readable: 1
vhost_region_add: ram_addr 0x0000000000000000, addr: 0x 0 size: 2147483648
vhost_region_add: name: pc.ram
Entering vhost_set_memory, section: 0x7fd30a213aa0 add: 1, dev->started: 1
Entering verify_ring_mappings: start_addr 0x00000000000c8000 size: 2146664448
verify_ring_mappings: ring_phys 0x0 ring_size: 0
verify_ring_mappings: ring_phys 0x0 ring_size: 0
verify_ring_mappings: ring_phys 0xed000 ring_size: 5124
verify_ring_mappings: calling cpu_physical_memory_map ring_phys: 0xed000 l: 5124
address_space_map: addr: 0xed000, plen: 5124
address_space_map: l: 4096, len: 5124
address_space_map: section: 0x7fd30fabb020 memory_region_is_ram: 1 readonly: 0
address_space_map: section: 0x7fd30fabb020 offset_within_region: 0xc8000 section size: 2146664448
address_space_map: l: 4096, len: 1028
address_space_map: section: 0x7fd30fabb020 memory_region_is_ram: 1 readonly: 0
address_space_map: section: 0x7fd30fabb020 offset_within_region: 0xc8000 section size: 2146664448
address_space_map: Calling qemu_ram_ptr_length: raddr: 0x ed000 rlen: 5124
address_space_map: After qemu_ram_ptr_length: raddr: 0x ed000 rlen: 5124
So here the vhost_region_add() callback for section->offset_within_region: 0xc8000 for vq->ring_phys: 0xed000 is able to locate *section via phys_page_find() within address_space_map(), and cpu_physical_memory_map() completes as expected.
register_multipage : d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
register_multipage : d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
register_multipage : d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
register_multipage : d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
register_multipage : d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
register_multipage : d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
register_multipage : d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
phys_page_find got PHYS_MAP_NODE_NIL >>>>>>>>>>>>>>>>>>>>>>..
So while plodding my way through the memory API, the thing that would be useful to know is whether the offending *section that is missing for the first phys_page_find() call is getting removed before the callback makes its way into the vhost_verify_ring_mappings() code, or whether some other bug is occurring.
Any idea on how this could be verified?
Thanks,
--nab
On Mon, Apr 01, 2013 at 06:05:47PM -0700, Nicholas A. Bellinger wrote:
<SNIP>
Is it possible that what is going on here is that we had a region at address 0x0 of size 0x80000000, and now a chunk of it is being made read-only, and to this end the whole old region is removed and then new ones are added?
If so, maybe the problem is that we don't use the atomic begin/commit ops in the memory API. Maybe the following will help? Completely untested, posted just to give you the idea:
Signed-off-by: Michael S. Tsirkin mst@redhat.com
---
diff --git a/hw/vhost.c b/hw/vhost.c
index 4d6aee3..716cfaa 100644
--- a/hw/vhost.c
+++ b/hw/vhost.c
@@ -416,6 +416,25 @@ static void vhost_set_memory(MemoryListener *listener,
         /* Remove old mapping for this memory, if any. */
         vhost_dev_unassign_memory(dev, start_addr, size);
     }
+    dev->memory_changed = true;
+}
+
+static bool vhost_section(MemoryRegionSection *section)
+{
+    return memory_region_is_ram(section->mr);
+}
+
+static void vhost_begin(MemoryListener *listener)
+{
+}
+
+static void vhost_commit(MemoryListener *listener)
+{
+    struct vhost_dev *dev = container_of(listener, struct vhost_dev,
+                                         memory_listener);
+    if (!dev->memory_changed) {
+        return;
+    }
 
     if (!dev->started) {
         return;
@@ -445,19 +464,7 @@ static void vhost_set_memory(MemoryListener *listener,
     if (dev->log_size > log_size + VHOST_LOG_BUFFER) {
         vhost_dev_log_resize(dev, log_size);
     }
-}
-
-static bool vhost_section(MemoryRegionSection *section)
-{
-    return memory_region_is_ram(section->mr);
-}
-
-static void vhost_begin(MemoryListener *listener)
-{
-}
-
-static void vhost_commit(MemoryListener *listener)
-{
+    dev->memory_changed = false;
 }
 
 static void vhost_region_add(MemoryListener *listener,
@@ -842,6 +849,7 @@ int vhost_dev_init(struct vhost_dev *hdev, int devfd, const char *devpath,
     hdev->log_size = 0;
    hdev->log_enabled = false;
    hdev->started = false;
+    hdev->memory_changed = false;
    memory_listener_register(&hdev->memory_listener, &address_space_memory);
    hdev->force = force;
    return 0;
On Tue, 2013-04-02 at 16:27 +0300, Michael S. Tsirkin wrote:
On Mon, Apr 01, 2013 at 06:05:47PM -0700, Nicholas A. Bellinger wrote:
On Fri, 2013-03-29 at 09:14 +0100, Paolo Bonzini wrote:
Il 29/03/2013 03:53, Nicholas A. Bellinger ha scritto:
On Thu, 2013-03-28 at 06:13 -0400, Paolo Bonzini wrote:
I think it's the right thing to do, but maybe not the right place to do this, need to reset after all IO is done, before ring memory is write protected.
Our emails are crossing each other unfortunately, but I want to reinforce this: ring memory is not write protected.
Understood. However, AFAICT the act of write protecting these ranges for ROM generates the offending callbacks to vhost_set_memory().
The part that I'm missing is if ring memory is not being write protected by make_bios_readonly_intel(), why are the vhost_set_memory() calls being invoked..?
Because mappings change for the region that contains the ring. vhost doesn't know yet that the changes do not affect ring memory, vhost_set_memory() is called exactly to ascertain that.
Hi Paolo & Co,
Here's a bit more information on what is going on with the same cpu_physical_memory_map() failure in vhost_verify_ring_mappings()..
So as before, at the point that seabios is marking memory as readonly for ROM in src/shadow.c:make_bios_readonly_intel() with the following call:
Calling pci_config_writeb(0x31): bdf: 0x0000 pam: 0x0000005b
the memory API update hook triggers back into vhost_region_del() code, and following occurs:
Entering vhost_region_del section: 0x7fd30a213b60 offset_within_region: 0xc0000 size: 2146697216 readonly: 0 vhost_region_del: is_rom: 0, rom_device: 0 vhost_region_del: readable: 1 vhost_region_del: ram_addr 0x0, addr: 0x0 size: 2147483648 vhost_region_del: name: pc.ram Entering vhost_set_memory, section: 0x7fd30a213b60 add: 0, dev->started: 1 Entering verify_ring_mappings: start_addr 0x00000000000c0000 size: 2146697216 verify_ring_mappings: ring_phys 0x0 ring_size: 0 verify_ring_mappings: ring_phys 0x0 ring_size: 0 verify_ring_mappings: ring_phys 0xed000 ring_size: 5124 verify_ring_mappings: calling cpu_physical_memory_map ring_phys: 0xed000 l: 5124 address_space_map: addr: 0xed000, plen: 5124 address_space_map: l: 4096, len: 5124 phys_page_find got PHYS_MAP_NODE_NIL >>>>>>>>>>>>>>>>>>>>>>.. address_space_map: section: 0x7fd30fabaed0 memory_region_is_ram: 0 readonly: 0 address_space_map: section: 0x7fd30fabaed0 offset_within_region: 0x0 section size: 18446744073709551615 Unable to map ring buffer for ring 2, l: 4096
So the interesting part is that phys_page_find() is not able to locate the corresponding page for vq->ring_phys: 0xed000 from the vhost_region_del() callback with section->offset_within_region: 0xc0000..
Is there any case where this would not be considered a bug..?
register_multipage: d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
register_multipage: d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
register_multipage: d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
Entering vhost_region_add section: 0x7fd30a213aa0 offset_within_region: 0xc0000 size: 32768 readonly: 1
vhost_region_add: is_rom: 0, rom_device: 0
vhost_region_add: readable: 1
vhost_region_add: ram_addr 0x0000000000000000, addr: 0x0 size: 2147483648
vhost_region_add: name: pc.ram
Entering vhost_set_memory, section: 0x7fd30a213aa0 add: 1, dev->started: 1
Entering verify_ring_mappings: start_addr 0x00000000000c0000 size: 32768
verify_ring_mappings: ring_phys 0x0 ring_size: 0
verify_ring_mappings: ring_phys 0x0 ring_size: 0
verify_ring_mappings: ring_phys 0xed000 ring_size: 5124
verify_ring_mappings: Got !ranges_overlap, skipping
register_multipage: d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0
Entering vhost_region_add section: 0x7fd30a213aa0 offset_within_region: 0xc8000 size: 2146664448 readonly: 0
vhost_region_add: is_rom: 0, rom_device: 0
vhost_region_add: readable: 1
vhost_region_add: ram_addr 0x0000000000000000, addr: 0x0 size: 2147483648
vhost_region_add: name: pc.ram
Entering vhost_set_memory, section: 0x7fd30a213aa0 add: 1, dev->started: 1
Entering verify_ring_mappings: start_addr 0x00000000000c8000 size: 2146664448
verify_ring_mappings: ring_phys 0x0 ring_size: 0
verify_ring_mappings: ring_phys 0x0 ring_size: 0
verify_ring_mappings: ring_phys 0xed000 ring_size: 5124
verify_ring_mappings: calling cpu_physical_memory_map ring_phys: 0xed000 l: 5124
address_space_map: addr: 0xed000, plen: 5124
address_space_map: l: 4096, len: 5124
address_space_map: section: 0x7fd30fabb020 memory_region_is_ram: 1 readonly: 0
address_space_map: section: 0x7fd30fabb020 offset_within_region: 0xc8000 section size: 2146664448
address_space_map: l: 4096, len: 1028
address_space_map: section: 0x7fd30fabb020 memory_region_is_ram: 1 readonly: 0
address_space_map: section: 0x7fd30fabb020 offset_within_region: 0xc8000 section size: 2146664448
address_space_map: Calling qemu_ram_ptr_length: raddr: 0xed000 rlen: 5124
address_space_map: After qemu_ram_ptr_length: raddr: 0xed000 rlen: 5124
So here the vhost_region_add() callback for section->offset_within_region: 0xc8000 for vq->ring_phys: 0xed000 is able to locate *section via phys_page_find() within address_space_map(), and cpu_physical_memory_map() completes as expected..
register_multipage: d: 0x7fd30f7d0ed0 section: 0x7fd30a2139b0   (line repeated 7 times)
phys_page_find got PHYS_MAP_NODE_NIL >>>>>>>>>>>>>>>>>>>>>>..
So while plodding my way through the memory API, the thing that would be useful to know is whether the offending *section that is missing for the first phys_page_find() call is getting removed before the callback makes its way into the vhost_verify_ring_mappings() code, or whether some other bug is occurring..?
Any idea on how this could be verified..?
Thanks,
--nab
Is it possible that what is going on here, is that we had a region at address 0x0 size 0x80000000, and now a chunk from it is being made readonly, and to this end the whole old region is removed then new ones are added?
Yes, I believe this is exactly what is happening..
If yes maybe the problem is that we don't use the atomic begin/commit ops in the memory API. Maybe the following will help? Completely untested, posting just to give you the idea:
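For reference, the batching idea behind the begin/commit listener ops can be modeled outside of QEMU: the region callbacks only mark state dirty, and the (potentially failing) verification runs once per transaction, after the memory map is consistent again. This is a toy sketch; all names (toy_listener, verify_count, etc.) are invented here, not QEMU API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of MemoryListener begin/commit batching: region_add and
 * region_del only record that something changed; the verification
 * work happens once, in commit, instead of per-callback. */
struct toy_listener {
    bool memory_changed;
    int verify_count;    /* stands in for vhost_verify_ring_mappings() runs */
};

static void toy_begin(struct toy_listener *l)      { l->memory_changed = false; }
static void toy_region_add(struct toy_listener *l) { l->memory_changed = true; }
static void toy_region_del(struct toy_listener *l) { l->memory_changed = true; }

static void toy_commit(struct toy_listener *l)
{
    if (!l->memory_changed) {
        return;                 /* nothing changed in this transaction */
    }
    l->verify_count++;          /* verify rings exactly once per transaction */
    l->memory_changed = false;
}
```

With the del + add + add sequence from the trace above, the verification would run once at commit rather than three times mid-update.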
Mmmm, one question on how vhost_region_del() + vhost_region_add() + vhost_commit() should work..
Considering the following when the same seabios code snippet:
pci_config_writeb(0x31): bdf: 0x0000 pam: 0x0000005b
is executed to mark the pc.ram area at 0xc0000 as read-only:
Entering vhost_begin >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Entering vhost_region_del section: 0x7fd037a4bb60 offset_within_region: 0xc0000 size: 2146697216 readonly: 0
vhost_region_del: is_rom: 0, rom_device: 0
vhost_region_del: readable: 1
vhost_region_del: ram_addr 0x0, addr: 0x0 size: 2147483648
vhost_region_del: name: pc.ram
Entering vhost_set_memory, section: 0x7fd037a4bb60 add: 0, dev->started: 1
vhost_set_memory: Setting dev->memory_changed = true for start_addr: 0xc0000
Entering vhost_region_add section: 0x7fd037a4baa0 offset_within_region: 0xc0000 size: 32768 readonly: 1
vhost_region_add is readonly !!!!!!!!!!!!!!!!!!!
vhost_region_add: is_rom: 0, rom_device: 0
vhost_region_add: readable: 1
vhost_region_add: ram_addr 0x0000000000000000, addr: 0x0 size: 2147483648
vhost_region_add: name: pc.ram
Entering vhost_set_memory, section: 0x7fd037a4baa0 add: 1, dev->started: 1
vhost_dev_assign_memory(); >>>>>>>>>>>>>>>>>>>>>>>>>>>> reg->guest_phys_addr: 0xc0000
vhost_set_memory: Setting dev->memory_changed = true for start_addr: 0xc0000
Entering vhost_region_add section: 0x7fd037a4baa0 offset_within_region: 0xc8000 size: 2146664448 readonly: 0
vhost_region_add: is_rom: 0, rom_device: 0
vhost_region_add: readable: 1
vhost_region_add: ram_addr 0x0000000000000000, addr: 0x0 size: 2147483648
vhost_region_add: name: pc.ram
Entering vhost_set_memory, section: 0x7fd037a4baa0 add: 1, dev->started: 1
vhost_set_memory: Setting dev->memory_changed = true for start_addr: 0xc8000
phys_page_find got PHYS_MAP_NODE_NIL >>>>>>>>>>>>>>>>>>>>>>..
Entering vhost_commit >>>>>>>>>>>>>>>>>>>>>>>>>>>
Note that originally we'd see the cpu_physical_memory_map() failure in vhost_verify_ring_mappings() after the first ->region_del() above.
Adding a hardcoded cpu_physical_memory_map() testcase in vhost_commit() for phys_addr=0xed000, len=5124 (the vq ring) does locate the correct *section from address_space_map(), which correctly points to the section generated by the last vhost_region_add() above:
Entering vhost_commit >>>>>>>>>>>>>>>>>>>>>>>>>>>
address_space_map: addr: 0xed000, plen: 5124
address_space_map: l: 4096, len: 5124
address_space_map: section: 0x7f41b325f020 memory_region_is_ram: 1 readonly: 0
address_space_map: section: 0x7f41b325f020 offset_within_region: 0xc8000 section size: 2146664448
address_space_map: l: 4096, len: 1028
address_space_map: section: 0x7f41b325f020 memory_region_is_ram: 1 readonly: 0
address_space_map: section: 0x7f41b325f020 offset_within_region: 0xc8000 section size: 2146664448
address_space_map: Calling qemu_ram_ptr_length: raddr: 0xed000 rlen: 5124
address_space_map: After qemu_ram_ptr_length: raddr: 0xed000 rlen: 5124
cpu_physical_memory_map(0xed000) got l: 5124
So, does using a ->commit callback for MemoryListener mean that vhost_verify_ring_mappings() is OK to be called only from the final ->commit callback, and not from each ->region_del + ->region_add callback..? Eg: I seem to recall something about vhost_verify_ring_mappings() being called during each ->region_del() when dev->started == true was important, no..?
If this is OK, then it seems to be a matter of keeping an updated bit for each of the regions in vhost_dev->mem_sections[] and performing vhost_verify_ring_mappings() on all three above during the final ->commit() call, right..?
WDYT..?
--nab
Signed-off-by: Michael S. Tsirkin mst@redhat.com
diff --git a/hw/vhost.c b/hw/vhost.c
index 4d6aee3..716cfaa 100644
--- a/hw/vhost.c
+++ b/hw/vhost.c
@@ -416,6 +416,25 @@ static void vhost_set_memory(MemoryListener *listener,
         /* Remove old mapping for this memory, if any. */
         vhost_dev_unassign_memory(dev, start_addr, size);
     }
+    dev->memory_changed = true;
+}
+
+static bool vhost_section(MemoryRegionSection *section)
+{
+    return memory_region_is_ram(section->mr);
+}
+
+static void vhost_begin(MemoryListener *listener)
+{
+}
+
+static void vhost_commit(MemoryListener *listener)
+{
+    struct vhost_dev *dev = container_of(listener, struct vhost_dev,
+                                         memory_listener);
+    if (!dev->memory_changed) {
+        return;
+    }
     if (!dev->started) {
         return;
@@ -445,19 +464,7 @@ static void vhost_set_memory(MemoryListener *listener,
     if (dev->log_size > log_size + VHOST_LOG_BUFFER) {
         vhost_dev_log_resize(dev, log_size);
     }
-}
-
-static bool vhost_section(MemoryRegionSection *section)
-{
-    return memory_region_is_ram(section->mr);
-}
-
-static void vhost_begin(MemoryListener *listener)
-{
-}
-
-static void vhost_commit(MemoryListener *listener)
-{
+    dev->memory_changed = false;
 }
 
 static void vhost_region_add(MemoryListener *listener,
@@ -842,6 +849,7 @@ int vhost_dev_init(struct vhost_dev *hdev, int devfd, const char *devpath,
     hdev->log_size = 0;
    hdev->log_enabled = false;
     hdev->started = false;
+    hdev->memory_changed = false;
     memory_listener_register(&hdev->memory_listener, &address_space_memory);
     hdev->force = force;
     return 0;
On Tue, 2013-04-02 at 21:04 -0700, Nicholas A. Bellinger wrote:
On Tue, 2013-04-02 at 16:27 +0300, Michael S. Tsirkin wrote:
On Mon, Apr 01, 2013 at 06:05:47PM -0700, Nicholas A. Bellinger wrote:
On Fri, 2013-03-29 at 09:14 +0100, Paolo Bonzini wrote:
Il 29/03/2013 03:53, Nicholas A. Bellinger ha scritto:
On Thu, 2013-03-28 at 06:13 -0400, Paolo Bonzini wrote:
> I think it's the right thing to do, but maybe not the right place > to do this, need to reset after all IO is done, before > ring memory is write protected.
Our emails are crossing each other unfortunately, but I want to reinforce this: ring memory is not write protected.
Understood. However, AFAICT the act of write protecting these ranges for ROM generates the offending callbacks to vhost_set_memory().
The part that I'm missing is if ring memory is not being write protected by make_bios_readonly_intel(), why are the vhost_set_memory() calls being invoked..?
Because mappings change for the region that contains the ring. vhost doesn't know yet that the changes do not affect ring memory, vhost_set_memory() is called exactly to ascertain that.
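The check vhost has to perform here is plain interval arithmetic: does the changed section intersect a vring's guest-physical range? A self-contained sketch in the style of QEMU's ranges_overlap() helper (this standalone version is written from memory, not copied from the tree), using the addresses from the logs above:

```c
#include <assert.h>
#include <stdint.h>

/* Do [first1, first1+len1) and [first2, first2+len2) intersect?
 * Sketch of an inclusive-bounds overlap test; assumes len > 0 and
 * no wraparound, which holds for the guest-physical ranges here. */
static int ranges_overlap(uint64_t first1, uint64_t len1,
                          uint64_t first2, uint64_t len2)
{
    uint64_t last1 = first1 + len1 - 1;
    uint64_t last2 = first2 + len2 - 1;

    return !(last2 < first1 || last1 < first2);
}
```

With the trace values: the vq ring at 0xed000 (5124 bytes) misses the 32 KB read-only chunk at 0xc0000, but falls inside the remaining RAM section starting at 0xc8000, which is why only the second section triggers the ring re-verification.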
<SNIP>
So, does using a ->commit callback for MemoryListener mean that vhost_verify_ring_mappings() is OK to be called only from the final ->commit callback, and not from each ->region_del + ->region_add callback..? Eg: I seem to recall something about vhost_verify_ring_mappings() being called during each ->region_del() when dev->started == true was important, no..?
If this is OK, then it seems to be a matter of keeping an updated bit for each of the regions in vhost_dev->mem_sections[] and performing vhost_verify_ring_mappings() on all three above during the final ->commit() call, right..?
Or even better, what about just invoking vhost_verify_ring_mappings() once from ->commit, without a section start_addr+size, and dropping the existing !ranges_overlap check..?
Does the following look like you'd expect for marking 0xc0000 area as read-only case..?
Entering vhost_begin >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Entering vhost_region_del section: 0x7f16d3a1db60 offset_within_region: 0xc0000 size: 2146697216 readonly: 0
vhost_region_del: is_rom: 0, rom_device: 0
vhost_region_del: readable: 1
vhost_region_del: ram_addr 0x0, addr: 0x0 size: 2147483648
vhost_region_del: name: pc.ram
Entering vhost_set_memory, section: 0x7f16d3a1db60 add: 0, dev->started: 1
vhost_set_memory: Setting dev->memory_changed = true for start_addr: 0xc0000
Entering vhost_region_add section: 0x7f16d3a1daa0 offset_within_region: 0xc0000 size: 32768 readonly: 1
vhost_region_add is readonly !!!!!!!!!!!!!!!!!!!
vhost_region_add: is_rom: 0, rom_device: 0
vhost_region_add: readable: 1
vhost_region_add: ram_addr 0x0000000000000000, addr: 0x0 size: 2147483648
vhost_region_add: name: pc.ram
Entering vhost_set_memory, section: 0x7f16d3a1daa0 add: 1, dev->started: 1
vhost_dev_assign_memory(); >>>>>>>>>>>>>>>>>>>>>>>>>>>> reg->guest_phys_addr: 0xc0000
vhost_set_memory: Setting dev->memory_changed = true for start_addr: 0xc0000
Entering vhost_region_add section: 0x7f16d3a1daa0 offset_within_region: 0xc8000 size: 2146664448 readonly: 0
vhost_region_add: is_rom: 0, rom_device: 0
vhost_region_add: readable: 1
vhost_region_add: ram_addr 0x0000000000000000, addr: 0x0 size: 2147483648
vhost_region_add: name: pc.ram
Entering vhost_set_memory, section: 0x7f16d3a1daa0 add: 1, dev->started: 1
vhost_set_memory: Setting dev->memory_changed = true for start_addr: 0xc8000
phys_page_find got PHYS_MAP_NODE_NIL >>>>>>>>>>>>>>>>>>>>>>..
Entering vhost_commit >>>>>>>>>>>>>>>>>>>>>>>>>>>
vhost_commit, skipping vhost_verify_ring_mappings without start_addr !!!!!!!!!!!!!
Entering verify_ring_mappings: start_addr 0x0000000000000000 size: 0
verify_ring_mappings: ring_phys 0x0 ring_size: 0
verify_ring_mappings: ring_phys 0x0 ring_size: 0
verify_ring_mappings: ring_phys 0xed000 ring_size: 5124
verify_ring_mappings: calling cpu_physical_memory_map ring_phys: 0xed000 l: 5124
address_space_map: addr: 0xed000, plen: 5124
address_space_map: l: 4096, len: 5124
address_space_map: section: 0x7f16d92c6020 memory_region_is_ram: 1 readonly: 0
address_space_map: section: 0x7f16d92c6020 offset_within_region: 0xc8000 section size: 2146664448
address_space_map: l: 4096, len: 1028
address_space_map: section: 0x7f16d92c6020 memory_region_is_ram: 1 readonly: 0
address_space_map: section: 0x7f16d92c6020 offset_within_region: 0xc8000 section size: 2146664448
address_space_map: Calling qemu_ram_ptr_length: raddr: 0xed000 rlen: 5124
address_space_map: After qemu_ram_ptr_length: raddr: 0xed000 rlen: 5124
So, does using a ->commit callback for MemoryListener mean that vhost_verify_ring_mappings() is OK to be called only from the final ->commit callback, and not from each ->region_del + ->region_add callback..? Eg: I seem to recall something about vhost_verify_ring_mappings() being called during each ->region_del() when dev->started == true was important, no..?
It is important in the case where there are only deleted regions and no added regions, or in general whenever the last callback is a ->region_del().
But it is even better to just call vhost_verify_ring_mappings() once, from the ->region_commit() callback.
If this is OK, then it seems to be a matter of keeping an updated bit for each of the regions in vhost_dev->mem_sections[] and performing vhost_verify_ring_mappings() on all three above during the final ->commit() call, right..?
Or even better, what about just invoking vhost_verify_ring_mappings() once from ->commit, without a section start_addr+size, and dropping the existing !ranges_overlap check..?
Either drop the ranges_overlap check, or keep the start_addr/end_addr up-to-date. Compared to Michael's prototype patch, that would be like this:
vhost_begin()
{
    dev->mem_changed_end_addr = 0;
    dev->mem_changed_start_addr = -1;
}

vhost_set_memory()
{
    ...
    dev->mem_changed_start_addr = MIN(dev->mem_changed_start_addr, start_addr);
    dev->mem_changed_end_addr = MAX(dev->mem_changed_end_addr, start_addr + size - 1);
}

vhost_commit()
{
    if (dev->mem_changed_start_addr > dev->mem_changed_end_addr) {
        return;
    }
    start_addr = dev->mem_changed_start_addr;
    size = dev->mem_changed_end_addr - dev->mem_changed_start_addr + 1;
    ...
}
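The sketch above is interval accumulation: each vhost_set_memory() call widens a [start, end] window, and commit verifies that one union range. A compilable toy version of the same bookkeeping (struct and function names invented here) shows the arithmetic, using the section sizes from the traces in this thread:

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Union of all guest-physical ranges touched between begin and commit. */
struct changed_range {
    uint64_t start;   /* lowest start_addr seen */
    uint64_t end;     /* highest end address seen (inclusive) */
};

static void range_begin(struct changed_range *r)
{
    r->end = 0;
    r->start = UINT64_MAX;   /* the "-1" in the sketch above */
}

static void range_record(struct changed_range *r, uint64_t start, uint64_t size)
{
    r->start = MIN(r->start, start);
    r->end = MAX(r->end, start + size - 1);
}

/* Returns 0 when nothing changed; otherwise fills the range to verify. */
static int range_commit(struct changed_range *r, uint64_t *start, uint64_t *size)
{
    if (r->start > r->end) {
        return 0;
    }
    *start = r->start;
    *size = r->end - r->start + 1;
    return 1;
}
```

Feeding in the two added sections from the PAM trace (0xc0000/32768 and 0xc8000/2146664448) yields a single verification range starting at 0xc0000 covering both.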
Paolo
On Thu, 2013-03-28 at 11:04 +0200, Michael S. Tsirkin wrote:
On Thu, Mar 28, 2013 at 12:35:42AM -0700, Nicholas A. Bellinger wrote:
On Wed, 2013-03-27 at 23:45 -0700, Nicholas A. Bellinger wrote:
On Wed, 2013-03-27 at 15:33 -0700, Nicholas A. Bellinger wrote:
On Wed, 2013-03-27 at 23:56 +0200, Michael S. Tsirkin wrote:
On Wed, Mar 27, 2013 at 02:31:27PM -0700, Nicholas A. Bellinger wrote:
<SNIP>
locking shadow ram
romend: 0x000cb800 romtop: 0x000ec000
mem: 0x000c0000, pam: 0x0000005a
Calling pci_config_writeb(0x11): bdf: 0x0000 pam: 0x0000005a
<No QEMU output after pci_config_writeb(0x11) in make_bios_readonly..>
Calling pci_config_writeb(0x31): bdf: 0x0000 pam: 0x0000005b
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
mem: 0x000c8000, pam: 0x0000005b
romend: 0x000cb800 mem + 16*1024: 0x000cc000
romtop: 0x000ec000 mem + 32*1024: 0x000d0000
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
romend: 0x000cb800, mem: 0x000c8000, romtop: 0x000ec000, mem + 16*1024: 0x000cc000
Calling pci_config_writeb(0x31): bdf: 0x0000 pam: 0x0000005b
<QEMU output after pci_config_writeb(0x31) in make_bios_readonly..>
vhost_set_memory: section: 0x7fe2801f2b60 section->size: 2146697216 add: 0
Before vhost_verify_ring_mappings: start_addr: c0000 size: 2146697216
Checking vq: 0 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
This is also a bug. -net always initializes VQs 0..N so this is what vhost assumed. Please teach vhost that it should skip uninitialized VQs. There are more places to fix. Basically look for if (!virtio_queue_get_num(vdev, queue_no)), all of them need to be updated to skip uninitialized vqs. Probably switch to a new API checking PA too. See patch below.
<nod>
Got ranges_overlap for vq: 0 ring_phys: 0 ring_size: 1028
Checking vq: 1 ring_phys: 0 ring_size: 1028 >>>>>>>>>>>>>>>>>>.
Got ranges_overlap for vq: 1 ring_phys: 0 ring_size: 1028
Checking vq: 2 ring_phys: ed000 ring_size: 5124 >>>>>>>>>>>>>>>>>>.
Calling l: 5124 for start_addr: c0000 for vq 2
Unable to map ring buffer for ring 2 l: 4096 ring_size: 5124
Okay, so the ring address is within ROM; unsurprisingly the mapping fails. The BIOS should stop the device before write-protecting that memory.
<SNIP>
virtio: add API to check that ring is setup
virtio-scsi makes it legal to set up only a subset of the rings. The only way to detect that a ring is set up seems to be to check whether its PA was written. Add an API to do this, and teach code to use it instead of checking the hardware queue size.
Signed-off-by: Michael S. Tsirkin mst@redhat.com
---
diff --git a/hw/virtio.c b/hw/virtio.c
index 26fbc79..ac12c01 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -651,6 +651,11 @@ int virtio_queue_get_num(VirtIODevice *vdev, int n)
     return vdev->vq[n].vring.num;
 }
 
+bool virtio_queue_valid(VirtIODevice *vdev, int n)
+{
+    return vdev->vq[n].vring.num && vdev->vq[n].vring.pa;
+}
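As a standalone model of the check this patch adds: a queue counts as set up only if it has both a nonzero ring size and a guest-written ring address. The struct and field names below are invented for illustration (the patch uses vring.pa; whether that field or vring.desc is the right one is exactly what's questioned next):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a virtqueue's ring state; not the QEMU definitions. */
struct toy_vring {
    unsigned int num;   /* queue size reported by the device */
    uint64_t pa;        /* guest-physical ring address, 0 if never written */
};

/* A queue is usable only once the guest has programmed a ring address;
 * a nonzero hardware queue size alone is not enough (see the vq 0/1
 * entries in the log above: ring_size 1028 but ring_phys 0). */
static bool toy_queue_valid(const struct toy_vring *vq)
{
    return vq->num && vq->pa;
}
```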
I assume you mean vring.desc here, right..?
Sending out these as a separate patch series shortly.
--nab