These patches add support for booting from virtio-blk (PCI) devices as used by QEMU.
Patches 1 and 2 add bootindex support, which is the new way of passing boot order information from QEMU to OpenBIOS. Note that corresponding QEMU patches are also required; without them the generated device paths will be incorrect and virtio boot will fail.
Patch 3 adds the legacy virtio-blk driver (it follows the 0.9 rather than the 1.0 specification).
Patch 4 enables the new driver for both PPC and SPARC64 architectures so that virtio-blk devices will be usable out-of-the-box with QEMU.
Once these patches have been applied to OpenBIOS (along with the corresponding fw path fixes for QEMU) it is possible to boot from virtio-blk devices like this:
PPC:
./qemu-system-ppc -drive file=debian-9.0-powerpc-NETINST.iso,if=none,index=0,id=cd,media=cdrom \
    -device virtio-blk-pci,drive=cd,bootindex=0 -m 256 -boot d
SPARC:
./qemu-system-sparc64 -drive file=debian-9.0-sparc64-NETINST.iso,if=none,index=0,id=cd,media=cdrom \
    -device virtio-blk-pci,bus=pciB,drive=cd,bootindex=0 -m 256 -boot d -nographic
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

Mark Cave-Ayland (4):
  ppc: add bootindex support
  SPARC64: add bootindex support
  drivers: add legacy virtio-blk driver
  config: enable virtio-blk driver for default PPC and SPARC64 builds

 arch/ppc/qemu/init.c               |  55 ++++-
 arch/sparc64/boot.c                |   2 +-
 arch/sparc64/boot.h                |   2 +-
 arch/sparc64/openbios.c            |  38 +++-
 config/examples/ppc_config.xml     |   1 +
 config/examples/sparc64_config.xml |   1 +
 drivers/build.xml                  |   1 +
 drivers/pci.c                      |  18 ++
 drivers/pci_database.c             |   4 +-
 drivers/pci_database.h             |   1 +
 drivers/virtio.c                   | 424 +++++++++++++++++++++++++++++++++++++
 drivers/virtio.h                   | 331 +++++++++++++++++++++++++++++
 include/drivers/drivers.h          |   4 +
 13 files changed, 862 insertions(+), 20 deletions(-)
 create mode 100644 drivers/virtio.c
 create mode 100644 drivers/virtio.h
This provides an alternative mechanism for supporting boot device order information from QEMU compared with the legacy FW_CFG_BOOT_DEVICE functionality specified via -boot.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 arch/ppc/qemu/init.c | 55 ++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 45 insertions(+), 10 deletions(-)

diff --git a/arch/ppc/qemu/init.c b/arch/ppc/qemu/init.c
index af15682..23cc88e 100644
--- a/arch/ppc/qemu/init.c
+++ b/arch/ppc/qemu/init.c
@@ -839,10 +839,11 @@ arch_of_init(void)
 #endif
     uint64_t ram_size;
     const struct cpudef *cpu;
-    char buf[64], qemu_uuid[16];
+    char buf[256], qemu_uuid[16];
     const char *stdin_path, *stdout_path, *boot_path;
     uint32_t temp = 0;
-    char *boot_device;
+    char *boot_device, *bootorder_file;
+    uint32_t bootorder_sz, sz;
     ofmem_t *ofmem = ofmem_arch_get_private();
 
     openbios_init();
@@ -1053,11 +1054,17 @@ arch_of_init(void)
     push_str("/options");
     fword("find-device");
 
-    /* Setup default boot devices (not overriding user settings) */
-    fword("boot-device");
-    boot_device = pop_fstr_copy();
-    if (boot_device && strcmp(boot_device, "disk") == 0) {
-        switch (fw_cfg_read_i16(FW_CFG_BOOT_DEVICE)) {
+    /* Boot order */
+    bootorder_file = fw_cfg_read_file("bootorder", &bootorder_sz);
+
+    if (bootorder_file == NULL) {
+        /* No bootorder present, use fw_cfg device if no custom
+           boot-device specified */
+        fword("boot-device");
+        boot_device = pop_fstr_copy();
+
+        if (boot_device && strcmp(boot_device, "disk") == 0) {
+            switch (fw_cfg_read_i16(FW_CFG_BOOT_DEVICE)) {
             case 'c':
                 boot_path = "hd";
                 break;
@@ -1065,15 +1072,43 @@ arch_of_init(void)
             case 'd':
                 boot_path = "cd";
                 break;
+            }
+
+            snprintf(buf, sizeof(buf),
+                     "%s:,\\\\:tbxi "
+                     "%s:,\\ppc\\bootinfo.txt "
+                     "%s:,%%BOOT",
+                     boot_path, boot_path, boot_path);
+
+            push_str(buf);
+            fword("encode-string");
+            push_str("boot-device");
+            fword("property");
         }
 
-        snprintf(buf, sizeof(buf), "%s:,\\\\:tbxi %s:,\\ppc\\bootinfo.txt %s:,%%BOOT", boot_path, boot_path, boot_path);
-        push_str(buf);
+        free(boot_device);
+    } else {
+        sz = bootorder_sz * (3 * 2);
+        boot_device = malloc(sz);
+        memset(boot_device, 0, sz);
+
+        while ((boot_path = strsep(&bootorder_file, "\n")) != NULL) {
+            snprintf(buf, sizeof(buf),
+                     "%s:,\\\\:tbxi "
+                     "%s:,\\ppc\\bootinfo.txt "
+                     "%s:,%%BOOT ",
+                     boot_path, boot_path, boot_path);
+
+            strncat(boot_device, buf, sz);
+        }
+
+        push_str(boot_device);
         fword("encode-string");
         push_str("boot-device");
         fword("property");
+
+        free(boot_device);
     }
-    free(boot_device);
 
     /* Set up other properties */
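The bootorder branch of the patch above expands each newline-separated device path into three boot candidates and joins them into a single boot-device string. A minimal standalone sketch of that expansion follows. It is not the patch's code: the helper name `bootorder_to_boot_device` is hypothetical, the three suffixes are an assumption about the runtime strings the patch's format string is meant to produce, and unlike the patch this sketch skips empty lines and sizes the output buffer from the input instead of using `bootorder_sz * 6`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only (not part of the patch).
 * Takes a newline-separated list of device paths and returns a freshly
 * allocated, space-separated list with three candidates per path.
 * The caller frees the result.
 */
static char *bootorder_to_boot_device(const char *bootorder)
{
    /* Assumed runtime suffixes: \\:tbxi, \ppc\bootinfo.txt and %BOOT */
    static const char *const suffix[3] = {
        ":,\\\\:tbxi", ":,\\ppc\\bootinfo.txt", ":,%BOOT"
    };
    size_t n = strlen(bootorder);
    size_t lines = 1, extra = 0, len = 0, cap, i;
    const char *p;
    char *out;

    /* Upper bound on the number of entries */
    for (p = bootorder; *p; p++) {
        if (*p == '\n') {
            lines++;
        }
    }

    /* Bytes added per entry beyond three copies of the path itself */
    for (i = 0; i < 3; i++) {
        extra += strlen(suffix[i]) + 1;   /* +1 for a separating space */
    }

    cap = 3 * n + lines * extra + 1;      /* +1 for the terminating NUL */
    out = malloc(cap);
    if (out == NULL) {
        return NULL;
    }
    out[0] = '\0';

    for (p = bootorder; *p; ) {
        size_t l = strcspn(p, "\n");      /* length of the current line */

        if (l > 0) {
            for (i = 0; i < 3; i++) {
                /* Bounded append: never writes past out + cap */
                len += (size_t)snprintf(out + len, cap - len, "%s%.*s%s",
                                        len ? " " : "", (int)l, p, suffix[i]);
            }
        }

        p += l;
        if (*p == '\n') {
            p++;
        }
    }

    return out;
}
```

Under these assumptions, an input of "/a\n/b\n" yields six candidates, three for /a followed by three for /b, so the firmware would try every form of the first device before moving on to the second.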
This provides an alternative mechanism for supporting boot device order information from QEMU compared with the legacy FW_CFG_BOOT_DEVICE functionality specified via -boot.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 arch/sparc64/boot.c     |  2 +-
 arch/sparc64/boot.h     |  2 +-
 arch/sparc64/openbios.c | 38 ++++++++++++++++++++++++++++++++------
 3 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/arch/sparc64/boot.c b/arch/sparc64/boot.c
index 7a287f2..54f6f7a 100644
--- a/arch/sparc64/boot.c
+++ b/arch/sparc64/boot.c
@@ -15,7 +15,7 @@ uint64_t kernel_image;
 uint64_t kernel_size;
 uint64_t qemu_cmdline;
 uint64_t cmdline_size;
-char boot_device;
+char *boot_device;
 
 extern int sparc64_of_client_interface( int *params );
 
diff --git a/arch/sparc64/boot.h b/arch/sparc64/boot.h
index e1b8717..88beb1c 100644
--- a/arch/sparc64/boot.h
+++ b/arch/sparc64/boot.h
@@ -14,7 +14,7 @@ extern uint64_t kernel_image;
 extern uint64_t kernel_size;
 extern uint64_t qemu_cmdline;
 extern uint64_t cmdline_size;
-extern char boot_device;
+extern char *boot_device;
 extern void boot(void);
 
 // sys_info.c
diff --git a/arch/sparc64/openbios.c b/arch/sparc64/openbios.c
index e9e08fd..b2e79d0 100644
--- a/arch/sparc64/openbios.c
+++ b/arch/sparc64/openbios.c
@@ -26,6 +26,7 @@
 #include "arch/common/fw_cfg.h"
 #include "arch/sparc64/ofmem_sparc64.h"
 #include "spitfire.h"
+#include "libc/vsprintf.h"
 
 #define UUID_FMT "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-%02x%02x%02x%02x%02x%02x"
 
@@ -538,6 +539,8 @@ void arch_nvram_get(char *data)
     uint32_t clock_frequency;
     uint16_t machine_id;
     const char *stdin_path, *stdout_path;
+    char *bootorder_file, *boot_path;
+    uint32_t bootorder_sz, sz;
 
     fw_cfg_init();
 
@@ -570,7 +573,6 @@ void arch_nvram_get(char *data)
     }
     qemu_cmdline = (uint64_t)obio_cmdline;
     cmdline_size = size;
-    boot_device = fw_cfg_read_i16(FW_CFG_BOOT_DEVICE);
 
     if (kernel_size)
         printk("kernel addr %llx size %llx\n", kernel_image, kernel_size);
@@ -630,7 +632,11 @@ void arch_nvram_get(char *data)
     push_str("/options");
     fword("find-device");
 
-    switch (boot_device) {
+    /* Boot order */
+    bootorder_file = fw_cfg_read_file("bootorder", &bootorder_sz);
+
+    if (bootorder_file == NULL) {
+        switch (fw_cfg_read_i16(FW_CFG_BOOT_DEVICE)) {
     case 'a':
         push_str("/obio/SUNW,fdtwo");
         break;
@@ -644,11 +650,31 @@ void arch_nvram_get(char *data)
     case 'n':
         push_str("net");
         break;
-    }
+        }
 
-    fword("encode-string");
-    push_str("boot-device");
-    fword("property");
+        fword("encode-string");
+        push_str("boot-device");
+        fword("property");
+    } else {
+        sz = bootorder_sz * (3 * 2);
+        boot_device = malloc(sz);
+        memset(boot_device, 0, sz);
+
+        while ((boot_path = strsep(&bootorder_file, "\n")) != NULL) {
+            snprintf(buf, sizeof(buf),
+                     "%s:f "
+                     "%s:a "
+                     "%s ",
+                     boot_path, boot_path, boot_path);
+
+            strncat(boot_device, buf, sz);
+        }
+
+        push_str(boot_device);
+        fword("encode-string");
+        push_str("boot-device");
+        fword("property");
+    }
 
     push_str(obio_cmdline);
     fword("encode-string");
Note that as part of this commit we rename the virtio-blk device node from "virtio-blk" to "scsi" to match up with the QEMU fw path generator.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 drivers/build.xml         |   1 +
 drivers/pci.c             |  18 ++
 drivers/pci_database.c    |   4 +-
 drivers/pci_database.h    |   1 +
 drivers/virtio.c          | 424 ++++++++++++++++++++++++++++++++++++++++++++++
 drivers/virtio.h          | 331 ++++++++++++++++++++++++++++++++++++
 include/drivers/drivers.h |   4 +
 7 files changed, 781 insertions(+), 2 deletions(-)
 create mode 100644 drivers/virtio.c
 create mode 100644 drivers/virtio.h

diff --git a/drivers/build.xml b/drivers/build.xml
index 8df074b..5a28bc2 100644
--- a/drivers/build.xml
+++ b/drivers/build.xml
@@ -28,6 +28,7 @@
   <object source="usbohci.c" condition="DRIVER_USB"/>
   <object source="usbohci_rh.c" condition="DRIVER_USB"/>
   <object source="lsi.c" condition="DRIVER_LSI_53C810"/>
+  <object source="virtio.c" condition="DRIVER_VIRTIO_BLK"/>
  </library>
 
  <dictionary name="openbios" target="forth">
diff --git a/drivers/pci.c b/drivers/pci.c
index 7a174b4..bb4dda9 100644
--- a/drivers/pci.c
+++ b/drivers/pci.c
@@ -770,6 +770,24 @@ int sungem_config_cb (const pci_config_t *config)
     return 0;
 }
 
+int virtio_blk_config_cb(const pci_config_t *config)
+{
+#ifdef CONFIG_DRIVER_VIRTIO_BLK
+    pci_addr addr;
+    uint8_t idx;
+
+    addr = PCI_ADDR(
+        PCI_BUS(config->dev),
+        PCI_DEV(config->dev),
+        PCI_FN(config->dev));
+
+    idx = (uint8_t)(pci_config_read16(addr, PCI_DEVICE_ID) & 0xff) - 1;
+
+    ob_virtio_init(config->path, "virtio-blk", arch->io_base, config->assigned[0] & ~0x0000000F, idx);
+#endif
+    return 0;
+}
+
 /*
  * "Designing PCI Cards and Drivers for Power Macintosh Computers", p. 454
  *
diff --git a/drivers/pci_database.c b/drivers/pci_database.c
index 8288acd..40fdeb6 100644
--- a/drivers/pci_database.c
+++ b/drivers/pci_database.c
@@ -49,10 +49,10 @@ static const pci_dev_t scsi_devices[] = {
     {
         /* Virtio-block controller */
         PCI_VENDOR_ID_REDHAT_QUMRANET, PCI_DEVICE_ID_VIRTIO_BLOCK,
-        NULL, "virtio-blk", NULL,
+        NULL, "scsi", NULL,
         "pci1af4,1001\0pci1af4,1001\0pciclass,01018f\0",
         0, 0, 0,
-        NULL, NULL,
+        virtio_blk_config_cb, NULL,
     },
     {
         /* lsi53c810 controller */
diff --git a/drivers/pci_database.h b/drivers/pci_database.h
index 6f5eb39..e39ebfb 100644
--- a/drivers/pci_database.h
+++ b/drivers/pci_database.h
@@ -29,6 +29,7 @@ struct pci_dev_t {
 };
 
 extern int ide_config_cb2(const pci_config_t *config);
+extern int virtio_blk_config_cb(const pci_config_t *config);
 extern int eth_config_cb(const pci_config_t *config);
 extern int macio_heathrow_config_cb(const pci_config_t *config);
 extern int macio_keylargo_config_cb(const pci_config_t *config);
diff --git a/drivers/virtio.c b/drivers/virtio.c
new file mode 100644
index 0000000..fb14051
--- /dev/null
+++ b/drivers/virtio.c
@@ -0,0 +1,424 @@
+/*
+ *   OpenBIOS Legacy Virtio driver
+ *
+ *   Copyright (c) 2013 Alexander Graf <agraf@suse.de>
+ *   Copyright (c) 2016 Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at
+ * your option) any later version. See the COPYING file in the top-level
+ * directory.
+ */
+
+#include "config.h"
+#include "libc/byteorder.h"
+#include "libc/vsprintf.h"
+#include "libopenbios/bindings.h"
+#include "libopenbios/ofmem.h"
+#include "kernel/kernel.h"
+#include "drivers/drivers.h"
+
+#include "virtio.h"
+
+#define VRING_WAIT_REPLY_TIMEOUT 10000
+
+
+static uint8_t virtio_cfg_read8(VDev *vdev, int addr)
+{
+    return inb((uint32_t)(vdev->io_base + addr));
+}
+
+static void virtio_cfg_write8(VDev *vdev, int addr, uint8_t value)
+{
+    outb(value, vdev->io_base + addr);
+}
+
+static uint16_t virtio_cfg_read16(VDev *vdev, int addr)
+{
+    return inw(vdev->io_base + addr);
+}
+
+static void virtio_cfg_write16(VDev *vdev, int addr, uint16_t value)
+{
+    outw(value, vdev->io_base + addr);
+}
+
+static uint32_t virtio_cfg_read32(VDev *vdev, int addr)
+{
+    return inl(vdev->io_base + addr);
+}
+
+static void virtio_cfg_write32(VDev *vdev, int addr, uint32_t value)
+{
+    outl(value, vdev->io_base + addr);
+}
+
+static uint64_t virtio_cfg_read64(VDev *vdev, int addr)
+{
+    uint64_t q;
+    uint8_t *p;
+    int i;
+
+    for (i = 0, p = (uint8_t *)&q; i < 8; i++, p++) {
+        *p = virtio_cfg_read8(vdev, addr + i);
+    }
+
+    return q;
+}
+
+static long virtio_notify(VDev *vdev, int vq_idx, long cookie)
+{
+    virtio_cfg_write16(vdev, VIRTIO_PCI_QUEUE_NOTIFY, vq_idx);
+
+    return 0;
+}
+
+/***********************************************
+ *             Virtio functions                *
+ ***********************************************/
+
+static void vring_init(VRing *vr, VqInfo *info)
+{
+    void *p = (void *) (uintptr_t)info->queue;
+
+    vr->id = info->index;
+    vr->num = info->num;
+    vr->desc = p;
+    vr->avail = (void *)((uintptr_t)p + info->num * sizeof(VRingDesc));
+    vr->used = (void *)(((unsigned long)&vr->avail->ring[info->num] +
+               info->align - 1) & ~(info->align - 1));
+
+    /* Zero out all relevant field */
+    vr->avail->flags = 0;
+    vr->avail->idx = 0;
+
+    /* We're running with interrupts off anyways, so don't bother */
+    vr->used->flags = VRING_USED_F_NO_NOTIFY;
+    vr->used->idx = 0;
+    vr->used_idx = 0;
+    vr->next_idx = 0;
+    vr->cookie = 0;
+}
+
+static int vring_notify(VDev *vdev, VRing *vr)
+{
+    return virtio_notify(vdev, vr->id, vr->cookie);
+}
+
+static void vring_send_buf(VRing *vr, void *p, int len, int flags)
+{
+    ucell mode;
+
+    /* For follow-up chains we need to keep the first entry point */
+    if (!(flags & VRING_HIDDEN_IS_CHAIN)) {
+        vr->avail->ring[vr->avail->idx % vr->num] = vr->next_idx;
+    }
+
+    vr->desc[vr->next_idx].addr = ofmem_translate((ucell)p, &mode);
+    vr->desc[vr->next_idx].len = len;
+    vr->desc[vr->next_idx].flags = flags & ~VRING_HIDDEN_IS_CHAIN;
+    vr->desc[vr->next_idx].next = vr->next_idx;
+    vr->desc[vr->next_idx].next++;
+    vr->next_idx++;
+
+    /* Chains only have a single ID */
+    if (!(flags & VRING_DESC_F_NEXT)) {
+        vr->avail->idx++;
+    }
+}
+
+static int vr_poll(VDev *vdev, VRing *vr)
+{
+    if (vr->used->idx == vr->used_idx) {
+        vring_notify(vdev, vr);
+        return 0;
+    }
+
+    vr->used_idx = vr->used->idx;
+    vr->next_idx = 0;
+    vr->desc[0].len = 0;
+    vr->desc[0].flags = 0;
+    return 1; /* vr has been updated */
+}
+
+/*
+ * Wait for the host to reply.
+ *
+ * timeout is in msecs if > 0.
+ *
+ * Returns 0 on success, 1 on timeout.
+ */
+static int vring_wait_reply(VDev *vdev)
+{
+    ucell target_ms, get_ms;
+
+    fword("get-msecs");
+    target_ms = POP();
+    target_ms += vdev->wait_reply_timeout;
+
+    /* Wait for any queue to be updated by the host */
+    do {
+        int i, r = 0;
+
+        for (i = 0; i < vdev->nr_vqs; i++) {
+            r += vr_poll(vdev, &vdev->vrings[i]);
+        }
+
+        if (r) {
+            return 0;
+        }
+
+        fword("get-msecs");
+        get_ms = POP();
+
+    } while (!vdev->wait_reply_timeout || (get_ms < target_ms));
+
+    return 1;
+}
+
+/***********************************************
+ *               Virtio block                  *
+ ***********************************************/
+
+static int virtio_blk_read_many(VDev *vdev,
+                                uint64_t offset, void *load_addr, int len)
+{
+    VirtioBlkOuthdr out_hdr;
+    u8 status;
+    VRing *vr = &vdev->vrings[vdev->cmd_vr_idx];
+    uint8_t discard[VIRTIO_SECTOR_SIZE];
+
+    uint64_t start_sector = offset / virtio_get_block_size(vdev);
+    int head_len = offset & (virtio_get_block_size(vdev) - 1);
+    uint64_t end_sector = (offset + len + virtio_get_block_size(vdev) - 1) /
+                          virtio_get_block_size(vdev);
+    int tail_len = end_sector * virtio_get_block_size(vdev) - (offset + len);
+
+    /* Tell the host we want to read */
+    out_hdr.type = VIRTIO_BLK_T_IN;
+    out_hdr.ioprio = 99;
+    out_hdr.sector = virtio_sector_adjust(vdev, start_sector);
+
+    vring_send_buf(vr, &out_hdr, sizeof(out_hdr), VRING_DESC_F_NEXT);
+
+    /* Discarded head */
+    if (head_len) {
+        vring_send_buf(vr, discard, head_len,
+                       VRING_DESC_F_WRITE | VRING_HIDDEN_IS_CHAIN |
+                       VRING_DESC_F_NEXT);
+    }
+
+    /* This is where we want to receive data */
+    vring_send_buf(vr, load_addr, len,
+                   VRING_DESC_F_WRITE | VRING_HIDDEN_IS_CHAIN |
+                   VRING_DESC_F_NEXT);
+
+    /* Discarded tail */
+    if (tail_len) {
+        vring_send_buf(vr, discard, tail_len,
+                       VRING_DESC_F_WRITE | VRING_HIDDEN_IS_CHAIN |
+                       VRING_DESC_F_NEXT);
+    }
+
+    /* status field */
+    vring_send_buf(vr, &status, sizeof(u8),
+                   VRING_DESC_F_WRITE | VRING_HIDDEN_IS_CHAIN);
+
+    /* Now we can tell the host to read */
+    vring_wait_reply(vdev);
+
+    return status;
+}
+
+int virtio_read_many(VDev *vdev, uint64_t offset, void *load_addr, int len)
+{
+    switch (vdev->senseid) {
+    case VIRTIO_ID_BLOCK:
+        return virtio_blk_read_many(vdev, offset, load_addr, len);
+    }
+    return -1;
+}
+
+static int virtio_read(VDev *vdev, uint64_t offset, void *load_addr, int len)
+{
+    return virtio_read_many(vdev, offset, load_addr, len);
+}
+
+int virtio_get_block_size(VDev *vdev)
+{
+    switch (vdev->senseid) {
+    case VIRTIO_ID_BLOCK:
+        return vdev->config.blk.blk_size << vdev->config.blk.physical_block_exp;
+    }
+    return 0;
+}
+
+static void
+ob_virtio_disk_open(VDev **_vdev)
+{
+    VDev *vdev = *_vdev;
+    phandle_t ph;
+
+    vdev->pos = 0;
+
+    /* interpose disk-label */
+    ph = find_dev("/packages/disk-label");
+    fword("my-args");
+    PUSH_ph( ph );
+    fword("interpose");
+
+    RET(-1);
+}
+
+static void
+ob_virtio_disk_close(VDev **_vdev)
+{
+    return;
+}
+
+/* ( pos.d -- status ) */
+static void
+ob_virtio_disk_seek(VDev **_vdev)
+{
+    VDev *vdev = *_vdev;
+    uint64_t pos;
+
+    pos = ((uint64_t)POP()) << 32;
+    pos |= POP();
+
+    /* Make sure we are within the physical limits */
+    if (pos < (vdev->config.blk.capacity * virtio_get_block_size(vdev))) {
+        vdev->pos = pos;
+        PUSH(0);
+    } else {
+        PUSH(1);
+    }
+
+    return;
+}
+
+/* ( addr len -- actual ) */
+static void
+ob_virtio_disk_read(VDev **_vdev)
+{
+    VDev *vdev = *_vdev;
+    ucell len = POP();
+    uint8_t *addr = (uint8_t *)POP();
+
+    virtio_read(vdev, vdev->pos, addr, len);
+
+    vdev->pos += len;
+
+    PUSH(len);
+}
+
+static void set_virtio_alias(const char *path, int idx)
+{
+    phandle_t aliases;
+    char name[9];
+
+    aliases = find_dev("/aliases");
+
+    snprintf(name, sizeof(name), "virtio%d", idx);
+
+    set_property(aliases, name, path, strlen(path) + 1);
+}
+
+static void
+ob_virtio_disk_initialize(VDev **_vdev)
+{
+    phandle_t ph = get_cur_dev();
+    int len, i;
+    uint8_t status;
+
+    VDev *vdev;
+    VRing *block = malloc(sizeof(VRing) * VIRTIO_MAX_VQS);
+    void *ring_area;
+
+    vdev = malloc(sizeof(VDev));
+    vdev->io_base = get_int_property(ph, "_address", &len);
+    push_str("_address");
+    feval("delete-property");
+
+    /* Indicate we recognise the device */
+    status = virtio_cfg_read8(vdev, VIRTIO_PCI_STATUS);
+    status |= VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER;
+    virtio_cfg_write8(vdev, VIRTIO_PCI_STATUS, status);
+
+    vdev->senseid = VIRTIO_ID_BLOCK;
+    vdev->nr_vqs = 1;
+    vdev->cmd_vr_idx = 0;
+    vdev->wait_reply_timeout = VRING_WAIT_REPLY_TIMEOUT;
+    vdev->scsi_block_size = VIRTIO_SCSI_BLOCK_SIZE;
+    vdev->blk_factor = 1;
+
+    vdev->vrings = block;
+    ofmem_posix_memalign(&ring_area, VIRTIO_RING_SIZE * VIRTIO_MAX_VQS, PAGE_SIZE);
+    vdev->ring_area = ring_area;
+
+    for (i = 0; i < vdev->nr_vqs; i++) {
+        VqInfo info = {
+            .queue = (uintptr_t) vdev->ring_area + (i * VIRTIO_RING_SIZE),
+            .align = VIRTIO_PCI_VRING_ALIGN,
+            .index = i,
+            .num = 0,
+        };
+
+        virtio_cfg_write16(vdev, VIRTIO_PCI_QUEUE_SEL, i);
+        info.num = virtio_cfg_read16(vdev, VIRTIO_PCI_QUEUE_NUM);
+
+        vring_init(&vdev->vrings[i], &info);
+
+        /* Set block information */
+        vdev->guessed_disk_nature = VIRTIO_GDN_NONE;
+        vdev->config.blk.blk_size = VIRTIO_SECTOR_SIZE;
+        vdev->config.blk.physical_block_exp = 0;
+
+        /* Read sectors */
+        vdev->config.blk.capacity = virtio_cfg_read64(vdev, 0x14);
+
+        /* Set queue address */
+        virtio_cfg_read32(vdev, VIRTIO_PCI_QUEUE_PFN);
+        virtio_cfg_write32(vdev, VIRTIO_PCI_QUEUE_PFN,
+                           va2pa(info.queue) >> VIRTIO_PCI_QUEUE_ADDR_SHIFT);
+    }
+
+    /* Initialisation complete */
+    status |= VIRTIO_CONFIG_S_DRIVER_OK;
+    virtio_cfg_write8(vdev, VIRTIO_PCI_STATUS, status);
+
+    *_vdev = vdev;
+}
+
+DECLARE_UNNAMED_NODE(ob_virtio_disk, 0, sizeof(VDev *));
+
+NODE_METHODS(ob_virtio_disk) = {
+    { NULL,     ob_virtio_disk_initialize },
+    { "open",   ob_virtio_disk_open },
+    { "close",  ob_virtio_disk_close },
+    { "seek",   ob_virtio_disk_seek },
+    { "read",   ob_virtio_disk_read },
+};
+
+void ob_virtio_init(const char *path, const char *dev_name, uint64_t base,
+                    uint64_t offset, int idx)
+{
+    char buf[256];
+
+    fword("new-device");
+    push_str("disk");
+    fword("device-name");
+    push_str("block");
+    fword("device-type");
+
+    PUSH(offset);
+    fword("encode-int");
+    push_str("_address");
+    fword("property");
+
+    fword("finish-device");
+
+    snprintf(buf, sizeof(buf), "%s/disk", path);
+    REGISTER_NODE_METHODS(ob_virtio_disk, buf);
+
+    set_virtio_alias(path, idx);
+}
diff --git a/drivers/virtio.h b/drivers/virtio.h
new file mode 100644
index 0000000..69ac4a8
--- /dev/null
+++ b/drivers/virtio.h
@@ -0,0 +1,331 @@
+/*
+ * Virtio driver bits
+ *
+ * Copyright (c) 2013 Alexander Graf <agraf@suse.de>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at
+ * your option) any later version. See the COPYING file in the top-level
+ * directory.
+ */
+
+#ifndef VIRTIO_H
+#define VIRTIO_H
+
+/* A 32-bit r/o bitmask of the features supported by the host */
+#define VIRTIO_PCI_HOST_FEATURES        0
+
+/* A 32-bit r/w bitmask of features activated by the guest */
+#define VIRTIO_PCI_GUEST_FEATURES       4
+
+/* A 32-bit r/w PFN for the currently selected queue */
+#define VIRTIO_PCI_QUEUE_PFN            8
+
+/* A 16-bit r/o queue size for the currently selected queue */
+#define VIRTIO_PCI_QUEUE_NUM            12
+
+/* A 16-bit r/w queue selector */
+#define VIRTIO_PCI_QUEUE_SEL            14
+
+/* A 16-bit r/w queue notifier */
+#define VIRTIO_PCI_QUEUE_NOTIFY         16
+
+/* An 8-bit device status register.  */
+#define VIRTIO_PCI_STATUS               18
+
+/* An 8-bit r/o interrupt status register.  Reading the value will return the
+ * current contents of the ISR and will also clear it.  This is effectively
+ * a read-and-acknowledge. */
+#define VIRTIO_PCI_ISR                  19
+
+/* MSI-X registers: only enabled if MSI-X is enabled. */
+/* A 16-bit vector for configuration changes. */
+#define VIRTIO_MSI_CONFIG_VECTOR        20
+/* A 16-bit vector for selected queue notifications. */
+#define VIRTIO_MSI_QUEUE_VECTOR         22
+
+/* How many bits to shift physical queue address written to QUEUE_PFN.
+ * 12 is historical, and due to x86 page size. */
+#define VIRTIO_PCI_QUEUE_ADDR_SHIFT     12
+
+/* The alignment to use between consumer and producer parts of vring.
+ * x86 pagesize again. */
+#define VIRTIO_PCI_VRING_ALIGN          4096
+
+/* Status byte for guest to report progress, and synchronize features. */
+/* We have seen device and processed generic fields (VIRTIO_CONFIG_F_VIRTIO) */
+#define VIRTIO_CONFIG_S_ACKNOWLEDGE     1
+/* We have found a driver for the device. */
+#define VIRTIO_CONFIG_S_DRIVER          2
+/* Driver has used its parts of the config, and is happy */
+#define VIRTIO_CONFIG_S_DRIVER_OK       4
+/* We've given up on this device. */
+#define VIRTIO_CONFIG_S_FAILED          0x80
+
+enum VirtioDevType {
+    VIRTIO_ID_NET = 1,
+    VIRTIO_ID_BLOCK = 2,
+    VIRTIO_ID_CONSOLE = 3,
+    VIRTIO_ID_BALLOON = 5,
+    VIRTIO_ID_SCSI = 8,
+};
+typedef enum VirtioDevType VirtioDevType;
+
+struct VirtioDevHeader {
+    VirtioDevType type:8;
+    uint8_t num_vq;
+    uint8_t feature_len;
+    uint8_t config_len;
+    uint8_t status;
+    uint8_t vqconfig[];
+} __attribute__((packed));
+typedef struct VirtioDevHeader VirtioDevHeader;
+
+struct VirtioVqConfig {
+    uint64_t token;
+    uint64_t address;
+    uint16_t num;
+    uint8_t pad[6];
+} __attribute__((packed));
+typedef struct VirtioVqConfig VirtioVqConfig;
+
+struct VqInfo {
+    uint32_t queue;
+    uint32_t align;
+    uint16_t index;
+    uint16_t num;
+} __attribute__((packed));
+typedef struct VqInfo VqInfo;
+
+struct VqConfig {
+    uint16_t index;
+    uint16_t num;
+} __attribute__((packed));
+typedef struct VqConfig VqConfig;
+
+struct VirtioDev {
+    VirtioDevHeader *header;
+    VirtioVqConfig *vqconfig;
+    char *host_features;
+    char *guest_features;
+    char *config;
+};
+typedef struct VirtioDev VirtioDev;
+
+#define VIRTIO_RING_SIZE            (PAGE_SIZE * 8)
+#define VIRTIO_MAX_VQS              3
+#define KVM_S390_VIRTIO_RING_ALIGN  4096
+
+#define VRING_USED_F_NO_NOTIFY  1
+
+/* This marks a buffer as continuing via the next field. */
+#define VRING_DESC_F_NEXT       1
+/* This marks a buffer as write-only (otherwise read-only). */
+#define VRING_DESC_F_WRITE      2
+/* This means the buffer contains a list of buffer descriptors. */
+#define VRING_DESC_F_INDIRECT   4
+
+/* Internal flag to mark follow-up segments as such */
+#define VRING_HIDDEN_IS_CHAIN   256
+
+/* Virtio ring descriptors: 16 bytes.  These can chain together via "next". */
+struct VRingDesc {
+    /* Address (guest-physical). */
+    uint64_t addr;
+    /* Length. */
+    uint32_t len;
+    /* The flags as indicated above. */
+    uint16_t flags;
+    /* We chain unused descriptors via this, too */
+    uint16_t next;
+} __attribute__((packed));
+typedef struct VRingDesc VRingDesc;
+
+struct VRingAvail {
+    uint16_t flags;
+    uint16_t idx;
+    uint16_t ring[];
+} __attribute__((packed));
+typedef struct VRingAvail VRingAvail;
+
+/* uint32_t is used here for ids for padding reasons. */
+struct VRingUsedElem {
+    /* Index of start of used descriptor chain. */
+    uint32_t id;
+    /* Total length of the descriptor chain which was used (written to) */
+    uint32_t len;
+} __attribute__((packed));
+typedef struct VRingUsedElem VRingUsedElem;
+
+struct VRingUsed {
+    uint16_t flags;
+    uint16_t idx;
+    VRingUsedElem ring[];
+} __attribute__((packed));
+typedef struct VRingUsed VRingUsed;
+
+struct VRing {
+    unsigned int num;
+    int next_idx;
+    int used_idx;
+    VRingDesc *desc;
+    VRingAvail *avail;
+    VRingUsed *used;
+    long cookie;
+    int id;
+};
+typedef struct VRing VRing;
+
+
+/***********************************************
+ *               Virtio block                  *
+ ***********************************************/
+
+/*
+ * Command types
+ *
+ * Usage is a bit tricky as some bits are used as flags and some are not.
+ *
+ * Rules:
+ *   VIRTIO_BLK_T_OUT may be combined with VIRTIO_BLK_T_SCSI_CMD or
+ *   VIRTIO_BLK_T_BARRIER.  VIRTIO_BLK_T_FLUSH is a command of its own
+ *   and may not be combined with any of the other flags.
+ */
+
+/* These two define direction. */
+#define VIRTIO_BLK_T_IN         0
+#define VIRTIO_BLK_T_OUT        1
+
+/* This bit says it's a scsi command, not an actual read or write. */
+#define VIRTIO_BLK_T_SCSI_CMD   2
+
+/* Cache flush command */
+#define VIRTIO_BLK_T_FLUSH      4
+
+/* Barrier before this op. */
+#define VIRTIO_BLK_T_BARRIER    0x80000000
+
+/* This is the first element of the read scatter-gather list. */
+struct VirtioBlkOuthdr {
+    /* VIRTIO_BLK_T* */
+    uint32_t type;
+    /* io priority. */
+    uint32_t ioprio;
+    /* Sector (ie. 512 byte offset) */
+    uint64_t sector;
+};
+typedef struct VirtioBlkOuthdr VirtioBlkOuthdr;
+
+struct VirtioBlkConfig {
+    uint64_t capacity; /* in 512-byte sectors */
+    uint32_t size_max; /* max segment size (if VIRTIO_BLK_F_SIZE_MAX) */
+    uint32_t seg_max;  /* max number of segments (if VIRTIO_BLK_F_SEG_MAX) */
+
+    struct VirtioBlkGeometry {
+        uint16_t cylinders;
+        uint8_t heads;
+        uint8_t sectors;
+    } geometry; /* (if VIRTIO_BLK_F_GEOMETRY) */
+
+    uint32_t blk_size; /* block size of device (if VIRTIO_BLK_F_BLK_SIZE) */
+
+    /* the next 4 entries are guarded by VIRTIO_BLK_F_TOPOLOGY  */
+    uint8_t physical_block_exp; /* exponent for physical blk per logical blk */
+    uint8_t alignment_offset;   /* alignment offset in logical blocks */
+    uint16_t min_io_size;       /* min I/O size without performance penalty
+                                   in logical blocks */
+    uint32_t opt_io_size;       /* optimal sustained I/O size in logical blks */
+
+    uint8_t wce; /* writeback mode (if VIRTIO_BLK_F_CONFIG_WCE) */
+} __attribute__((packed));
+typedef struct VirtioBlkConfig VirtioBlkConfig;
+
+enum guessed_disk_nature_type {
+    VIRTIO_GDN_NONE     = 0,
+    VIRTIO_GDN_DASD     = 1,
+    VIRTIO_GDN_CDROM    = 2,
+    VIRTIO_GDN_SCSI     = 3,
+};
+typedef enum guessed_disk_nature_type VirtioGDN;
+
+#define VIRTIO_SECTOR_SIZE 512
+#define VIRTIO_ISO_BLOCK_SIZE 2048
+#define VIRTIO_SCSI_BLOCK_SIZE 512
+
+struct VirtioScsiConfig {
+    uint32_t num_queues;
+    uint32_t seg_max;
+    uint32_t max_sectors;
+    uint32_t cmd_per_lun;
+    uint32_t event_info_size;
+    uint32_t sense_size;
+    uint32_t cdb_size;
+    uint16_t max_channel;
+    uint16_t max_target;
+    uint32_t max_lun;
+} __attribute__((packed));
+typedef struct VirtioScsiConfig VirtioScsiConfig;
+
+struct ScsiDevice {
+    uint16_t channel;   /* Always 0 in QEMU     */
+    uint16_t target;    /* will be scanned over */
+    uint32_t lun;       /* will be reported     */
+};
+typedef struct ScsiDevice ScsiDevice;
+
+struct VDev {
+    uint32_t io_base;
+    uint64_t pos;
+
+    int nr_vqs;
+    VRing *vrings;
+    int cmd_vr_idx;
+    void *ring_area;
+    long wait_reply_timeout;
+    VirtioGDN guessed_disk_nature;
+    int senseid;
+    union {
+        VirtioBlkConfig blk;
+        VirtioScsiConfig scsi;
+    } config;
+    ScsiDevice *scsi_device;
+    int is_cdrom;
+    int scsi_block_size;
+    int blk_factor;
+    uint64_t scsi_last_block;
+    uint32_t scsi_dev_cyls;
+    uint8_t scsi_dev_heads;
+    int scsi_device_selected;
+    ScsiDevice selected_scsi_device;
+};
+typedef struct VDev VDev;
+
+extern int virtio_get_block_size(VDev *vdev);
+extern uint8_t virtio_get_heads(VDev *vdev);
+extern uint8_t virtio_get_sectors(VDev *vdev);
+extern uint64_t virtio_get_blocks(VDev *vdev);
+
+static inline uint64_t virtio_sector_adjust(VDev *vdev, uint64_t sector)
+{
+    return sector * (virtio_get_block_size(vdev) / VIRTIO_SECTOR_SIZE);
+}
+
+VirtioGDN virtio_guessed_disk_nature(VDev *vdev);
+void virtio_assume_scsi(VDev *vdev);
+void virtio_assume_eckd(VDev *vdev);
+void virtio_assume_iso9660(VDev *vdev);
+
+extern int virtio_disk_is_scsi(VDev *vdev);
+extern int virtio_disk_is_eckd(VDev *vdev);
+
+int virtio_read_many(VDev *vdev, uint64_t sector, void *load_addr, int sec_num);
+VDev *virtio_get_device(void);
+VirtioDevType virtio_get_device_type(void);
+
+struct VirtioCmd {
+    void *data;
+    int size;
+    int flags;
+};
+typedef struct VirtioCmd VirtioCmd;
+
+#endif /* VIRTIO_H */
diff --git a/include/drivers/drivers.h b/include/drivers/drivers.h
index 117429e..75172c2 100644
--- a/include/drivers/drivers.h
+++ b/include/drivers/drivers.h
@@ -139,6 +139,10 @@ int keyboard_dataready(void);
 unsigned char keyboard_readdata(void);
 #endif
 #endif
+#ifdef CONFIG_DRIVER_VIRTIO_BLK
+void ob_virtio_init(const char *path, const char *dev_name, uint64_t base,
+                    uint64_t offset, int idx);
+#endif
 int macio_get_nvram_size(void);
 void macio_nvram_put(char *buf);
 void macio_nvram_get(char *buf);
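The read path in virtio_blk_read_many() above turns an arbitrary byte offset and length into whole sectors, using a discard buffer for the unwanted bytes before and after the requested range. The arithmetic can be sketched in isolation as below. The `read_span` struct and `split_read` function are hypothetical names introduced for illustration, and the sketch assumes the block size is a power of two, as the mask in the patch does.

```c
#include <stdint.h>

/* Hypothetical result type, for illustration only */
struct read_span {
    uint64_t start_sector;  /* first sector touched by the read */
    uint64_t end_sector;    /* one past the last sector touched */
    uint32_t head_len;      /* bytes to discard before the payload */
    uint32_t tail_len;      /* bytes to discard after the payload */
};

/*
 * Split a byte-granular read into whole blocks, mirroring the
 * expressions used in virtio_blk_read_many(). block_size is assumed
 * to be a non-zero power of two.
 */
static struct read_span split_read(uint64_t offset, uint32_t len,
                                   uint32_t block_size)
{
    struct read_span s;

    s.start_sector = offset / block_size;
    s.head_len = (uint32_t)(offset & (block_size - 1));
    s.end_sector = (offset + len + block_size - 1) / block_size;
    s.tail_len = (uint32_t)(s.end_sector * block_size - (offset + len));

    return s;
}
```

For example, a 1000-byte read at offset 700 with 512-byte blocks covers sectors 1 to 3, with 188 bytes discarded at the head and 348 at the tail, so head, payload and tail together add up to exactly three sectors. One consequence of this layout is that both discarded lengths are always smaller than one block, which is why a single block-sized discard buffer is sufficient.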
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
 config/examples/ppc_config.xml     | 1 +
 config/examples/sparc64_config.xml | 1 +
 2 files changed, 2 insertions(+)

diff --git a/config/examples/ppc_config.xml b/config/examples/ppc_config.xml
index 43277a0..19dc043 100644
--- a/config/examples/ppc_config.xml
+++ b/config/examples/ppc_config.xml
@@ -85,3 +85,4 @@
  <option name="CONFIG_DEBUG_USB" type="boolean" value="false"/>
  <option name="CONFIG_USB_HID" type="boolean" value="true"/>
  <option name="CONFIG_DRIVER_LSI_53C810" type="boolean" value="true"/>
+ <option name="CONFIG_DRIVER_VIRTIO_BLK" type="boolean" value="true"/>
diff --git a/config/examples/sparc64_config.xml b/config/examples/sparc64_config.xml
index a4e1336..a4172df 100644
--- a/config/examples/sparc64_config.xml
+++ b/config/examples/sparc64_config.xml
@@ -71,4 +71,5 @@
  <option name="CONFIG_DRIVER_PC_KBD" type="boolean" value="true"/>
  <option name="CONFIG_DRIVER_PC_SERIAL" type="boolean" value="true"/>
  <option name="CONFIG_DRIVER_FW_CFG" type="boolean" value="true"/>
+ <option name="CONFIG_DRIVER_VIRTIO_BLK" type="boolean" value="true"/>
  <option name="CONFIG_FW_CFG_ADDR" type="integer" value="0x510"/>
On Aug 12, 2018, at 9:24 AM, Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
These patches add support for booting from virtio-blk (PCI) devices as used by QEMU.
That’s pretty cool, Mark. Is the virtio-blk driver only good for CD-ROM image files, or can we use it on HD images as well?
I gave it a go with a Mac OS 9 image file, and was able to boot over the Trampoline into the Mac OS Nano Kernel, but it couldn’t find the System Folder, so booting halted with the ? icon.
Not surprising; I think we would need an OpenBIOS-based ‘NDRV’ for the virtio-blk device.
Booting the Trampoline was much faster with this driver, at least 2-3x.
On 17/10/2018 16:28, Jd Lyons via OpenBIOS wrote:
> That’s pretty cool, Mark. Is the virtio-blk driver only good for CD-ROM image files, or can we use it on HD images as well?
> I gave it a go with a Mac OS 9 image file, and was able to boot over the Trampoline into the Mac OS Nano Kernel, but it couldn’t find the System Folder, so booting halted with the ? icon.
> Not surprising; I think we would need an OpenBIOS-based ‘NDRV’ for the virtio-blk device.
Yeah, that's just about it: since the native OS doesn't have a virtio-blk-pci driver, someone would need to write an NDRV, similar to the way in which Ben wrote the QEMU VGA driver. Don't suppose you know anyone with any MacOS 9 development experience who might be interested?
> Booting the Trampoline was much faster with this driver, at least 2-3x.
Glad that you found it useful :)
ATB,
Mark.
On Wed, 17 Oct 2018, Mark Cave-Ayland wrote:
> Yeah, that's just about it: since the native OS doesn't have a virtio-blk-pci driver, someone would need to write an NDRV, similar to the way in which Ben wrote the QEMU VGA driver. Don't suppose you know anyone with any MacOS 9 development experience who might be interested?
I think Mac-On-Linux (or some other Mac virtualisation software; I don't know these very well) has drivers that are not virtio but are similarly paravirtualised, and they could probably be adapted. This may be easier than writing a driver from scratch, as the framework is already there: only the way to talk to the hypervisor would need to be replaced (or the way it works in MOL could be implemented in QEMU, whichever is simpler). Not sure how much work this would be; just an idea if someone wants to take a look and try.
Regards, BALATON Zoltan
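For anyone sizing up the "talk to the hypervisor" part on the virtio side: a block request is a small header, followed by the data buffer and a one-byte status that the device writes back. The following is only a rough sketch of building the header for a read, based on the request layout in the virtio block device specification. It is not taken from the OpenBIOS driver in this series, and the helper name blk_prepare_read is made up for illustration. Virtqueue setup and descriptor chaining are left out entirely.

```c
#include <stdint.h>
#include <string.h>

/* Request types and status codes from the virtio block device specification. */
#define VIRTIO_BLK_T_IN     0   /* read from the device */
#define VIRTIO_BLK_T_OUT    1   /* write to the device */
#define VIRTIO_BLK_S_OK     0
#define VIRTIO_BLK_S_IOERR  1
#define VIRTIO_BLK_S_UNSUPP 2

/* The sector field of a request counts 512-byte units. */
#define VIRTIO_BLK_SECTOR_SIZE 512u

/*
 * Header placed at the start of every request. With a legacy (0.9)
 * device the fields are in the guest's native byte order, so no
 * byte swapping is done here.
 */
struct virtio_blk_outhdr {
    uint32_t type;    /* VIRTIO_BLK_T_* */
    uint32_t ioprio;  /* request priority, 0 if unused */
    uint64_t sector;  /* starting sector, in 512-byte units */
};

/*
 * Hypothetical helper (not an OpenBIOS or QEMU function): fill in the
 * header for a read starting at byte_offset. Returns 0 on success, or
 * -1 if the offset is not sector-aligned.
 */
static int blk_prepare_read(struct virtio_blk_outhdr *hdr, uint64_t byte_offset)
{
    if (byte_offset % VIRTIO_BLK_SECTOR_SIZE != 0)
        return -1;

    memset(hdr, 0, sizeof(*hdr));
    hdr->type = VIRTIO_BLK_T_IN;
    hdr->ioprio = 0;
    hdr->sector = byte_offset / VIRTIO_BLK_SECTOR_SIZE;
    return 0;
}
```

A real driver would then queue three buffers for the device: this header (device-readable), the data buffer, and a status byte (both device-writable for a read), and check the status against VIRTIO_BLK_S_OK once the request completes.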
On Oct 17, 2018, at 5:05 PM, Mark Cave-Ayland mark.cave-ayland@ilande.co.uk wrote:
> Yeah, that's just about it: since the native OS doesn't have a virtio-blk-pci driver, someone would need to write an NDRV, similar to the way in which Ben wrote the QEMU VGA driver. Don't suppose you know anyone with any MacOS 9 development experience who might be interested?
Unfortunately, Apple’s documentation on ‘NDRV’s is next to useless, though there is some code for a SCSI disk driver in the PCI DDK 3.0, so that’s something.
I’ve hacked together some video card ‘NDRV’s for Radeon chips used in Macs that didn’t natively support OS 9, but I had the code from OS X to work with.
I know some skilled coders for the classic Mac OS, so maybe one day we can get something like this going. I know there was some effort to get a virtio driver for OS X, so maybe someone has that code around.