On 12 Jun 2019, at 15:27, Laszlo Ersek lersek@redhat.com wrote:
On 06/12/19 11:42, Sam Eiderman wrote:
Using fw_cfg, supply logical CHS values directly from QEMU to the BIOS.
Non-standard logical geometries break under QEMU.
A virtual disk which contains an operating system which depends on logical geometries (consistent values being reported from BIOS INT13 AH=08) will most likely break under QEMU/SeaBIOS if it has non-standard logical geometries - for example 56 SPT (sectors per track). No matter what QEMU will report - SeaBIOS, for large enough disks - will use LBA translation, which will report 63 SPT instead.
In addition we cannot force SeaBIOS to rely on physical geometries at all. A virtio-blk-pci virtual disk with 255 phyiscal heads cannot report more than 16 physical heads when moved to an IDE controller, since the ATA spec allows a maximum of 16 heads - this is an artifact of virtualization.
By supplying the logical geometries directly we are able to support such "exotic" disks.
We serialize this information in a similar way to the "bootorder" interface. The fw_cfg entry is "bootdevices" and it serializes a struct. At the moment the struct holds the values of logical CHS values but it can be expanded easily due to the extendable ABI implemented.
(In the future, we can pass the bootindex through "bootdevices" instead "bootorder" - unifying all bootdevice information in one fw_cfg value)
I would disagree with that. UEFI guest firmware doesn't seem to have any use for this new type of information ("logical CHS values"), so the current interface (the "bootorder" fw_cfg file) should continue to work. The ArmVirtQemu and OVMF platform firmwares (built from the edk2 project, and bundled with QEMU 4.1+) implement some serious parsing and processing for "bootorder”.
I agree, I didn’t mean to say that “bootdevices" will replace “bootorder”, they will have to reside side by side. I just meant to emphasis that bootorder is not extendible - adding more disk specific fields other than bootorder (that for some platforms will be unused) is not possible. “bootdevices” will work for LCHS, if another entry has to be passed - it can be added to “bootdevice”. Migrating “bootorder” into a different fw_cfg value is a tedious, probably not worth it, effort.
Independently, another comment:
The PV interface through fw_cfg could have also been implemented using device specific keys, e.g.: "/etc/bootdevice/%s/logical_geometry" where %s is the device name QEMU produces - but this implementation would require much more code refactoring, both in QEMU and SeaBIOS, so the current implementation was chosen.
Reviewed-by: Karl Heubaum karl.heubaum@oracle.com Reviewed-by: Arbel Moshe arbel.moshe@oracle.com Signed-off-by: Sam Eiderman shmuel.eiderman@oracle.com
bootdevice.c | 42 ++++++++++++++++++++++++++++++++++++++++++ hw/nvram/fw_cfg.c | 14 +++++++++++--- include/sysemu/sysemu.h | 1 + 3 files changed, 54 insertions(+), 3 deletions(-)
diff --git a/bootdevice.c b/bootdevice.c index 2b12fb85a4..84c2a83f25 100644 --- a/bootdevice.c +++ b/bootdevice.c @@ -405,3 +405,45 @@ void del_boot_device_lchs(DeviceState *dev, const char *suffix) } } }
+typedef struct QEMU_PACKED BootDeviceEntrySerialized {
- /* Do not change field order - add new fields below */
- uint32_t lcyls;
- uint32_t lheads;
- uint32_t lsecs;
+} BootDeviceEntrySerialized;
+/* Serialized as: struct size (4) + (device name\0 + device struct) x devices */ +char *get_boot_devices_info(size_t *size) +{
- FWLCHSEntry *i;
- BootDeviceEntrySerialized s;
- size_t total = 0;
- char *list = NULL;
- list = g_malloc0(sizeof(uint32_t));
- *((uint32_t *)list) = (uint32_t)sizeof(s);
- total = sizeof(uint32_t);
- QTAILQ_FOREACH(i, &fw_lchs, link) {
char *bootpath;
size_t len;
bootpath = get_boot_device_path(i->dev, false, i->suffix);
s.lcyls = i->lcyls;
s.lheads = i->lheads;
s.lsecs = i->lsecs;
You should document the endianness of the fields in BootDeviceEntrySerialized, and then call byte order conversion functions here accordingly (most probably cpu_to_le32()).
As written, this code would break if you ran qemu-system-x86_64 / qemu-system-i386 (with TCG acceleration) on a big endian host.
Nice catch, thanks!
Thanks Laszlo
len = strlen(bootpath) + 1;
list = g_realloc(list, total + len + sizeof(s));
memcpy(&list[total], bootpath, len);
memcpy(&list[total + len], &s, sizeof(s));
total += len + sizeof(s);
g_free(bootpath);
- }
- *size = total;
- return list;
+} diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c index 9f7b7789bc..008b21542f 100644 --- a/hw/nvram/fw_cfg.c +++ b/hw/nvram/fw_cfg.c @@ -916,13 +916,21 @@ void *fw_cfg_modify_file(FWCfgState *s, const char *filename,
static void fw_cfg_machine_reset(void *opaque) {
- MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
- FWCfgState *s = opaque; void *ptr; size_t len;
- FWCfgState *s = opaque;
- char *bootindex = get_boot_devices_list(&len);
- char *buf;
- ptr = fw_cfg_modify_file(s, "bootorder", (uint8_t *)bootindex, len);
- buf = get_boot_devices_list(&len);
- ptr = fw_cfg_modify_file(s, "bootorder", (uint8_t *)buf, len); g_free(ptr);
- if (!mc->legacy_fw_cfg_order) {
buf = get_boot_devices_info(&len);
ptr = fw_cfg_modify_file(s, "bootdevices", (uint8_t *)buf, len);
g_free(ptr);
- }
}
static void fw_cfg_machine_ready(struct Notifier *n, void *data) diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h index 173dfbb539..f0552006f4 100644 --- a/include/sysemu/sysemu.h +++ b/include/sysemu/sysemu.h @@ -174,6 +174,7 @@ void validate_bootdevices(const char *devices, Error **errp); void add_boot_device_lchs(DeviceState *dev, const char *suffix, uint32_t lcyls, uint32_t lheads, uint32_t lsecs); void del_boot_device_lchs(DeviceState *dev, const char *suffix); +char *get_boot_devices_info(size_t *size);
/* handler to set the boot_device order for a specific type of MachineClass */ typedef void QEMUBootSetHandler(void *opaque, const char *boot_order,