On 06/12/19 11:42, Sam Eiderman wrote:Using fw_cfg, supply logical CHS values directly from QEMU to the BIOS.
Non-standard logical geometries break under QEMU.
A virtual disk which contains an operating system which depends on
logical geometries (consistent values being reported from BIOS INT13
AH=08) will most likely break under QEMU/SeaBIOS if it has non-standard
logical geometries - for example 56 SPT (sectors per track).
No matter what QEMU will report - SeaBIOS, for large enough disks - will
use LBA translation, which will report 63 SPT instead.
In addition we cannot force SeaBIOS to rely on physical geometries at
all. A virtio-blk-pci virtual disk with 255 phyiscal heads cannot
report more than 16 physical heads when moved to an IDE controller,
since the ATA spec allows a maximum of 16 heads - this is an artifact of
virtualization.
By supplying the logical geometries directly we are able to support such
"exotic" disks.
We serialize this information in a similar way to the "bootorder"
interface.
The fw_cfg entry is "bootdevices" and it serializes a struct.
At the moment the struct holds the values of logical CHS values but it
can be expanded easily due to the extendable ABI implemented.
(In the future, we can pass the bootindex through "bootdevices" instead
"bootorder" - unifying all bootdevice information in one fw_cfg value)
I would disagree with that. UEFI guest firmware doesn't seem to have anyuse for this new type of information ("logical CHS values"), so thecurrent interface (the "bootorder" fw_cfg file) should continue to work.The ArmVirtQemu and OVMF platform firmwares (built from the edk2project, and bundled with QEMU 4.1+) implement some serious parsing andprocessing for "bootorder”.
I agree, I didn’t mean to say that “bootdevices" will replace “bootorder”,
they will have to reside side by side.
I just meant to emphasis that bootorder is not extendible - adding more disk
specific fields other than bootorder (that for some platforms will be unused)
is not possible. “bootdevices” will work for LCHS, if another entry has to be
passed - it can be added to “bootdevice”.
Migrating “bootorder” into a different fw_cfg value is a tedious, probably not
worth it, effort.
Independently, another comment:The PV interface through fw_cfg could have also been implemented using
device specific keys, e.g.: "/etc/bootdevice/%s/logical_geometry" where
%s is the device name QEMU produces - but this implementation would
require much more code refactoring, both in QEMU and SeaBIOS, so the
current implementation was chosen.
Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com>
Signed-off-by: Sam Eiderman <shmuel.eiderman@oracle.com>
---
bootdevice.c | 42 ++++++++++++++++++++++++++++++++++++++++++
hw/nvram/fw_cfg.c | 14 +++++++++++---
include/sysemu/sysemu.h | 1 +
3 files changed, 54 insertions(+), 3 deletions(-)
diff --git a/bootdevice.c b/bootdevice.c
index 2b12fb85a4..84c2a83f25 100644
--- a/bootdevice.c
+++ b/bootdevice.c
@@ -405,3 +405,45 @@ void del_boot_device_lchs(DeviceState *dev, const char *suffix)
}
}
}
+
+typedef struct QEMU_PACKED BootDeviceEntrySerialized {
+ /* Do not change field order - add new fields below */
+ uint32_t lcyls;
+ uint32_t lheads;
+ uint32_t lsecs;
+} BootDeviceEntrySerialized;
+
+/* Serialized as: struct size (4) + (device name\0 + device struct) x devices */
+char *get_boot_devices_info(size_t *size)
+{
+ FWLCHSEntry *i;
+ BootDeviceEntrySerialized s;
+ size_t total = 0;
+ char *list = NULL;
+
+ list = g_malloc0(sizeof(uint32_t));
+ *((uint32_t *)list) = (uint32_t)sizeof(s);
+ total = sizeof(uint32_t);
+
+ QTAILQ_FOREACH(i, &fw_lchs, link) {
+ char *bootpath;
+ size_t len;
+
+ bootpath = get_boot_device_path(i->dev, false, i->suffix);
+ s.lcyls = i->lcyls;
+ s.lheads = i->lheads;
+ s.lsecs = i->lsecs;
You should document the endianness of the fields inBootDeviceEntrySerialized, and then call byte order conversion functionshere accordingly (most probably cpu_to_le32()).As written, this code would break if you ran qemu-system-x86_64 /qemu-system-i386 (with TCG acceleration) on a big endian host.
Nice catch, thanks!
ThanksLaszlo+
+ len = strlen(bootpath) + 1;
+ list = g_realloc(list, total + len + sizeof(s));
+ memcpy(&list[total], bootpath, len);
+ memcpy(&list[total + len], &s, sizeof(s));
+ total += len + sizeof(s);
+
+ g_free(bootpath);
+ }
+
+ *size = total;
+
+ return list;
+}
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 9f7b7789bc..008b21542f 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -916,13 +916,21 @@ void *fw_cfg_modify_file(FWCfgState *s, const char *filename,
static void fw_cfg_machine_reset(void *opaque)
{
+ MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
+ FWCfgState *s = opaque;
void *ptr;
size_t len;
- FWCfgState *s = opaque;
- char *bootindex = get_boot_devices_list(&len);
+ char *buf;
- ptr = fw_cfg_modify_file(s, "bootorder", (uint8_t *)bootindex, len);
+ buf = get_boot_devices_list(&len);
+ ptr = fw_cfg_modify_file(s, "bootorder", (uint8_t *)buf, len);
g_free(ptr);
+
+ if (!mc->legacy_fw_cfg_order) {
+ buf = get_boot_devices_info(&len);
+ ptr = fw_cfg_modify_file(s, "bootdevices", (uint8_t *)buf, len);
+ g_free(ptr);
+ }
}
static void fw_cfg_machine_ready(struct Notifier *n, void *data)
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 173dfbb539..f0552006f4 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -174,6 +174,7 @@ void validate_bootdevices(const char *devices, Error **errp);
void add_boot_device_lchs(DeviceState *dev, const char *suffix,
uint32_t lcyls, uint32_t lheads, uint32_t lsecs);
void del_boot_device_lchs(DeviceState *dev, const char *suffix);
+char *get_boot_devices_info(size_t *size);
/* handler to set the boot_device order for a specific type of MachineClass */
typedef void QEMUBootSetHandler(void *opaque, const char *boot_order,