Currently, if a VM is started with multiple disks, it is almost impossible to guess which one of them will be used as the boot device, especially if there is a mix of ATA/virtio/SCSI devices. Essentially the BIOS decides the order, and without looking into the code you can't tell what the order will be (and in qemu-kvm, if boot=on is used, it brings even more havoc). We should allow fine-grained control of boot order from the qemu command line, or at a minimum control which device will be used for booting.
To do that, along with inventing syntax to specify boot order on the qemu command line, we need to communicate the boot order to seabios via the fw_cfg interface. For that we need a way to unambiguously specify a disk from qemu to seabios. A PCI bus address is not enough, since not all devices are PCI (do we care about them?) and since one PCI device may control more than one disk (ATA slave/master, SCSI LUNs). We can do what the EDD specification does. Describe a disk as: bus type (isa/pci), address on the bus (16-bit base address for isa, b/s/f for pci), device type (ATA/SCSI/VIRTIO), device path (slave/master for ATA, LUN for SCSI, nothing for virtio).
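A minimal sketch of the EDD-style descriptor described above, in Python for illustration only. The class name `DiskDescriptor`, the field names, the example PCI addresses, and the string encoding are all assumptions made for this sketch; none of them is an existing qemu or seabios interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DiskDescriptor:
    """Hypothetical EDD-style description of one bootable disk."""
    bus_type: str               # "isa" or "pci"
    bus_address: str            # I/O base address for isa, bus:slot.func for pci
    device_type: str            # "ata", "scsi" or "virtio"
    device_path: Optional[int]  # 0/1 for ATA master/slave, LUN for SCSI, None for virtio

    def encode(self) -> str:
        """Flatten the descriptor into one string (format invented for this sketch)."""
        parts = [self.bus_type, self.bus_address, self.device_type]
        if self.device_path is not None:
            parts.append(str(self.device_path))
        return "/".join(parts)


# Two disks behind the same PCI IDE controller stay distinguishable,
# which a bare PCI address could not guarantee.
master = DiskDescriptor("pci", "00:01.1", "ata", 0)
slave = DiskDescriptor("pci", "00:01.1", "ata", 1)
virtio = DiskDescriptor("pci", "00:04.0", "virtio", None)
```

The point of the sketch is only that the device path field is what separates the master and slave disks; the real wire format passed through fw_cfg is still an open question here.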
Will it cover all use cases? Any other ideas? Any ideas about qemu command line syntax? Maybe somebody wants to implement it? :)
-- Gleb.
On 11.10.2010 12:18, Gleb Natapov wrote:
Currently if VM is started with multiple disks it is almost impossible to guess which one of them will be used as boot device especially if there is a mix of ATA/virtio/SCSI devices. Essentially BIOS decides the order and without looking into the code you can't tell what the order will be (and in qemu-kvm if boot=on is used it brings even more havoc). We should allow fine-grained control of boot order from qemu command line, or as a minimum control what device will be used for booting.
To do that along with inventing syntax to specify boot order on qemu command line we need to communicate boot order to seabios via fw_cfg interface. For that we need to have a way to unambiguously specify a disk from qemu to seabios. PCI bus address is not enough since not all devices are PCI (do we care about them?)
Floppy? Yes, I think we do.
and since one PCI device may control more then one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci) device type (ATA/SCSI/VIRTIO) device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
If we had a qdev ID for all devices (which I think we should have anyway), would this work or is a string not really handy enough?
Kevin
On Mon, Oct 11, 2010 at 12:32:48PM +0200, Kevin Wolf wrote:
On 11.10.2010 12:18, Gleb Natapov wrote:
Currently if VM is started with multiple disks it is almost impossible to guess which one of them will be used as boot device especially if there is a mix of ATA/virtio/SCSI devices. Essentially BIOS decides the order and without looking into the code you can't tell what the order will be (and in qemu-kvm if boot=on is used it brings even more havoc). We should allow fine-grained control of boot order from qemu command line, or as a minimum control what device will be used for booting.
To do that along with inventing syntax to specify boot order on qemu command line we need to communicate boot order to seabios via fw_cfg interface. For that we need to have a way to unambiguously specify a disk from qemu to seabios. PCI bus address is not enough since not all devices are PCI (do we care about them?)
Floppy? Yes, I think we do.
and since one PCI device may control more then one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci) device type (ATA/SCSI/VIRTIO) device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
If we had a qdev ID for all devices (which I think we should have anyway), would this work or is a string not really handy enough?
Given a qdev ID, how does seabios know which device it corresponds to?
-- Gleb.
On 11.10.2010 12:43, Gleb Natapov wrote:
On Mon, Oct 11, 2010 at 12:32:48PM +0200, Kevin Wolf wrote:
On 11.10.2010 12:18, Gleb Natapov wrote:
Currently if VM is started with multiple disks it is almost impossible to guess which one of them will be used as boot device especially if there is a mix of ATA/virtio/SCSI devices. Essentially BIOS decides the order and without looking into the code you can't tell what the order will be (and in qemu-kvm if boot=on is used it brings even more havoc). We should allow fine-grained control of boot order from qemu command line, or as a minimum control what device will be used for booting.
To do that along with inventing syntax to specify boot order on qemu command line we need to communicate boot order to seabios via fw_cfg interface. For that we need to have a way to unambiguously specify a disk from qemu to seabios. PCI bus address is not enough since not all devices are PCI (do we care about them?)
Floppy? Yes, I think we do.
and since one PCI device may control more then one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci) device type (ATA/SCSI/VIRTIO) device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
If we had a qdev ID for all devices (which I think we should have anyway), would this work or is a string not really handy enough?
Given qdev ID how seabios knows what device it corresponds to?
Right, somehow I assumed that SeaBIOS already has some information about disks, but now I see that this is exactly the problem you're talking about. My suggestion wasn't really helpful then.
I think what you described is more or less the only way to do it then.
Kevin
Hi,
Floppy? Yes, I think we do.
And *one* floppy controller can actually have *two* drives connected, although booting from 'b' doesn't work IIRC.
and since one PCI device may control more then one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci) device type (ATA/SCSI/VIRTIO) device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
If we had a qdev ID for all devices (which I think we should have anyway), would this work or is a string not really handy enough?
I think we'll need support for that in all drivers supporting boot anyway, i.e. have virtio-blk-pci register a boot edd when configured that way. Question is how to configure this. We could attach the boot index to either the blockdev or the device, i.e.
-blockdev foo,bootindex=1
or
-device virtio-blk-pci,bootindex=1
The latter looks more useful to me, boot order is guest state imho, also it might expand to PXE booting nicely, i.e.
-device e1000,bootindex=2
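As a rough illustration of how per-device `bootindex` values could be collapsed into one ordered list, here is a Python sketch. The function name `boot_order`, the `ide-drive` entry, and the rule that devices without an index sort last are assumptions of the sketch, not something settled in this thread.

```python
def boot_order(devices):
    """Return device names sorted by bootindex; devices without one go last.

    `devices` maps a device name to its bootindex (int) or None.
    Placing unindexed devices last, in their original order, is an
    assumption made for this sketch.
    """
    indexed = sorted(
        (idx, name) for name, idx in devices.items() if idx is not None
    )
    unindexed = [name for name, idx in devices.items() if idx is None]
    return [name for _, name in indexed] + unindexed


# virtio disk first, then PXE via the NIC, then whatever has no index.
order = boot_order({"e1000": 2, "virtio-blk-pci": 1, "ide-drive": None})
```

Attaching the index to the device rather than the blockdev means disks and NICs end up in the same list, which is what makes the PXE case fall out naturally.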
Which brings up the question of how this plays with option roms. seabios should be able to order at the pci device level, at least when booting via a (pci) option rom. OK for nics. Booting from a scsi disk with id != 0 using the lsi rom is probably impossible though.
What about non-pci option roms? The one used for -kernel for example?
cheers, Gerd
On Mon, Oct 11, 2010 at 02:07:14PM +0200, Gerd Hoffmann wrote:
Hi,
Floppy? Yes, I think we do.
And *one* floppy controllers can actually have *two* drives connected, although booting from 'b' doesn't work IIRC.
and since one PCI device may control more then one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci) device type (ATA/SCSI/VIRTIO) device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
If we had a qdev ID for all devices (which I think we should have anyway), would this work or is a string not really handy enough?
I think we'll need support for that in all drivers supporting boot anyway, i.e. have virtio-blk-pci register a boot edd when configured that way. Question is how to configure this. We could attach the boot index to either the blockdev or the device, i.e.
-blockdev foo,bootindex=1
or
-device virtio-blk-pci,bootindex=1
The latter looks more useful to me, boot order is guest state imho, also it might expand to PXE booting nicely, i.e.
-device e1000,bootindex=2
Yes, boot order is guest state managed by the BIOS on real HW.
Which turns up the question how this plays with option roms. seabios should be able to order at pci device level at least when booting via (pci) option rom. OK for nics. Booting from a scsi disk with id != 0 using the lsi rom is probably impossible though.
What about non-pci option roms? The one used for -kernel for example?
-option-rom rom.bin,bootindex=3?
We can pass boot index along with option rom via fw_cfg interface.
-- Gleb.
On 10/11/2010 07:16 AM, Gleb Natapov wrote:
On Mon, Oct 11, 2010 at 02:07:14PM +0200, Gerd Hoffmann wrote:
Hi,
Floppy? Yes, I think we do.
And *one* floppy controllers can actually have *two* drives connected, although booting from 'b' doesn't work IIRC.
and since one PCI device may control more then one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci) device type (ATA/SCSI/VIRTIO) device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
If we had a qdev ID for all devices (which I think we should have anyway), would this work or is a string not really handy enough?
I think we'll need support for that in all drivers supporting boot anyway, i.e. have virtio-blk-pci register a boot edd when configured that way. Question is how to configure this. We could attach the boot index to either the blockdev or the device, i.e.
-blockdev foo,bootindex=1
or
-device virtio-blk-pci,bootindex=1
The latter looks more useful to me, boot order is guest state imho, also it might expand to PXE booting nicely, i.e.
-device e1000,bootindex=2
Yes, boot order is a guest sate managed by BIOS on real HW.
It's not that simple. On advanced platforms, the boot order can be stored outside of the BIOS. For instance, the boot order is actually stored in the IMM on certain IBM platforms and the BIOS queries the IMM for the boot order. This allows out-of-band management tools to alter the boot order.
This is more or less what we're looking for here. The BIOS should be able to query and modify the boot order but this is something that ideally belongs in QEMU.
Which turns up the question how this plays with option roms. seabios should be able to order at pci device level at least when booting via (pci) option rom. OK for nics. Booting from a scsi disk with id != 0 using the lsi rom is probably impossible though.
What about non-pci option roms? The one used for -kernel for example?
-option-rom rom.bin,bootindex=3?
We can pass boot index along with option rom via fw_cfg interface.
If the option rom is just hijacking int19, then there is no meaningful order you can give it.
Regards,
Anthony Liguori
-- Gleb.
SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios
On Mon, Oct 11, 2010 at 02:48:48PM -0500, Anthony Liguori wrote:
On 10/11/2010 07:16 AM, Gleb Natapov wrote:
On Mon, Oct 11, 2010 at 02:07:14PM +0200, Gerd Hoffmann wrote:
Hi,
Floppy? Yes, I think we do.
And *one* floppy controllers can actually have *two* drives connected, although booting from 'b' doesn't work IIRC.
and since one PCI device may control more then one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci) device type (ATA/SCSI/VIRTIO) device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
If we had a qdev ID for all devices (which I think we should have anyway), would this work or is a string not really handy enough?
I think we'll need support for that in all drivers supporting boot anyway, i.e. have virtio-blk-pci register a boot edd when configured that way. Question is how to configure this. We could attach the boot index to either the blockdev or the device, i.e.
-blockdev foo,bootindex=1
or
-device virtio-blk-pci,bootindex=1
The latter looks more useful to me, boot order is guest state imho, also it might expand to PXE booting nicely, i.e.
-device e1000,bootindex=2
Yes, boot order is a guest sate managed by BIOS on real HW.
It's not that simple. On advanced platforms, the boot order can be stored outside of the BIOS. For instance, the boot order is actually stored in the IMM on certain IBM platforms and the BIOS queries the IMM for the boot order. This allows out-of-band management tools to alter the boot order.
Interesting.
This is more or less what we're looking for here. The BIOS should be able to query and modify the boot order but this is something that ideally belongs in QEMU.
Which turns up the question how this plays with option roms. seabios should be able to order at pci device level at least when booting via (pci) option rom. OK for nics. Booting from a scsi disk with id != 0 using the lsi rom is probably impossible though.
What about non-pci option roms? The one used for -kernel for example?
-option-rom rom.bin,bootindex=3?
We can pass boot index along with option rom via fw_cfg interface.
If the option rom is just hijacking int19, then there is no meaningful order you can give it.
No boot rom should do that. extboot wreaks havoc when it is used. And since virtio is now supported by bios there is no reason to use it. Whoever needs scsi boot should add it to seabios too.
-- Gleb.
On 10/11/2010 02:59 PM, Gleb Natapov wrote:
No boot rom should do that. extboot wreaks havoc when it is used. And since virtio is now supported by bios there is no reason to use it.
You don't really have a choice. You could be doing hardware passthrough and the ROM on the card may hijack int19.
Whoever needs scsi boot should add it to seabios too.
I don't disagree.
I think the best thing to do is to let SeaBIOS create a boot order table that contains descriptive information and then advertise that to QEMU.
QEMU can then try to associate the list of bootable devices with its own set of devices and select a preferred order that it can then give back to SeaBIOS. SeaBIOS can then present that list to the user for additional refinement.
Regards,
Anthony Liguori
-- Gleb.
On Mon, Oct 11, 2010 at 03:30:21PM -0500, Anthony Liguori wrote:
On 10/11/2010 02:59 PM, Gleb Natapov wrote:
No boot rom should do that. extboot wreaks havoc when it is used. And since virtio is now supported by bios there is no reason to use it.
You don't really have a choice. You could be doing hardware passthrough and the ROM on the card may hijack int19.
Then this particular HW would be broken on real HW too and will not respect BIOS settings. But the code we provide should work properly.
Whoever needs scsi boot should add it to seabios too.
I don't disagree.
I think the best thing to do is to let SeaBIOS create a boot order table that contains descriptive information and then advertise that to QEMU.
What for? Why is this step needed?
QEMU can then try to associate the list of bootable devices with it's own set of devices and select a preferred order that it can then give back to SeaBIOS. SeaBIOS can then present that list to the user for additional refinement.
Why not skip your first step and let QEMU create the boot order list and pass it to Seabios? If the menu=on option is present, the user will be able to override the default from Seabios.
-- Gleb.
Gleb Natapov wrote:
Why not skip your first step and let QEMU create boot order list and pass it into Seabios.
Because it would mean duplicating a bunch of BBS code in QEMU.
//Peter
On Mon, Oct 11, 2010 at 10:46:07PM +0200, Peter Stuge wrote:
Gleb Natapov wrote:
Why not skip your first step and let QEMU create boot order list and pass it into Seabios.
Because it would mean duplicating a bunch of BBS code in QEMU.
Why? Elaborate please.
-- Gleb.
On 10/11/2010 03:36 PM, Gleb Natapov wrote:
On Mon, Oct 11, 2010 at 03:30:21PM -0500, Anthony Liguori wrote:
On 10/11/2010 02:59 PM, Gleb Natapov wrote:
No boot rom should do that. extboot wreaks havoc when it is used. And since virtio is now supported by bios there is no reason to use it.
You don't really have a choice. You could be doing hardware passthrough and the ROM on the card may hijack int19.
Then this particular HW would be broken on real HW too and will not respect BIOS settings. But the code we provide should work properly.
Whoever needs scsi boot should add it to seabios too.
I don't disagree.
I think the best thing to do is to let SeaBIOS create a boot order table that contains descriptive information and then advertise that to QEMU.
What for? Why this step is needed?
QEMU can then try to associate the list of bootable devices with it's own set of devices and select a preferred order that it can then give back to SeaBIOS. SeaBIOS can then present that list to the user for additional refinement.
Why not skip your first step and let QEMU create boot order list and pass it into Seabios. If menu=on option is present user will be able to override the default from Seabios.
Because SeaBIOS is definitive and QEMU is not.
We can ask SeaBIOS to boot from SCSI LUN 3 on PCI address X.Y.Z but that doesn't mean that it can figure out what that means. If it can't, how do we communicate that to the user? If SeaBIOS communicates its list to QEMU then we can at least display that list in the monitor in the same way that it's displayed to the guest. That means that we can reorder in the monitor and can potentially persist the boot device list in a more meaningful way.
Regards,
Anthony Liguori
-- Gleb.
On Mon, Oct 11, 2010 at 03:50:08PM -0500, Anthony Liguori wrote:
On 10/11/2010 03:36 PM, Gleb Natapov wrote:
On Mon, Oct 11, 2010 at 03:30:21PM -0500, Anthony Liguori wrote:
On 10/11/2010 02:59 PM, Gleb Natapov wrote:
No boot rom should do that. extboot wreaks havoc when it is used. And since virtio is now supported by bios there is no reason to use it.
You don't really have a choice. You could be doing hardware passthrough and the ROM on the card may hijack int19.
Then this particular HW would be broken on real HW too and will not respect BIOS settings. But the code we provide should work properly.
Whoever needs scsi boot should add it to seabios too.
I don't disagree.
I think the best thing to do is to let SeaBIOS create a boot order table that contains descriptive information and then advertise that to QEMU.
What for? Why this step is needed?
QEMU can then try to associate the list of bootable devices with it's own set of devices and select a preferred order that it can then give back to SeaBIOS. SeaBIOS can then present that list to the user for additional refinement.
Why not skip your first step and let QEMU create boot order list and pass it into Seabios. If menu=on option is present user will be able to override the default from Seabios.
Because SeaBIOS is definitive and QEMU is not.
We can ask SeaBIOS to boot from SCSI LUN 3 on PCI address X.Y.Z but that doesn't mean that it can figure out what that means. If it
It can figure out exactly what that means. This defines the boot disk in a non-ambiguous way.
can't, how do we communicate that to the user? If SeaBIOS
The boot will fail and the user will notice. Actually there is an idea to notify qemu about a failed boot instead of just halting in the bios. But in reality, since qemu and seabios are released together, this situation should never happen. If a device is created by qemu it should be enumerated by seabios. Otherwise other things may not work properly either.
communicates its list to QEMU then we can at least display that list in the monitor in the same way that it's displayed to the guest.
You can display it on the monitor. Qemu has all the necessary info. Qemu creates it and passes it into seabios, after all.
That means that we can reorder in the monitor and potentially can persistent the boot device list in a more meaningful way.
You can reorder them in the monitor between boots just fine without passing any info from Seabios to QEMU.
Seabios does not create any bootable devices, it just discovers whatever qemu created, so seabios has nothing interesting to tell to qemu.
-- Gleb.
Gleb Natapov wrote:
Seabios does not create any bootable devices, it just discovers whatever qemu created, so seabios has nothing interesting to tell to qemu.
SeaBIOS can add boot devices as part of option ROM init, as was mentioned. QEMU may or may not be involved in this, but in any case the complete BBS must then be implemented also in QEMU..
//Peter
On 10/11/2010 04:16 PM, Peter Stuge wrote:
Gleb Natapov wrote:
Seabios does not create any bootable devices, it just discovers whatever qemu created, so seabios has nothing interesting to tell to qemu.
SeaBIOS can add boot devices as part of option ROM init, as was mentioned. QEMU may or may not be involved in this, but in any case the complete BBS must then be implemented also in QEMU..
Somebody has to be responsible for enumerating all of the devices that can be booted, communicating it to someone else, and letting that other party reorder things while keeping the former party informed. The options are:
1) QEMU lets the user choose device boot order based on something that makes sense to QEMU and its users
2) QEMU creates a list of device boot order that it prefers and communicates it to SeaBIOS. If this is to be authoritative, QEMU must generate a list that follows the BBS
3) SeaBIOS then allows a user to reorder boot devices
4) SeaBIOS tells QEMU the new boot order
Or:
1) QEMU lets the user choose device boot order based on something that makes sense to QEMU and its users
2) SeaBIOS creates a list of bootable devices based on the BBS and anything else it thinks it can boot from
3) SeaBIOS passes this list to QEMU and asks QEMU to adjust ordering
4) QEMU adjusts ordering according to (1) and tells SeaBIOS
5) SeaBIOS then allows a user to reorder boot devices
6) SeaBIOS tells QEMU the new boot order
It may seem like the second option is more complicated, but I think step 2 in the first option is going to be prohibitively difficult and really doesn't fit SeaBIOS very well as bare metal BIOS. The second option is more akin to how this would work on bare metal.
Regards,
Anthony Liguori
//Peter
On Mon, Oct 11, 2010 at 05:32:17PM -0500, Anthony Liguori wrote:
Somebody has to be responsible for enumerating all of the devices that can be booted, communicated it to someone else, and letting that other party reorder things while keeping the former party informed. The options are:
1) QEMU lets the user choose device boot order based on something that makes sense to QEMU and its users
2) QEMU creates a list of device boot order that it prefers and communicates it to SeaBIOS. If this is to be authoritative, QEMU must generate a list that follows the BBS
3) SeaBIOS then allows a user to reorder boot devices
4) SeaBIOS tells QEMU the new boot order
Or:
1) QEMU lets the user choose device boot order based on something that makes sense to QEMU and its users
2) SeaBIOS creates a list of bootable devices based on the BBS and anything else it thinks it can boot from
3) SeaBIOS passes this list to QEMU and asks QEMU to adjust ordering
4) QEMU adjusts ordering according to (1) and tells SeaBIOS
5) SeaBIOS then allows a user to reorder boot devices
6) SeaBIOS tells QEMU the new boot order
How about:
1) QEMU lets the user choose device boot order based on something that makes sense to QEMU and its users. It places this info in fw_cfg by providing a list of "path names" for each device.
2) SeaBIOS creates a list of bootable devices based on the BBS and anything else it thinks it can boot from. It prioritizes devices in this list if their "path name" is found in the fw_cfg boot order list provided by qemu.
3) SeaBIOS then allows a user to reorder boot devices.
I'm not sure why step 4 (SeaBIOS tells QEMU the new boot order) is needed.
It may seem like the second option is more complicated, but I think step 2 in the first option is going to be prohibitively difficult and really doesn't fit SeaBIOS very well as bare metal BIOS. The second option is more akin to how this would work on bare metal.
I don't think step 2 will be that hard. There aren't that many ways SeaBIOS can boot a machine. I think we can label them all with unique names (eg, "ata@01:13.0@0", "usb@1234:5678", "virtio@01:13.0", "rom@01:13.0", etc). The qemu list doesn't need to be authoritative - should SeaBIOS know how to boot something that qemu doesn't yet know about, then users will need to use the boot menu until qemu is updated.
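The prioritization in step 2 can be sketched roughly as follows, in Python for illustration only. The function name `prioritize` and the `scsi@02:00.0@3` entry are hypothetical; the other path names are the examples given above. The tie-breaking rule (unlisted devices keep discovery order, after all listed ones) is an assumption of this sketch.

```python
def prioritize(discovered, preferred):
    """Order the firmware's discovered boot devices using a preferred path list.

    discovered: path names in the order the firmware found them.
    preferred:  path names supplied by the emulator, highest priority first.
    Paths in `preferred` that were never discovered are ignored; discovered
    paths that are not in `preferred` keep their original relative order
    after all preferred ones.
    """
    rank = {path: i for i, path in enumerate(preferred)}
    # sorted() is stable, so unlisted devices keep their discovery order.
    return sorted(discovered, key=lambda p: rank.get(p, len(preferred)))


discovered = ["ata@01:13.0@0", "usb@1234:5678", "virtio@01:13.0", "rom@01:13.0"]
# "scsi@02:00.0@3" is a made-up entry the firmware does not know how to boot.
preferred = ["virtio@01:13.0", "scsi@02:00.0@3", "ata@01:13.0@0"]
final = prioritize(discovered, preferred)
```

Because the firmware's own list is the starting point, a path it cannot match is simply skipped, and a device the emulator never mentioned is still bootable from the menu. That is the sense in which the qemu list does not need to be authoritative.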
-Kevin
On Mon, Oct 11, 2010 at 07:54:30PM -0400, Kevin O'Connor wrote:
On Mon, Oct 11, 2010 at 05:32:17PM -0500, Anthony Liguori wrote:
Somebody has to be responsible for enumerating all of the devices that can be booted, communicated it to someone else, and letting that other party reorder things while keeping the former party informed. The options are:
1) QEMU lets the user choose device boot order based on something that makes sense to QEMU and its users
2) QEMU creates a list of device boot order that it prefers and communicates it to SeaBIOS. If this is to be authoritative, QEMU must generate a list that follows the BBS
3) SeaBIOS then allows a user to reorder boot devices
4) SeaBIOS tells QEMU the new boot order
Or:
1) QEMU lets the user choose device boot order based on something that makes sense to QEMU and its users
2) SeaBIOS creates a list of bootable devices based on the BBS and anything else it thinks it can boot from
3) SeaBIOS passes this list to QEMU and asks QEMU to adjust ordering
4) QEMU adjusts ordering according to (1) and tells SeaBIOS
5) SeaBIOS then allows a user to reorder boot devices
6) SeaBIOS tells QEMU the new boot order
How about:
1) QEMU lets the user choose device boot order based on something that makes sense to QEMU and its users. It places this info in fw_cfg by providing a list of "path names" for each device.
2) SeaBIOS creates a list of bootable devices based on the BBS and anything else it thinks it can boot from. It prioritizes devices in this list if their "path name" is found in the fw_cfg boot order list provided by qemu.
3) SeaBIOS then allows a user to reorder boot devices.
I'm not sure why step 4 (SeaBIOS tells QEMU the new boot order) is needed.
Exactly. If the user wants to boot from a non-default boot device once, she will use the Seabios boot menu. If she wants to change the boot order permanently, she will use a nice, mouse-driven virt-manager GUI.
Honestly, how often do you use the Seabios boot menu? I only use it when I need to boot from a device that cannot be made bootable because of current qemu/seabios brokenness. Once that is fixed I can't see myself using it anymore.
It may seem like the second option is more complicated, but I think step 2 in the first option is going to be prohibitively difficult and really doesn't fit SeaBIOS very well as bare metal BIOS. The second option is more akin to how this would work on bare metal.
I don't think step 2 will be that hard. There aren't that many ways SeaBIOS can boot a machine. I think we can label them all with unique names (eg, "ata@01:13.0@0", "usb@1234:5678", "virtio@01:13.0", "rom@01:13.0", etc). The qemu list doesn't need to be authoritative - should SeaBIOS know how to boot something that qemu doesn't yet know about, then users will need to use the boot menu until qemu is updated.
Agreed. What Anthony proposes looks overcomplicated without any benefits.
-- Gleb.
Kevin O'Connor wrote:
1) QEMU lets the user choose device boot order based on something that makes sense to QEMU and its users
2) SeaBIOS creates a list of bootable devices based on the BBS and anything else it thinks it can boot from
3) SeaBIOS passes this list to QEMU and asks QEMU to adjust ordering
4) QEMU adjusts ordering according to (1) and tells SeaBIOS
5) SeaBIOS then allows a user to reorder boot devices
6) SeaBIOS tells QEMU the new boot order
I think that this proposal from Anthony is the thorough, proper way to do it.
How about:
- QEMU let's user choose device boot order based on something that makes sense to QEMU and it's users. It places this info in fw_cfg by providing a list of "path names" for each device.
- SeaBIOS creates a list of bootable devices based on the BBS and anything else it thinks it can boot from. It prioritizes devices in this list if their "path name" is found in the fw_cfg boot order list provided by qemu.
I think it may be more difficult for QEMU to predict how SeaBIOS will enumerate boot devices and what paths they will have than for QEMU to work with the boot devices that were found by SeaBIOS after the fact.
- SeaBIOS then allows a user to reorder boot devices.
I'm not sure why step 4 (SeaBIOS tells QEMU the new boot order) is needed.
To save the boot order in NV storage.
//Peter
On Tue, Oct 12, 2010 at 10:11:09AM +0200, Peter Stuge wrote:
How about:
- QEMU let's user choose device boot order based on something that makes sense to QEMU and it's users. It places this info in fw_cfg by providing a list of "path names" for each device.
- SeaBIOS creates a list of bootable devices based on the BBS and anything else it thinks it can boot from. It prioritizes devices in this list if their "path name" is found in the fw_cfg boot order list provided by qemu.
I think it may be more difficult for QEMU to predict how SeaBIOS will enumerate boot devices and what paths they will have than for QEMU to work with the boot devices that were found by SeaBIOS after the fact.
Why do you think so? The path to a device is a HW topology thing. It does not change no matter who enumerates it, QEMU or Seabios. The exact same problem exists between a BIOS and an OS when EDD describes storage devices.
- SeaBIOS then allows a user to reorder boot devices.
I'm not sure why step 4 (SeaBIOS tells QEMU the new boot order) is needed.
To save the boot order in NV storage.
We do not have NV storage and we'd rather not introduce it. In virtualization the user configures the VM through virt-manager or by using the qemu command line directly. The BIOS reads the config using the fw_cfg interface and obeys it.
-- Gleb.
On Mon, Oct 11, 2010 at 11:16:38PM +0200, Peter Stuge wrote:
Gleb Natapov wrote:
Seabios does not create any bootable devices, it just discovers whatever qemu created, so seabios has nothing interesting to tell to qemu.
SeaBIOS can add boot devices as part of option ROM init, as was mentioned. QEMU may or may not be involved in this, but in any case the complete BBS must then be implemented also in QEMU..
ROMs do not appear from nowhere. They are either loaded from a PCI card or added by QEMU. In both cases QEMU is well aware of the entity that can boot.
-- Gleb.
On 10/11/2010 01:30 PM, Anthony Liguori wrote:
On 10/11/2010 02:59 PM, Gleb Natapov wrote:
No boot rom should do that. extboot wreaks havoc when it is used. And since virtio is now supported by bios there is no reason to use it.
You don't really have a choice. You could be doing hardware passthrough and the ROM on the card may hijack int19.
The BBS standard actually documents how to deal with that -- it pretty much works out to "let the card initialize, then see if it mucked with int19, and then put int19 back... if we want to run that card, then we invoke the int19 that the card set up."
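The save, initialize, detect, restore sequence paraphrased above can be modelled as a toy simulation in Python. This is only a sketch of the idea as stated in this message: the vector table is a plain dict, and every name in it (`init_option_rom`, `hijacking_rom`, and so on) is invented for illustration. Whether the BBS actually prescribes this behaviour is disputed in the reply that follows.

```python
INT19 = 0x19


def init_option_rom(ivt, rom_init):
    """Run a ROM's init routine and undo any INT 19h hook it installs.

    ivt:      dict mapping interrupt number -> handler (a toy vector table).
    rom_init: callable that receives the table and may overwrite entries.
    Returns the handler the ROM installed for INT 19h, or None if the ROM
    left the vector alone.
    """
    original = ivt[INT19]
    rom_init(ivt)
    hooked = ivt[INT19]
    if hooked is original:
        return None
    ivt[INT19] = original  # put the BIOS handler back
    return hooked          # remember it so this card can be booted later


def bios_int19():
    return "bios boot"


def card_int19():
    return "card boot"


def hijacking_rom(ivt):
    # A ROM that takes over the bootstrap vector during initialization.
    ivt[INT19] = card_int19


ivt = {INT19: bios_int19}
saved = init_option_rom(ivt, hijacking_rom)
```

With the hooked handler saved per card, the card becomes one more entry in the boot list: selecting it means calling the saved handler instead of the BIOS one.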
Whoever needs scsi boot should add it to seabios too.
I don't disagree.
I think the best thing to do is to let SeaBIOS create a boot order table that contains descriptive information and then advertise that to QEMU.
QEMU can then try to associate the list of bootable devices with it's own set of devices and select a preferred order that it can then give back to SeaBIOS. SeaBIOS can then present that list to the user for additional refinement.
Really, this kind of comes down to having a data structure that anything (Qemu, SeaBIOS and if needed the guest OS) can read and modify as needed.
-hpa
H. Peter Anvin wrote:
The BBS standard actually documents how to deal with that -- it pretty much works out to "let the card initialize, then see if it mucked with int19, and then put int19 back... if we want to run that card, then we invoke the int19 that the card set up."
The BIOS Boot Specification, Version 1.01 from January 11, 1996 seems not to recommend this:
3.4 Legacy IPL Devices
"Legacy IPL devices will be allowed to take control of the system (via hooking interrupts) in both Legacy and PnP systems. The Plug and Play BIOS specification recommends that Legacy devices that hook a bootstrap interrupt such as INT 19h, 18h, or 13h have the interrupt re-captured by the BIOS. This is not done because grabbing an interrupt vector back after a device has hooked it can produce unpredictable results. Further, by allowing the card to take control, the behavior of these Legacy cards will be the same on both PnP and Legacy machines."
6.8 Notes on the POST Process
"The Plug and Play BIOS Specification says that if a Legacy IPL device's option ROM captures INT 18h or INT 19h, the BIOS should save this vector and then restore the original one put there by the BIOS. The BIOS Boot Specification deviates from this in that these vectors are not recaptured after each Legacy option ROM returns from initialization. That would be considered unsafe."
Sebastian
On 10/11/2010 02:41 PM, Sebastian Herbszt wrote:
The BIOS Boot Specification, Version 1.01 from January 11, 1996 seems not to recommend this:
3.4 Legacy IPL Devices
"Legacy IPL devices will be allowed to take control of the system (via hooking interrupts) in both Legacy and PnP systems. The Plug and Play BIOS specification recommends that Legacy devices that hook a bootstrap interrupt such as INT 19h, 18h, or 13h have the interrupt re-captured by the BIOS. This is not done because grabbing an interrupt vector back after a device has hooked it can produce unpredictable results. Further, by allowing the card to take control, the behavior of these Legacy cards will be the same on both PnP and Legacy machines."
6.8 Notes on the POST Process
"The Plug and Play BIOS Specification says that if a Legacy IPL device's option ROM captures INT 18h or INT 19h, the BIOS should save this vector and then restore the original one put there by the BIOS. The BIOS Boot Specification deviates from this in that these vectors are not recaptured after each Legacy option ROM returns from initialization. That would be considered unsafe."
Sorry, you're right -- I confused the PNPBIOS spec with the BBS spec (and compounded the error by correctly remembering that BBS overrides PNPBIOS).
-hpa
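The difference between the two policies quoted above can be sketched with a toy model. This is only an illustration under stated assumptions: the interrupt vector table is a plain dict, option ROMs are Python callables, and every name here (`run_option_roms`, `rom_hooking_int19`, the handler strings) is hypothetical and not taken from SeaBIOS or QEMU.

```python
# Toy model of INT 19h handling during option ROM initialization.
# All names are hypothetical; real firmware works on a real-mode IVT.

BIOS_INT19 = "bios_int19_handler"


def rom_hooking_int19(ivt):
    """A legacy option ROM that hooks INT 19h during its init."""
    ivt[0x19] = "card_int19_handler"


def rom_well_behaved(ivt):
    """An option ROM that leaves the bootstrap vector alone."""


def run_option_roms(roms, recapture):
    """Initialize each ROM in turn.

    recapture=True  models the PnP BIOS recommendation: restore the vector
                    after a ROM hooked it and remember the displaced handler.
    recapture=False models the BBS text quoted above: leave the hook alone.
    Returns (final INT 19h handler, list of displaced handlers).
    """
    ivt = {0x19: BIOS_INT19}
    displaced = []
    for rom in roms:
        saved = ivt[0x19]
        rom(ivt)
        if ivt[0x19] != saved and recapture:
            displaced.append(ivt[0x19])
            ivt[0x19] = saved
    return ivt[0x19], displaced


roms = [rom_well_behaved, rom_hooking_int19]
pnp_result = run_option_roms(roms, recapture=True)
bbs_result = run_option_roms(roms, recapture=False)
```

With `recapture=False` the card's handler stays installed, which is the case where a passthrough ROM ends up owning the boot path regardless of any configured order.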
On Mon, Oct 11, 2010 at 02:15:26PM -0700, H. Peter Anvin wrote:
I don't disagree.
I think the best thing to do is to let SeaBIOS create a boot order table that contains descriptive information and then advertise that to QEMU.
QEMU can then try to associate the list of bootable devices with its own set of devices and select a preferred order that it can then give back to SeaBIOS. SeaBIOS can then present that list to the user for additional refinement.
Really, this kind of comes down to having a data structure that anything (Qemu, SeaBIOS and if needed the guest OS) can read and modify as needed.
But then QEMU and seabios will have to have shared storage they can both write to. And this shared storage is part of the VM now, so you need to carry it around when you move your VM elsewhere.
-- Gleb.
Gleb Natapov wrote:
Really, this kind of comes down to having a data structure that anything (Qemu, SeaBIOS and if needed the guest OS) can read and modify as needed.
But then QEMU and seabios will have to have shared storage they can both write to. And this shared storage is part of the VM now, so you need to carry it around when you move your VM elsewhere.
Yep. Isn't BIOS stuff great?
//Peter
On 10/12/2010 01:01 AM, Gleb Natapov wrote:
But then QEMU and seabios will have to have shared storage they can both write to. And this shared storage is part of the VM now, so you need to carry it around when you move your VM elsewhere.
Yes, and it's part of real hardware, too. It's usually called "the CMOS", short for CMOS RAM.
-hpa
On Tue, Oct 12, 2010 at 09:33:16AM -0700, H. Peter Anvin wrote:
Yes, and it's part of real hardware, too. It's usually called "the CMOS", short for CMOS RAM.
On real hardware it is not shared between HW and BIOS. It is written/read only by the BIOS. In qemu it is not persistent and is generated for each qemu invocation. Previously it was used to pass config params from qemu to the BIOS (and some legacy params are still passed that way), but we moved to a better interface for that (firmware config).
-- Gleb.
On real hardware it is shared between BIOS and the OS, actually.
On Tue, Oct 12, 2010 at 10:35:51AM -0700, H. Peter Anvin wrote:
On real hardware it is shared between BIOS and the OS, actually.
Guest OS can write in qemu CMOS too. But what is it useful for? Most of its content is not standard AFAIK.
-- Gleb.
On 10/12/2010 10:41 AM, Gleb Natapov wrote:
On Tue, Oct 12, 2010 at 10:35:51AM -0700, H. Peter Anvin wrote:
On real hardware it is shared between BIOS and the OS, actually.
Guest OS can write in qemu CMOS too. But what is it useful for? Most of its content is not standard AFAIK.
This is true to some extent -- there is some standard content, and some further can be described via ACPI tables. However, my point was mostly that it is an existing model for nonvolatile storage which also works on hardware (and is vastly simpler albeit smaller in size than ESCD).
-hpa
On Tue, Oct 12, 2010 at 10:45:58AM -0700, H. Peter Anvin wrote:
This is true to some extent -- there is some standard content, and some further can be described via ACPI tables. However, my point was mostly that it is an existing model for nonvolatile storage which also works on hardware (and is vastly simpler albeit smaller in size than ESCD).
And my point is: why would we want nonvolatile storage for BIOS settings in qemu? It doesn't provide anything that can't be done through the command line and configured nicely by virt-manager, and it introduces one more file to carry around with your VM. And if the idea is to create it on the fly, then it is no longer nonvolatile and is no better than fw_cfg. If we want nonvolatile storage for some reason, I would agree that CMOS is a good candidate for that except for its size. How much can you fit into 128 bytes? Less than one tweet.
-- Gleb.
On 10/12/2010 12:06 PM, Gleb Natapov wrote:
This is true to some extent -- there is some standard content, and some further can be described via ACPI tables. However, my point was mostly that it is an existing model for nonvolatile storage which also works on hardware (and is vastly simpler albeit smaller in size than ESCD).
And my point is: why would we want nonvolatile storage for BIOS settings in qemu? It doesn't provide anything that can't be done through the command line and configured nicely by virt-manager, and it introduces one more file to carry around with your VM. And if the idea is to create it on the fly, then it is no longer nonvolatile and is no better than fw_cfg. If we want nonvolatile storage for some reason, I would agree that CMOS is a good candidate for that except for its size. How much can you fit into 128 bytes? Less than one tweet.
128 bytes isn't a hard limit; the original PC/AT actually had only 64 bytes, but the "standard" interface allows 128 bytes. However, there is a semi-standard (common extension) interface using ports 72/73 which allows 256 bytes, and quite a few chipsets have added further extensions.
The ACPI specification recognizes three interfaces as standard: PC/AT (64 bytes, even though 128 bytes is available on a lot of platforms), PIIX4 (256 bytes), and Dallas Semiconductor ("256 bytes or more"). The interface for the latter isn't well cited in the ACPI spec, but I'm guessing this is referring to the DS17885 series of chips, which can have up to 8K CMOS using a bank-switched scheme which presents 128 bytes at a time (thus accessible via only the standard 70/71 ports.)
-hpa
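As a rough illustration of the index/data access model and the sizes mentioned above, here is a small simulation. It is a sketch under assumptions: the `CmosRam` class and its `outb`/`inb` methods are hypothetical stand-ins for port I/O, only the 128-byte window behind ports 0x70/0x71 is modelled, and the bank-switching registers of any particular chip are not modelled at all.

```python
class CmosRam:
    """Toy index/data register pair in the style of ports 0x70/0x71."""

    INDEX_PORT = 0x70
    DATA_PORT = 0x71
    WINDOW = 128  # bytes visible through the standard interface

    def __init__(self):
        self.cells = bytearray(self.WINDOW)
        self.index = 0

    def outb(self, port, value):
        if port == self.INDEX_PORT:
            # In this model only the low 7 bits select a cell.
            self.index = value & 0x7F
        elif port == self.DATA_PORT:
            self.cells[self.index] = value & 0xFF
        else:
            raise ValueError("unmodelled port")

    def inb(self, port):
        if port != self.DATA_PORT:
            raise ValueError("unmodelled port")
        return self.cells[self.index]


cmos = CmosRam()
cmos.outb(0x70, 0x3D)   # select a cell (the offset is arbitrary here)
cmos.outb(0x71, 0x21)   # store one byte
cmos.outb(0x70, 0x3D)
stored = cmos.inb(0x71)

# Size arithmetic for a bank-switched part with a 128-byte window.
banks_for_8k = (8 * 1024) // CmosRam.WINDOW
addressable_with_16_bits = 2 ** 16
```

An 8K part seen through a 128-byte window needs 64 banks, and 16 address bits cover 65536 bytes, which matches the 64K figure given in the follow-up message.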
On 10/13/2010 12:17 PM, H. Peter Anvin wrote:
The ACPI specification recognizes three interfaces as standard: PC/AT (64 bytes, even though 128 bytes is available on a lot of platforms), PIIX4 (256 bytes), and Dallas Semiconductor ("256 bytes or more"). The interface for the latter isn't well cited in the ACPI spec, but I'm guessing this is referring to the DS17885 series of chips, which can have up to 8K CMOS using a bank-switched scheme which presents 128 bytes at a time (thus accessible via only the standard 70/71 ports.)
FWIW, the DS17885 scheme actually allows addressing up to 64K; 8K is the maximum that DS produced with this particular interface as far as I know, but there are 16 address bits available.
-hpa
On 10/13/2010 01:00 PM, H. Peter Anvin wrote:
On 10/13/2010 12:17 PM, H. Peter Anvin wrote:
The ACPI specification recognizes three interfaces as standard: PC/AT (64 bytes, even though 128 bytes is available on a lot of platforms), PIIX4 (256 bytes), and Dallas Semiconductor ("256 bytes or more"). The interface for the latter isn't well cited in the ACPI spec, but I'm guessing this is referring to the DS17885 series of chips, which can have up to 8K CMOS using a bank-switched scheme which presents 128 bytes at a time (thus accessible via only the standard 70/71 ports.)
FWIW, the DS17885 scheme actually allows addressing up to 64K; 8K is the maximum that DS produced with this particular interface as far as I know, but there are 16 address bits available.
-hpa
I think this is the relevant application note:
http://www.maxim-ic.com/app-notes/index.mvp/id/77
-hpa
On 10/11/2010 07:07 AM, Gerd Hoffmann wrote:
Hi,
Floppy? Yes, I think we do.
And *one* floppy controller can actually have *two* drives connected, although booting from 'b' doesn't work IIRC.
and since one PCI device may control more than one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci) device type (ATA/SCSI/VIRTIO) device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
If we had a qdev ID for all devices (which I think we should have anyway), would this work or is a string not really handy enough?
I think we'll need support for that in all drivers supporting boot anyway, i.e. have virtio-blk-pci register a boot edd when configured that way. Question is how to configure this. We could attach the boot index to either the blockdev or the device, i.e.
-blockdev foo,bootindex=1
or
-device virtio-blk-pci,bootindex=1
The latter looks more useful to me, boot order is guest state imho, also it might expand to PXE booting nicely, i.e.
-device e1000,bootindex=2
Which turns up the question how this plays with option roms. seabios should be able to order at pci device level at least when booting via (pci) option rom. OK for nics. Booting from a scsi disk with id != 0 using the lsi rom is probably impossible though.
What about non-pci option roms? The one used for -kernel for example?
-kernel hijacks int19 so it cannot participate in any kind of boot order. It's either present (and therefore the bootable disk) or not present.
Regards,
Anthony Liguori
cheers, Gerd
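The per-device bootindex idea, combined with the EDD-style description from the original proposal, can be sketched as a small data structure. This is a minimal sketch, not the QEMU implementation: the `BootDevice` class, its field names and the `boot_order` helper are assumptions made for illustration, and the thread does not settle what happens to devices without a bootindex (here they are simply left out).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BootDevice:
    bus_type: str            # "isa" or "pci"
    bus_address: str         # ISA base address or PCI bus:slot.function
    device_type: str         # e.g. "ata", "scsi", "virtio", "nic"
    device_path: str = ""    # master/slave for ATA, LUN for SCSI, empty for virtio
    bootindex: Optional[int] = None


def boot_order(devices):
    """Return devices that have a bootindex, lowest index first."""
    indexed = [d for d in devices if d.bootindex is not None]
    return sorted(indexed, key=lambda d: d.bootindex)


devices = [
    BootDevice("pci", "00:05.0", "nic", bootindex=2),
    BootDevice("isa", "0x1f0", "ata", device_path="master"),
    BootDevice("pci", "00:04.0", "virtio", bootindex=1),
]
ordered = [d.device_type for d in boot_order(devices)]
```

Attaching the index to the device rather than to the block backend keeps disks and NICs in one ordering, which is the point made in favour of the `-device ...,bootindex=N` form.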
On Mon, Oct 11, 2010 at 02:51:09PM -0500, Anthony Liguori wrote:
-kernel hijacks int19 so it cannot participate in any kind of boot order. It's either present (and therefore the bootable disk) or not present.
-kernel is special enough to not care. Although it would be nice to fix it to behave like a regular boot ROM.
-- Gleb.
On 10/11/2010 12:51 PM, Anthony Liguori wrote:
-kernel hijacks int19 so it cannot participate in any kind of boot order. It's either present (and therefore the bootable disk) or not present.
That's a misdesign, though: it should be able to participate in BBS as a BEV.
-hpa
On 11.10.2010 12:18, ext Gleb Natapov wrote:
Currently if VM is started with multiple disks it is almost impossible to guess which one of them will be used as boot device especially if there is a mix of ATA/virtio/SCSI devices. Essentially BIOS decides the order and without looking into the code you can't tell what the order will be (and in qemu-kvm if boot=on is used it brings even more havoc). We should allow fine-grained control of boot order from qemu command line, or as a minimum control what device will be used for booting.
To do that along with inventing syntax to specify boot order on qemu command line we need to communicate boot order to seabios via fw_cfg interface. For that we need to have a way to unambiguously specify a disk from qemu to seabios. PCI bus address is not enough since not all devices are PCI (do we care about them?) and since one PCI device may control more than one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci) device type (ATA/SCSI/VIRTIO) device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
Will it cover all use cases? Any other ideas?
I think this also applies to network booting via gPXE. Usually our VMs have 4 NICs, mixed virtio-net and PCI pass-through. 2 of the NICs shall be used for booting, even if there are hard disks or floppy disks connected. This scenario is currently almost impossible to configure.
I already posted a patch 'new parameter boot=on|off for "-net nic" and "-device" NIC devices' which should solve that problem for us. The patch is still under discussion. Of course passing detailed boot device information to SeaBIOS would be the best solution.
Any ideas about qemu command line syntax? Maybe somebody wants to implement it? :)
-- Gleb.
On Mon, Oct 11, 2010 at 01:16:00PM +0200, Bernhard Kohl wrote:
I think this also applies to network booting via gPXE. Usually our VMs have 4 NICs, mixed virtio-net and PCI pass-through. 2 of the NICs shall be used for booting, even if there are hard disks or floppy disks connected. This scenario is currently almost impossible to configure.
I already posted a patch 'new parameter boot=on|off for "-net nic" and "-device" NIC devices' which should solve that problem for us. The patch is still under discussion. Of course passing detailed boot device information to SeaBIOS would be the best solution.
Yeah, forgot to specify net booting specifically. Do we emulate usb storage? If yes it can be bootable too. Sorry, I missed your patch initially, but now looking at it I do not think this is the right approach. It resembles the extboot hack to me. We shouldn't play games with ROMs to specify boot order. The BIOS is able to handle this by itself; we just need to help it. A NIC can be specified by its bus address and maybe a virtual function (for SR-IOV devices?). Seabios will be able to find which option ROM it should run from there.
-- Gleb.
On Mon, Oct 11, 2010 at 02:08:13PM +0200, Gleb Natapov wrote:
On Mon, Oct 11, 2010 at 01:16:00PM +0200, Bernhard Kohl wrote:
I think this also applies to network booting via gPXE. Usually our VMs have 4 NICs, mixed virtio-net and PCI pass-through. 2 of the NICs shall be used for booting, even if there are hard disks or floppy disks connected. This scenario is currently almost impossible to configure.
I already posted a patch 'new parameter boot=on|off for "-net nic" and "-device" NIC devices' which should solve that problem for us. The patch is still under discussion. Of course passing detailed boot device information to SeaBIOS would be the best solution.
Yeah, forgot to specify net booting specifically. Do we emulate usb storage? If yes it can be bootable too.
USB emulation is supported by qemu, and being able to specify a boot from a USB drive should be supported IMO.
-Kevin
On Mon, Oct 11, 2010 at 12:16 PM, Bernhard Kohl bernhard.kohl@nsn.com wrote:
I think this also applies to network booting via gPXE. Usually our VMs have 4 NICs, mixed virtio-net and PCI pass-through. 2 of the NICs shall be used for booting, even if there are hard disks or floppy disks connected. This scenario is currently almost impossible to configure.
Here is a gPXE patch to support fw_cfg. You can pass gPXE script files from the host to gPXE inside the guest. This means you can boot specific NICs: http://patchwork.ozlabs.org/patch/43777/
Just wanted to post the link because it is related to the gPXE side of this discussion.
Stefan
On Mon, Oct 11, 2010 at 01:48:09PM +0100, Stefan Hajnoczi wrote:
Here is a gPXE patch to support fw_cfg. You can pass gPXE script files from the host to gPXE inside the guest. This means you can boot specific NICs: http://patchwork.ozlabs.org/patch/43777/
Just wanted to post the link because it is related to the gPXE side of this discussion.
Don't we load gPXE for each NIC and seabios passes PCI device to boot from when it invokes one of them?
-- Gleb.
2010/10/11 Gleb Natapov gleb@redhat.com:
Don't we load gPXE for each NIC and seabios passes PCI device to boot from when it invokes one of them?
SeaBIOS may do that but gPXE internally just probes all PCI devices. It does not take advantage of the PCI bus/addr/fn that was passed to the option ROM. A gPXE instance will try booting from each available NIC in sequence.
Stefan
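To make the bus/addr/fn point concrete, here is a hedged sketch of how a ROM could use the address it was handed instead of probing everything. It assumes the common 16-bit packing of a PCI address (bus in the high byte, device in bits 7..3, function in bits 2..0); the `decode_bdf` and `pick_nic` helpers and the NIC table are hypothetical and are not gPXE code.

```python
def decode_bdf(ax):
    """Split a 16-bit packed PCI address into (bus, device, function)."""
    bus = (ax >> 8) & 0xFF
    device = (ax >> 3) & 0x1F
    function = ax & 0x07
    return bus, device, function


def pick_nic(ax, nics):
    """Return the NIC at the address passed to the ROM, or None.

    nics maps (bus, device, function) tuples to a device name. A ROM that
    ignores ax and walks every entry in nics would behave like the
    "try each available NIC in sequence" case described above.
    """
    return nics.get(decode_bdf(ax))


nics = {
    (0, 3, 0): "e1000",
    (0, 4, 0): "virtio-net",
}
first = decode_bdf(0x0018)
chosen = pick_nic(0x0020, nics)
missing = pick_nic(0x0100, nics)
```

With a lookup like this, the boot order chosen by the BIOS would carry through to the NIC that is actually used.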
On Mon, Oct 11, 2010 at 04:52:31PM +0100, Stefan Hajnoczi wrote:
SeaBIOS may do that but gPXE internally just probes all PCI devices. It does not take advantage of the PCI bus/addr/fn that was passed to the option ROM. A gPXE instance will try booting from each available NIC in sequence.
Ah, thanks for the clarification. It looks like gPXE does the wrong thing here. Can this behaviour be changed by a compile-time option?
-- Gleb.
On 10/11/2010 10:52 AM, Stefan Hajnoczi wrote:
2010/10/11 Gleb Natapov gleb@redhat.com:
On Mon, Oct 11, 2010 at 01:48:09PM +0100, Stefan Hajnoczi wrote:
On Mon, Oct 11, 2010 at 12:16 PM, Bernhard Kohl bernhard.kohl@nsn.com wrote:
Am 11.10.2010 12:18, schrieb ext Gleb Natapov:
Currently if VM is started with multiple disks it is almost impossible to guess which one of them will be used as boot device especially if there is a mix of ATA/virtio/SCSI devices. Essentially BIOS decides the order and without looking into the code you can't tell what the order will be (and in qemu-kvm if boot=on is used it brings even more havoc). We should allow fine-grained control of boot order from qemu command line, or as a minimum control what device will be used for booting.
To do that along with inventing syntax to specify boot order on qemu command line we need to communicate boot order to seabios via fw_cfg interface. For that we need to have a way to unambiguously specify a disk from qemu to seabios. PCI bus address is not enough since not all devices are PCI (do we care about them?) and since one PCI device may control more than one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci), device type (ATA/SCSI/VIRTIO), device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
Will it cover all use cases? Any other ideas?
I think this also applies to network booting via gPXE. Usually our VMs have 4 NICs, mixed virtio-net and PCI pass-through. 2 of the NICs shall be used for booting, even if there are hard disks or floppy disks connected. This scenario is currently almost impossible to configure.
Here is a gPXE patch to support fw_cfg. You can pass gPXE script files from the host to gPXE inside the guest. This means you can boot specific NICs: http://patchwork.ozlabs.org/patch/43777/
Just wanted to post the link because it is related to the gPXE side of this discussion.
Don't we load gPXE for each NIC and seabios passes PCI device to boot from when it invokes one of them?
SeaBIOS may do that but gPXE internally just probes all PCI devices. It does not take advantage of the PCI bus/addr/fn that was passed to the option ROM. A gPXE instance will try booting from each available NIC in sequence.
It still registers a BEV entry though, no?
Does it at least try to boot from the PCI bus/addr/fn of the selected BEV entry?
Regards,
Anthony Liguori
Stefan
On Mon, Oct 11, 2010 at 12:01:58PM -0500, Anthony Liguori wrote:
On 10/11/2010 10:52 AM, Stefan Hajnoczi wrote:
2010/10/11 Gleb Natapov gleb@redhat.com:
On Mon, Oct 11, 2010 at 01:48:09PM +0100, Stefan Hajnoczi wrote:
On Mon, Oct 11, 2010 at 12:16 PM, Bernhard Kohl bernhard.kohl@nsn.com wrote:
Am 11.10.2010 12:18, schrieb ext Gleb Natapov:
Currently if VM is started with multiple disks it is almost impossible to guess which one of them will be used as boot device especially if there is a mix of ATA/virtio/SCSI devices. Essentially BIOS decides the order and without looking into the code you can't tell what the order will be (and in qemu-kvm if boot=on is used it brings even more havoc). We should allow fine-grained control of boot order from qemu command line, or as a minimum control what device will be used for booting.
To do that along with inventing syntax to specify boot order on qemu command line we need to communicate boot order to seabios via fw_cfg interface. For that we need to have a way to unambiguously specify a disk from qemu to seabios. PCI bus address is not enough since not all devices are PCI (do we care about them?) and since one PCI device may control more than one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci), device type (ATA/SCSI/VIRTIO), device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
Will it cover all use cases? Any other ideas?
I think this also applies to network booting via gPXE. Usually our VMs have 4 NICs, mixed virtio-net and PCI pass-through. 2 of the NICs shall be used for booting, even if there are hard disks or floppy disks connected. This scenario is currently almost impossible to configure.
Here is a gPXE patch to support fw_cfg. You can pass gPXE script files from the host to gPXE inside the guest. This means you can boot specific NICs: http://patchwork.ozlabs.org/patch/43777/
Just wanted to post the link because it is related to the gPXE side of this discussion.
Don't we load gPXE for each NIC and seabios passes PCI device to boot from when it invokes one of them?
SeaBIOS may do that but gPXE internally just probes all PCI devices. It does not take advantage of the PCI bus/addr/fn that was passed to the option ROM. A gPXE instance will try booting from each available NIC in sequence.
It still registers a BEV entry though, no?
Does it at least try to boot from the PCI bus/addr/fn of the selected BEV entry?
I think so. Kevin will know for sure.
-- Gleb.
On Mon, Oct 11, 2010 at 07:04:25PM +0200, Gleb Natapov wrote:
On Mon, Oct 11, 2010 at 12:01:58PM -0500, Anthony Liguori wrote:
On 10/11/2010 10:52 AM, Stefan Hajnoczi wrote:
SeaBIOS may do that but gPXE internally just probes all PCI devices. It does not take advantage of the PCI bus/addr/fn that was passed to the option ROM. A gPXE instance will try booting from each available NIC in sequence.
It still registers a BEV entry though, no?
Does it at least try to boot from the PCI bus/addr/fn of the selected BEV entry?
I think so. Kevin will know for sure.
SeaBIOS will register a BEV for each gPXE option rom it finds, and the user may choose which BEV to run first. I don't know if gPXE cares if different BEVs are run though.
-Kevin
On Mon, Oct 11, 2010 at 6:04 PM, Gleb Natapov gleb@redhat.com wrote:
On Mon, Oct 11, 2010 at 12:01:58PM -0500, Anthony Liguori wrote:
On 10/11/2010 10:52 AM, Stefan Hajnoczi wrote:
2010/10/11 Gleb Natapov gleb@redhat.com:
On Mon, Oct 11, 2010 at 01:48:09PM +0100, Stefan Hajnoczi wrote:
On Mon, Oct 11, 2010 at 12:16 PM, Bernhard Kohl bernhard.kohl@nsn.com wrote:
Am 11.10.2010 12:18, schrieb ext Gleb Natapov:
Currently if VM is started with multiple disks it is almost impossible to guess which one of them will be used as boot device especially if there is a mix of ATA/virtio/SCSI devices. Essentially BIOS decides the order and without looking into the code you can't tell what the order will be (and in qemu-kvm if boot=on is used it brings even more havoc). We should allow fine-grained control of boot order from qemu command line, or as a minimum control what device will be used for booting.
To do that along with inventing syntax to specify boot order on qemu command line we need to communicate boot order to seabios via fw_cfg interface. For that we need to have a way to unambiguously specify a disk from qemu to seabios. PCI bus address is not enough since not all devices are PCI (do we care about them?) and since one PCI device may control more than one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci), device type (ATA/SCSI/VIRTIO), device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
Will it cover all use cases? Any other ideas?
I think this also applies to network booting via gPXE. Usually our VMs have 4 NICs, mixed virtio-net and PCI pass-through. 2 of the NICs shall be used for booting, even if there are hard disks or floppy disks connected. This scenario is currently almost impossible to configure.
Here is a gPXE patch to support fw_cfg. You can pass gPXE script files from the host to gPXE inside the guest. This means you can boot specific NICs: http://patchwork.ozlabs.org/patch/43777/
Just wanted to post the link because it is related to the gPXE side of this discussion.
Don't we load gPXE for each NIC and seabios passes PCI device to boot from when it invokes one of them?
SeaBIOS may do that but gPXE internally just probes all PCI devices. It does not take advantage of the PCI bus/addr/fn that was passed to the option ROM. A gPXE instance will try booting from each available NIC in sequence.
It still registers a BEV entry though, no?
Yes.
Does it at least try to boot from the PCI bus/addr/fn of the selected BEV entry?
Not directly. It probes all PCI devices and tries them in bus/addr/fn order. If you have two identical NICs and only have the boot ROM on the second NIC, the first NIC will still try to network boot first.
Changing this behavior requires stashing away the bus/addr/fn and then using it later in gPXE's startup. It's possible but not implemented today.
Stefan
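The probing behaviour Stefan describes, and the possible change of stashing the bus/addr/fn handed to the option ROM, can be illustrated with a small Python sketch. This is not gPXE code: `Nic`, `probe_order` and the `preferred` parameter are hypothetical names introduced only for illustration.

```python
from typing import List, NamedTuple, Optional


class Nic(NamedTuple):
    """A PCI NIC identified by bus/addr/fn (illustrative model, not a gPXE type)."""
    bus: int
    addr: int
    fn: int


def probe_order(nics: List[Nic], preferred: Optional[Nic] = None) -> List[Nic]:
    """Return the order in which NICs would be tried for network boot.

    With preferred=None this mimics the behaviour described in the thread:
    every PCI NIC is tried in bus/addr/fn order, no matter which option ROM
    was invoked. If a bus/addr/fn was stashed, that NIC is tried first.
    """
    ordered = sorted(nics)  # named tuples compare as (bus, addr, fn)
    if preferred is not None and preferred in ordered:
        ordered.remove(preferred)
        ordered.insert(0, preferred)
    return ordered


nics = [Nic(0, 4, 0), Nic(0, 3, 0)]
# Without a stashed address the lower bus/addr/fn always goes first.
print(probe_order(nics))
# With the address of the invoked ROM stashed, that NIC is moved to the front.
print(probe_order(nics, preferred=Nic(0, 4, 0)))
```

The fallback to the remaining NICs after the preferred one is an assumption of this sketch; the thread does not say whether the other NICs should still be tried.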
On 10/11/2010 12:18 PM, Gleb Natapov wrote:
Currently if VM is started with multiple disks it is almost impossible to guess which one of them will be used as boot device especially if there is a mix of ATA/virtio/SCSI devices. Essentially BIOS decides the order and without looking into the code you can't tell what the order will be (and in qemu-kvm if boot=on is used it brings even more havoc). We should allow fine-grained control of boot order from qemu command line, or as a minimum control what device will be used for booting.
To do that along with inventing syntax to specify boot order on qemu command line we need to communicate boot order to seabios via fw_cfg interface. For that we need to have a way to unambiguously specify a disk from qemu to seabios. PCI bus address is not enough since not all devices are PCI (do we care about them?) and since one PCI device may control more than one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci), device type (ATA/SCSI/VIRTIO), device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
Will it cover all use cases? Any other ideas? Any ideas about qemu command line syntax? Maybe somebody wants to implement it? :)
Instead of fwcfg, we should store the boot order in the bios. This allows seabios to implement persistent boot selection and control boot order from within the guest.
On Mon, Oct 11, 2010 at 05:09:22PM +0200, Avi Kivity wrote:
On 10/11/2010 12:18 PM, Gleb Natapov wrote:
Currently if VM is started with multiple disks it is almost impossible to guess which one of them will be used as boot device especially if there is a mix of ATA/virtio/SCSI devices. Essentially BIOS decides the order and without looking into the code you can't tell what the order will be (and in qemu-kvm if boot=on is used it brings even more havoc). We should allow fine-grained control of boot order from qemu command line, or as a minimum control what device will be used for booting.
To do that along with inventing syntax to specify boot order on qemu command line we need to communicate boot order to seabios via fw_cfg interface. For that we need to have a way to unambiguously specify a disk from qemu to seabios. PCI bus address is not enough since not all devices are PCI (do we care about them?) and since one PCI device may control more than one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci), device type (ATA/SCSI/VIRTIO), device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
Will it cover all use cases? Any other ideas? Any ideas about qemu command line syntax? Maybe somebody wants to implement it? :)
Instead of fwcfg, we should store the boot order in the bios. This allows seabios to implement persistent boot selection and control boot order from within the guest.
It is not "instead of"; in the best case it is "in addition to". First of all, seabios does not currently have persistent storage, and second, I much prefer specifying the boot device from the command line instead of navigating bios menus. That is what we have to do on real HW because there is no other way to do it, but in virtualization we can do better.
-- Gleb.
On 10/11/2010 05:39 PM, Gleb Natapov wrote:
On Mon, Oct 11, 2010 at 05:09:22PM +0200, Avi Kivity wrote:
On 10/11/2010 12:18 PM, Gleb Natapov wrote:
Currently if VM is started with multiple disks it is almost impossible to guess which one of them will be used as boot device especially if there is a mix of ATA/virtio/SCSI devices. Essentially BIOS decides the order and without looking into the code you can't tell what the order will be (and in qemu-kvm if boot=on is used it brings even more havoc). We should allow fine-grained control of boot order from qemu command line, or as a minimum control what device will be used for booting.
To do that along with inventing syntax to specify boot order on qemu command line we need to communicate boot order to seabios via fw_cfg interface. For that we need to have a way to unambiguously specify a disk from qemu to seabios. PCI bus address is not enough since not all devices are PCI (do we care about them?) and since one PCI device may control more than one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci), device type (ATA/SCSI/VIRTIO), device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
Will it cover all use cases? Any other ideas? Any ideas about qemu command line syntax? Maybe somebody wants to implement it? :)
Instead of fwcfg, we should store the boot order in the bios. This allows seabios to implement persistent boot selection and control boot order from within the guest.
It is not "instead of"; in the best case it is "in addition to". First of all, seabios does not currently have persistent storage, and second, I much prefer specifying the boot device from the command line instead of navigating bios menus. That is what we have to do on real HW because there is no other way to do it, but in virtualization we can do better.
Ok. So fwcfg will have an option "do your default thing" which the bios can take as a hint to look in cmos memory.
On Mon, Oct 11, 2010 at 05:42:30PM +0200, Avi Kivity wrote:
On 10/11/2010 05:39 PM, Gleb Natapov wrote:
On Mon, Oct 11, 2010 at 05:09:22PM +0200, Avi Kivity wrote:
On 10/11/2010 12:18 PM, Gleb Natapov wrote:
Currently if VM is started with multiple disks it is almost impossible to guess which one of them will be used as boot device especially if there is a mix of ATA/virtio/SCSI devices. Essentially BIOS decides the order and without looking into the code you can't tell what the order will be (and in qemu-kvm if boot=on is used it brings even more havoc). We should allow fine-grained control of boot order from qemu command line, or as a minimum control what device will be used for booting.
To do that along with inventing syntax to specify boot order on qemu command line we need to communicate boot order to seabios via fw_cfg interface. For that we need to have a way to unambiguously specify a disk from qemu to seabios. PCI bus address is not enough since not all devices are PCI (do we care about them?) and since one PCI device may control more than one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci), device type (ATA/SCSI/VIRTIO), device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
Will it cover all use cases? Any other ideas? Any ideas about qemu command line syntax? Maybe somebody wants to implement it? :)
Instead of fwcfg, we should store the boot order in the bios. This allows seabios to implement persistent boot selection and control boot order from within the guest.
It is not "instead of"; in the best case it is "in addition to". First of all, seabios does not currently have persistent storage, and second, I much prefer specifying the boot device from the command line instead of navigating bios menus. That is what we have to do on real HW because there is no other way to do it, but in virtualization we can do better.
Ok. So fwcfg will have an option "do your default thing" which the bios can take as a hint to look in cmos memory.
Definitely. If qemu does not provide any info about boot order, the default logic should be used.
-- Gleb.
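The fallback agreed on in this exchange (an explicit order from fw_cfg wins, otherwise the BIOS does its default thing and may look in CMOS) could be sketched as follows. This is only an illustration in Python, not SeaBIOS code: `pick_boot_order` and its parameters are hypothetical, and a boot order stored in CMOS is itself an assumption, since the thread notes that seabios has no persistent storage today.

```python
from typing import List, Optional, Tuple


def pick_boot_order(
    fw_cfg_order: Optional[List[str]],
    cmos_order: Optional[List[str]],
    builtin_default: List[str],
) -> Tuple[str, List[str]]:
    """Return (source, order): fw_cfg first, then CMOS, then the built-in default."""
    if fw_cfg_order:  # qemu supplied an explicit order on the command line
        return "fw_cfg", list(fw_cfg_order)
    if cmos_order:    # hypothetical persistent selection made from inside the guest
        return "cmos", list(cmos_order)
    return "default", list(builtin_default)


# Nothing supplied by qemu and nothing stored: the default logic is used.
print(pick_boot_order(None, None, ["floppy", "cdrom", "disk"]))
# An order passed via fw_cfg takes precedence over a stored selection.
print(pick_boot_order(["virtio@01:13.0"], ["cdrom"], ["floppy", "cdrom", "disk"]))
```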
On 10/11/2010 08:09 AM, Avi Kivity wrote:
Instead of fwcfg, we should store the boot order in the bios. This allows seabios to implement persistent boot selection and control boot order from within the guest.
Arguably, what is really needed is a well-defined CMOS or ESCD interface that is exported via some kind of data structure. This would allow either preboot modification, BIOS-handled modification or guest OS modification of the configuration.
-hpa
On Mon, Oct 11, 2010 at 12:18:55PM +0200, Gleb Natapov wrote:
Currently if VM is started with multiple disks it is almost impossible to guess which one of them will be used as boot device especially if there is a mix of ATA/virtio/SCSI devices. Essentially BIOS decides the order and without looking into the code you can't tell what the order will be (and in qemu-kvm if boot=on is used it brings even more havoc). We should allow fine-grained control of boot order from qemu command line, or as a minimum control what device will be used for booting.
To do that along with inventing syntax to specify boot order on qemu command line we need to communicate boot order to seabios via fw_cfg interface. For that we need to have a way to unambiguously specify a disk from qemu to seabios. PCI bus address is not enough since not all devices are PCI (do we care about them?) and since one PCI device may control more than one disk (ATA slave/master, SCSI LUNs). We can do what EDD specification does. Describe disk as: bus type (isa/pci), address on a bus (16 bit base address for isa, b/s/f for pci), device type (ATA/SCSI/VIRTIO), device path (slave/master for ATA, LUN for SCSI, nothing for virtio)
That makes sense to me.
We could update SeaBIOS to give a short unique name to every BEV and BCV it finds based on the path to the device. (For example, something like "ata@01:13.0@0", "usb@1234:5678", "virtio@01:13.0", "rom@01:13.0".) Then qemu could pass in (via fw_cfg) a list of names that the user wishes to boot from. SeaBIOS can then prioritize those devices it finds that are also in the fw_cfg list.
Will it cover all use cases? Any other ideas? Any ideas about qemu command line syntax? Maybe somebody wants to implement it? :)
As for qemu command line - maybe just use the current ",boot=on" syntax, and have qemu map it into a "path name" for seabios?
-Kevin
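The naming and prioritization scheme Kevin outlines can be sketched in a few lines of Python. This is a rough illustration, not SeaBIOS or qemu code: the `DevicePath` class, its field names and the `prioritize` function are hypothetical, only PCI-attached devices are modelled, and the name format simply follows the examples given above ("ata@01:13.0@0", "virtio@01:13.0").

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DevicePath:
    """EDD-style description of a boot device (field names are illustrative)."""
    dev_type: str               # "ata", "virtio", "rom", ...
    pci_bdf: str                # PCI bus:slot.function, e.g. "01:13.0"
    unit: Optional[int] = None  # ATA master/slave or SCSI LUN; None for virtio

    def name(self) -> str:
        base = f"{self.dev_type}@{self.pci_bdf}"
        return base if self.unit is None else f"{base}@{self.unit}"


def prioritize(found: List[DevicePath], wanted: List[str]) -> List[DevicePath]:
    """Put devices named in `wanted` first, in the order given.

    Devices that are not mentioned keep their discovery order, because
    sorted() is stable and they all share the same rank.
    """
    rank = {n: i for i, n in enumerate(wanted)}
    return sorted(found, key=lambda d: rank.get(d.name(), len(wanted)))


found = [
    DevicePath("ata", "01:13.0", 0),
    DevicePath("virtio", "01:14.0"),
    DevicePath("rom", "01:15.0"),
]
wanted = ["rom@01:15.0", "virtio@01:14.0"]  # list that would arrive via fw_cfg
print([d.name() for d in prioritize(found, wanted)])
```

Names in the list that match no discovered device are simply ignored here; whether SeaBIOS should warn about them is left open.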