I'm using SeaBIOS (b0d61ec) to boot from a virtual NVMe controller that has relatively many namespaces (64). The exact error is:
/3ff9f000\ Start thread
|3ff9f000| Searching bootorder for: /pci@i0cf8/*@6
...
|3ff9f000| WARNING - Unable to allocate resource at nvme_controller_enable:632!
I changed the number of namespaces my controller reports to 1 and it worked fine. Is there an easy way to get around this or do I have to fix the code? I haven't looked at the code in detail, but I think we don't have to allocate the array of namespaces in nvme_controller_enable; instead, we can probe a namespace right before we attempt to boot from it (not sure where exactly this is done).
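(For context, the failing allocation is the upfront, per-controller namespace array; a paraphrased sketch of the pattern in src/hw/nvme.c, with field and label names approximate rather than copied from that exact revision:)

    /* Sketch of the allocation that trips the warning above: one struct per
     * reported namespace is reserved up front from the f-segment pool, whose
     * size is governed by BUILD_MIN_BIOSTABLE. */
    ctrl->ns = malloc_fseg(sizeof(*ctrl->ns) * ctrl->ns_count);
    if (!ctrl->ns) {
        warn_noalloc();   /* prints "WARNING - Unable to allocate resource at ..." */
        goto err_destroy_admin_sq;
    }
    memset(ctrl->ns, 0, sizeof(*ctrl->ns) * ctrl->ns_count);

With 64 namespaces the array alone can exhaust the pool, which is why reporting a single namespace makes the problem disappear.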
I figured out that increasing BUILD_MIN_BIOSTABLE from 2K to 16K leaves enough memory for the dynamic allocation. Is there any problem with this?
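(For reference, this is a one-line change; the 2048 default and, I believe, the src/config.h location are from the shipped tree:)

    /* src/config.h: minimum space reserved for BIOS tables and other
     * f-segment runtime allocations. */
    #define BUILD_MIN_BIOSTABLE 16384   /* default is 2048 */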
On Thu, Apr 01, 2021 at 12:42:18PM +0000, Thanos Makatos wrote:
> I changed the number of namespaces my controller reports to 1 and it
> worked fine. Is there an easy way to get around this or do I have to
> fix the code?
Well, you can try skipping non-bootable namespaces and using "qemu -boot strict=on". This already happens at the NVMe controller level (see nvme_controller_setup()).
An easy way out without actual code changes would be to use two NVMe controllers: one for the boot disk, one for all the others. Set bootindex for the boot disk only (and use strict=on, of course); SeaBIOS should then completely ignore the second controller. A sketch of that setup is below.
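(Illustrative invocation; the image names, ids, and the bootindex property on the nvme device are assumptions, not taken from the thread:)

    qemu-system-x86_64 \
        -boot strict=on \
        -drive file=boot.img,if=none,id=bootdisk \
        -device nvme,serial=boot,drive=bootdisk,bootindex=1 \
        -drive file=data.img,if=none,id=datadisk \
        -device nvme,serial=data,drive=datadisk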
HTH, Gerd
On 08 April 2021 at 11:10, Gerd Hoffmann wrote:
> Well, you can try skipping non-bootable namespaces and using "qemu
> -boot strict=on". This already happens at the NVMe controller level
> (see nvme_controller_setup()).
AFAIK this applies to the entire controller, not individual namespaces.
> An easy way out without actual code changes would be to use two NVMe
> controllers [...]; SeaBIOS should then completely ignore the second
> controller.
That's not an option for me: this will be a customer VM, so we don't know on which NS the OS will be installed.
In another email I said that increasing BUILD_MIN_BIOSTABLE by 8x solves the problem. Is there a problem with that solution?
Hi,
> > Well, you can try skipping non-bootable namespaces and using "qemu
> > -boot strict=on". This already happens at the NVMe controller level
> > (see nvme_controller_setup()).
>
> AFAIK this applies to the entire controller, not individual namespaces.
Current code, yes, but you could change the driver to do the same at the namespace level (similar to how virtio-scsi skips non-bootable disks) ...
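(A rough sketch of what that could look like; the shape mirrors the controller-level check mentioned above, but bootprio_find_nvme_ns() is a hypothetical helper that would need adding to boot.c:)

    /* Hypothetical namespace-level skip. Only probe/allocate a namespace
     * if strict boot is off or the namespace appears in the boot order. */
    static void nvme_probe_ns(struct nvme_ctrl *ctrl, u32 ns_id)
    {
        if (is_bootprio_strict() &&
            bootprio_find_nvme_ns(ctrl->pci, ns_id) < 0)
            return;  /* non-bootable namespace: no f-segment allocation */
        /* ... identify the namespace and register its drive ... */
    }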
> > An easy way out without actual code changes would be to use two
> > NVMe controllers [...]
>
> That's not an option for me: this will be a customer VM, so we don't
> know on which NS the OS will be installed.
... except that it doesn't help much if you don't know which NS the OS is installed on.
> In another email I said that increasing BUILD_MIN_BIOSTABLE by 8x
> solves the problem. Is there a problem with that solution?
Red Hat increases BUILD_MIN_BIOSTABLE too (only 4x, though).
I think this was discussed before, but Google doesn't find me anything, so I'm not sure why BUILD_MIN_BIOSTABLE hasn't been increased upstream. Maybe it doesn't work with some configurations due to running out of address space. Should that be the case, a config option could be a way out.
Kevin?
take care, Gerd
On Thu, Apr 08, 2021 at 01:32:47PM +0200, Gerd Hoffmann wrote:
> I think this was discussed before, but Google doesn't find me
> anything, so I'm not sure why BUILD_MIN_BIOSTABLE hasn't been
> increased upstream. Maybe it doesn't work with some configurations
> due to running out of address space. Should that be the case, a
> config option could be a way out.
I don't recall discussing BUILD_MIN_BIOSTABLE and there isn't history of it on the mailing list.
I get the following on the current build:
Fixed space: 0xe05b-0x10000  total: 8101  slack: 10  Percent slack: 0.1%
16bit size:           38048
32bit segmented size:  2292
32bit flat size:      46540
32bit flat init size: 84576
Lowmem size:           2240
f-segment var size:    1248
...
Total size: 181248  Fixed: 88128  Free: 80896  (used 69.1% of 256KiB rom)
The main constraint is that the "16bit", "32bit segmented", "f-segment var", and biostable sections must fit in the f-segment (~42K today). The secondary constraint is on option rom space - the above plus "32bit flat", "lowmem", any dynamic low memory allocations, and option roms must all fit in 256KiB (starting at ~90K today). The third constraint is that the image as a whole (including biostable space) must fit in 256KiB (~181K today). Finally, there is the "corner case" where if someone does not select RELOCATE_INIT, then everything (including dynamic memory and option roms) must fit in 256KiB.
So, it does seem like we have space available to increase BUILD_MIN_BIOSTABLE. The challenge is mostly in managing risks with respect to other failure cases.
Is QEMU still shipping images limited to 128KiB? If so, that would be another restriction.
-Kevin
Hi,
> Is QEMU still shipping images limited to 128KiB? If so, that would be
> another restriction.
Ah, right, maybe that was the reason.
Yes, there still is a 128k variant, for backward compatibility with qemu version 1.7 & older.
Backward compatibility support for qemu version 1.3 & older has already been removed, so it could very well be that this will be deprecated and removed too soonish, but for now it is still needed.
I guess that implies a config option is a good idea, so we can use different sizes for the 128k and the 256k version.
take care, Gerd
On 14 April 2021 at 16:09, Kevin O'Connor wrote:

> So, it does seem like we have space available to increase
> BUILD_MIN_BIOSTABLE. The challenge is mostly in managing risks with
> respect to other failure cases.
Regarding the failure cases: will things break at build time (with BUILD_MIN_BIOSTABLE=16K), e.g.:
[seabios] Error! ROM doesn't fit (135584 > 131072)
[seabios] You have to either increase the size (CONFIG_ROM_SIZE)
[seabios] or turn off some features (such as hardware support not
[seabios] needed) to make it fit. Trying a more recent gcc version
[seabios] might work too.
[seabios] make: *** [out/bios.bin.prep] Error 1
Or do we expect undefined behavior at run time?
Also, according to the NVMe spec there can be 2^32 namespaces, which is a lot. I did the following tests:
BUILD_MIN_BIOSTABLE  namespaces  works?
16K                  128         yes
16K                  256         no
32K                  256         yes
32K                  512         no
64K                  512         guest goes into reset loop
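(The boundaries in the table above are consistent with a fixed per-namespace f-segment cost somewhere in the 64-128 byte range, plus the pool's other users; a standalone back-of-envelope check, where the 128-byte figure is an assumption rather than a measured sizeof:)

    #include <stdio.h>

    /* If each probed namespace costs ~128 bytes of BIOSTABLE space
     * (assumption), the namespace array alone needs: */
    int main(void)
    {
        const unsigned per_ns = 128;                /* assumed bytes/namespace */
        const unsigned counts[] = { 128, 256, 512 };
        for (unsigned i = 0; i < sizeof(counts)/sizeof(counts[0]); i++)
            printf("%3u namespaces -> %2u KiB\n",
                   counts[i], counts[i] * per_ns / 1024);
        return 0;                                   /* 16, 32, 64 KiB */
    }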
256 namespaces is not an insanely huge number.
Hi,
> Regarding the failure cases: will things break at build time (with
> BUILD_MIN_BIOSTABLE=16K) [...] or do we expect undefined behavior at
> run time?
Running out of memory can lead to undefined behavior at run time, depending on which allocations fail. The typical error pattern is that SeaBIOS can't initialize all devices, leading to boot failures.
> 256 namespaces is not an insanely huge number.
Well, back in the '80s, when the BIOS interfaces were created, 256 was an insanely huge number ...
Given we have a number of real-mode constraints for compatibility reasons (like some data structures having to live in the f-segment), there is no easy way out; we simply can't support an unlimited number of disks (which, btw, is one of the reasons why the "skip non-bootable disks" code exists).
Of course there is the option to leave the '80s behind and go for UEFI.
take care, Gerd
On 16 April 2021 at 08:15, Gerd Hoffmann wrote:

> Given we have a number of real-mode constraints for compatibility
> reasons (like some data structures having to live in the f-segment),
> there is no easy way out; we simply can't support an unlimited number
> of disks (which, btw, is one of the reasons why the "skip
> non-bootable disks" code exists).
SeaBIOS only tries to boot from the first namespace; if that fails, other namespaces aren't tried (https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/thread/72LFLT7...). If we stick to the current behavior, then there's no reason to probe any namespace other than the first one. That greatly reduces memory requirements, so we should be able to support the maximum number of namespaces. A minimal sketch of the idea follows.
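(Illustrative only; the probe helper name is not from the driver:)

    /* Sketch: probe just namespace 1 up front, matching the current
     * "boot from the first namespace only" behavior. One struct is
     * allocated instead of an identify->nn-sized array. */
    static void nvme_controller_enable(struct nvme_ctrl *ctrl)
    {
        /* ... admin queue setup, identify controller ... */
        nvme_probe_ns(ctrl, /*ns_id=*/1);
    }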
> Of course there is the option to leave the '80s behind and go for UEFI.
Or we can just live with it as you suggest and simply document it 😊.
Sorry for the thread necro, but CoreOS devs are hitting this issue too, with plain emulated qemu NVMe. Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=1963255
Originally a `-device nvme,...` would report only one namespace, and SeaBIOS could boot from it fine. After '7f0f1acedf hw/block/nvme: support multiple namespaces', the device now reports 256 namespaces, and SeaBIOS fails as described in this thread.
Where can I file this so it doesn't get lost? Anyone know a qemu command line workaround?
Thanks, Cole
Hi,
> Where can I file this so it doesn't get lost? Anyone know a qemu
> command line workaround?
Does https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/thread/2Q7NPH7... help?
take care, Gerd
On 5/25/21 9:23 AM, Gerd Hoffmann wrote:
> Does
> https://mail.coreboot.org/hyperkitty/list/seabios@seabios.org/thread/2Q7NPH7...
> help?
No, tested with qemu.git. Plus, the code the patch touches in nvme_controller_enable comes after where the alloc failure happens.
- Cole
On 25 May 2021 at 17:01, Cole Robinson wrote:

> No, tested with qemu.git. Plus, the code the patch touches in
> nvme_controller_enable comes after where the alloc failure happens.
I haven't upstreamed the relevant QEMU patch yet as I'm waiting for the SeaBIOS part to be reviewed first.
The alloc failure can be avoided by setting BUILD_MIN_BIOSTABLE to 32K.
On 5/25/21 12:06 PM, Thanos Makatos wrote:
> The alloc failure can be avoided by setting BUILD_MIN_BIOSTABLE to
> 32K.
Okay, I just tried that, on top of master. It's still not enough for qemu's 256-namespace default; same reported error. If I override it with `identify->nn = 128;` in nvme_controller_enable then things start working.
Thanks, Cole
Hi,
> Okay, I just tried that, on top of master. It's still not enough for
> qemu's 256-namespace default; same reported error. If I override it
> with `identify->nn = 128;` in nvme_controller_enable then things
> start working.
Oh well. nvme allocates 256 nvme_namespace structs even if 255 of them are not active. We can certainly do better than that; a patch will follow shortly ... (a paraphrased sketch of the direction is below).
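(Paraphrased sketch of on-demand allocation; field and helper names are approximate and not copied from the posted patch:)

    /* Identify each namespace first and allocate its struct only if it
     * is active (nonzero size), instead of reserving identify->nn
     * structs up front. */
    static void nvme_probe_ns(struct nvme_ctrl *ctrl, u32 ns_id)
    {
        struct nvme_identify_ns *id = nvme_admin_identify_ns(ctrl, ns_id);
        if (!id || id->nsze == 0)
            return;                     /* inactive namespace: skip */
        struct nvme_namespace *ns = malloc_fseg(sizeof(*ns));
        if (!ns) {
            warn_noalloc();
            return;
        }
        /* ... fill in block size/count from id and register the drive ... */
    }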
take care, Gerd
Hi,
> I haven't upstreamed the relevant QEMU patch yet as I'm waiting for
> the SeaBIOS part to be reviewed first.
Just send it to qemu-devel. The usual process is that SeaBIOS adapts to QEMU changes, not the other way around.
take care, Gerd