On 12/17/2011 09:25 AM, Richard W.M. Jones wrote:
On Sat, Dec 17, 2011 at 09:22:45AM -0600, Anthony Liguori wrote:
I've narrowed it down even further, to the presence or absence of '-vga cirrus'. If you add '-vga cirrus' to the above command line, the guest will boot successfully.
Confirmed: Adding -vga cirrus to the command line cures it too.
That's a strange one :-)
vga sticks out a bit because it's one of the few places where we treat device memory as ram as a performance optimization.
The only time vga has been touched in between v0.15 and v1.0 was during the introduction of the memory API.
It's this commit:
commit d67c3f2cd92aed2247bfa8a9da61a902b7b2ff09
Author: Gerd Hoffmann <kraxel@redhat.com>
Date:   Wed Aug 10 17:34:13 2011 +0200

    seabios: update to master

    commit 8e301472e324b6d6496d8b4ffc66863e99d7a505

    user visible changes in seabios:
     * ahci is enabled by default (and thus in this build).
     * bootorder support for ahci.
     * two-pass pci allocator (orders bars by size for better packing).

    Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

:040000 040000 76eb0c81b76563b55cb2bb5c484ccd48b8cfcded 5ec0d65d3a763a5566fe1f4c86269cad6d671020 M pc-bios
:040000 040000 a5a7ea6e297c1e7490b0a2c28a06ce56e5be9449 78adb664d3ea82f1a4dd5ec239887ac5b0168a7f M roms
It can be reproduced by using virtio and -vga none with a number of PCI devices. The line below is what I used to bisect and reproduce 100% of the time. It's a 64-bit Fedora 15 guest.
$ qemu-system-x86_64 -drive file=/home/anthony/images/fedora.img,if=none,snapshot=on,id=hd0 -device virtio-balloon-pci,addr=03.0 -device virtio-blk-pci,addr=04.0,drive=hd0 -kernel ~/vmlinuz-2.6.38.6-26.rc1.fc15.x86_64 -initrd ~/initramfs-2.6.38.6-26.rc1.fc15.x86_64.img -append "root=/dev/mapper/VolGroup-lv_root rd_LVM_LV=VolGroup/lv_root rd_LVM_LV=VolGroup/lv_swap ro console=ttyS0 selinux=0" -nographic -nodefconfig -m 1G -no-reboot -no-hpet -device virtio-serial -chardev socket,path=/tmp/foo.sock,id=channel0,server,nowait -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 -nodefaults -serial stdio -enable-kvm
My guess is that it has something to do with the changes to the PCI allocator. I've confirmed that reverting this commit fixes the problem.
Regards,
Anthony Liguori
Rich.
On Sat, Dec 17, 2011 at 10:24:07AM -0600, Anthony Liguori wrote:
On 12/17/2011 09:25 AM, Richard W.M. Jones wrote:
On Sat, Dec 17, 2011 at 09:22:45AM -0600, Anthony Liguori wrote:
I've narrowed it down even further, to the presence or absence of '-vga cirrus'. If you add '-vga cirrus' to the above command line, the guest will boot successfully.
Confirmed: Adding -vga cirrus to the command line cures it too.
That's a strange one :-)
vga sticks out a bit because it's one of the few places where we treat device memory as ram as a performance optimization.
The only time vga has been touched in between v0.15 and v1.0 was during the introduction of the memory API.
It's this commit:
commit d67c3f2cd92aed2247bfa8a9da61a902b7b2ff09
Author: Gerd Hoffmann <kraxel@redhat.com>
Date:   Wed Aug 10 17:34:13 2011 +0200

    seabios: update to master

    commit 8e301472e324b6d6496d8b4ffc66863e99d7a505

    user visible changes in seabios:
     * ahci is enabled by default (and thus in this build).
     * bootorder support for ahci.
     * two-pass pci allocator (orders bars by size for better packing).

    Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

:040000 040000 76eb0c81b76563b55cb2bb5c484ccd48b8cfcded 5ec0d65d3a763a5566fe1f4c86269cad6d671020 M pc-bios
:040000 040000 a5a7ea6e297c1e7490b0a2c28a06ce56e5be9449 78adb664d3ea82f1a4dd5ec239887ac5b0168a7f M roms
It can be reproduced by using virtio and -vga none with a number of PCI devices. The line below is what I used to bisect and reproduce 100% of the time. It's a 64-bit Fedora 15 guest.
$ qemu-system-x86_64 -drive file=/home/anthony/images/fedora.img,if=none,snapshot=on,id=hd0 -device virtio-balloon-pci,addr=03.0 -device virtio-blk-pci,addr=04.0,drive=hd0 -kernel ~/vmlinuz-2.6.38.6-26.rc1.fc15.x86_64 -initrd ~/initramfs-2.6.38.6-26.rc1.fc15.x86_64.img -append "root=/dev/mapper/VolGroup-lv_root rd_LVM_LV=VolGroup/lv_root rd_LVM_LV=VolGroup/lv_swap ro console=ttyS0 selinux=0" -nographic -nodefconfig -m 1G -no-reboot -no-hpet -device virtio-serial -chardev socket,path=/tmp/foo.sock,id=channel0,server,nowait -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 -nodefaults -serial stdio -enable-kvm
My guess is that it has something to do with the changes to the PCI allocator. I've confirmed that reverting this commit fixes the problem.
Confirmed: reverting this patch fixes it for me.
Note that we suspected this patch before, way back in September: https://lists.gnu.org/archive/html/qemu-devel/2011-09/msg03830.html
Rich.
On Sat, Dec 17, 2011 at 10:24:07AM -0600, Anthony Liguori wrote:
On 12/17/2011 09:25 AM, Richard W.M. Jones wrote:
On Sat, Dec 17, 2011 at 09:22:45AM -0600, Anthony Liguori wrote:
I've narrowed it down even further, to the presence or absence of '-vga cirrus'. If you add '-vga cirrus' to the above command line, the guest will boot successfully.
Confirmed: Adding -vga cirrus to the command line cures it too.
That's a strange one :-)
vga sticks out a bit because it's one of the few places where we treat device memory as ram as a performance optimization.
The only time vga has been touched in between v0.15 and v1.0 was during the introduction of the memory API.
It's this commit:
commit d67c3f2cd92aed2247bfa8a9da61a902b7b2ff09 Author: Gerd Hoffmann kraxel@redhat.com Date: Wed Aug 10 17:34:13 2011 +0200
seabios: update to master
This looks like the same issue reported at:
http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00029.html
The SeaBIOS fix for this was in rel-1.6.3.1 - but that didn't make QEMU 1.0. Does the problem go away if you upgrade to the newer SeaBIOS version?
-Kevin
On 12/17/2011 10:49 AM, Kevin O'Connor wrote:
On Sat, Dec 17, 2011 at 10:24:07AM -0600, Anthony Liguori wrote:
On 12/17/2011 09:25 AM, Richard W.M. Jones wrote:
On Sat, Dec 17, 2011 at 09:22:45AM -0600, Anthony Liguori wrote:
I've narrowed it down even further, to the presence or absence of '-vga cirrus'. If you add '-vga cirrus' to the above command line, the guest will boot successfully.
Confirmed: Adding -vga cirrus to the command line cures it too.
That's a strange one :-)
vga sticks out a bit because it's one of the few places where we treat device memory as ram as a performance optimization.
The only time vga has been touched in between v0.15 and v1.0 was during the introduction of the memory API.
It's this commit:
commit d67c3f2cd92aed2247bfa8a9da61a902b7b2ff09 Author: Gerd Hoffmann <kraxel@redhat.com> Date: Wed Aug 10 17:34:13 2011 +0200
seabios: update to master
This looks like the same issue reported at:
http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00029.html
The SeaBIOS fix for this was in rel-1.6.3.1 - but that didn't make QEMU 1.0. Does the problem go away if you upgrade to the newer SeaBIOS version?
Er, I can't actually build SeaBIOS anymore...
The version of LD on this system does not properly handle alignments. As a result, this project can not be built.
The problem may be the result of this LD bug report: http://sourceware.org/bugzilla/show_bug.cgi?id=12726
Please update to a working version of binutils and retry. Makefile:75: *** "Please upgrade GCC and/or binutils". Stop.
Let me find a box with a newer binutils...
Regards,
Anthony Liguori
-Kevin
On Sat, Dec 17, 2011 at 11:49:56AM -0500, Kevin O'Connor wrote:
On Sat, Dec 17, 2011 at 10:24:07AM -0600, Anthony Liguori wrote:
On 12/17/2011 09:25 AM, Richard W.M. Jones wrote:
On Sat, Dec 17, 2011 at 09:22:45AM -0600, Anthony Liguori wrote:
I've narrowed it down even further, to the presence or absence of '-vga cirrus'. If you add '-vga cirrus' to the above command line, the guest will boot successfully.
Confirmed: Adding -vga cirrus to the command line cures it too.
That's a strange one :-)
vga sticks out a bit because it's one of the few places where we treat device memory as ram as a performance optimization.
The only time vga has been touched in between v0.15 and v1.0 was during the introduction of the memory API.
It's this commit:
commit d67c3f2cd92aed2247bfa8a9da61a902b7b2ff09 Author: Gerd Hoffmann kraxel@redhat.com Date: Wed Aug 10 17:34:13 2011 +0200
seabios: update to master
This looks like the same issue reported at:
http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00029.html
The SeaBIOS fix for this was in rel-1.6.3.1 - but that didn't make QEMU 1.0. Does the problem go away if you upgrade to the newer SeaBIOS version?
Yes, SeaBIOS upstream + qemu 1.0 does fix the problem.
Rich.
On Sat, Dec 17, 2011 at 11:49:56AM -0500, Kevin O'Connor wrote:
On Sat, Dec 17, 2011 at 10:24:07AM -0600, Anthony Liguori wrote:
On 12/17/2011 09:25 AM, Richard W.M. Jones wrote:
On Sat, Dec 17, 2011 at 09:22:45AM -0600, Anthony Liguori wrote:
I've narrowed it down even further, to the presence or absence of '-vga cirrus'. If you add '-vga cirrus' to the above command line, the guest will boot successfully.
Confirmed: Adding -vga cirrus to the command line cures it too.
That's a strange one :-)
vga sticks out a bit because it's one of the few places where we treat device memory as ram as a performance optimization.
The only time vga has been touched in between v0.15 and v1.0 was during the introduction of the memory API.
It's this commit:
commit d67c3f2cd92aed2247bfa8a9da61a902b7b2ff09 Author: Gerd Hoffmann kraxel@redhat.com Date: Wed Aug 10 17:34:13 2011 +0200
seabios: update to master
This looks like the same issue reported at:
http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00029.html
The SeaBIOS fix for this was in rel-1.6.3.1 - but that didn't make QEMU 1.0. Does the problem go away if you upgrade to the newer SeaBIOS version?
Sigh, we really need to be better about updating SeaBIOS in QEMU before release. We had plenty of time to pull in a newer SeaBIOS before 1.0 that would have fixed this :-( We've had multiple releases now where functionality is broken due to QEMU shipping with an older SeaBIOS release than is available upstream.
Regards, Daniel
On Mon, Dec 19, 2011 at 11:31 AM, Daniel P. Berrange berrange@redhat.com wrote:
Sigh, we really need to be better about updating SeaBIOS in QEMU before release. We had plenty of time to pull in a newer SeaBIOS before 1.0 that would have fixed this :-( We've had multiple releases now where functionality is broken due to QEMU shipping with an older SeaBIOS release than is available upstream.
Regards, Daniel
Yes.
SeaBIOS announcements should be made on the QEMU, KVM and Bochs mailing lists to raise awareness.
Before the merge window of QEMU (and of other software that bundles SeaBIOS) closes, SeaBIOS developers or contributors should verify that the latest version of SeaBIOS is shipped and, if it is not, request that it be integrated or commit it themselves.
On Mon, Dec 19, 2011 at 11:31 AM, Daniel P. Berrange berrange@redhat.com wrote:
Sigh, we really need to be better about updating SeaBIOS in QEMU before release. We had plenty of time to pull in a newer SeaBIOS before 1.0 that would have fixed this :-( We've had multiple releases now where functionality is broken due to QEMU shipping with an older SeaBIOS release than is available upstream.
Regards, Daniel
New releases of SeaBIOS need to be more prominently announced and displayed on the front page of the SeaBIOS website.
On 12/19/2011 04:31 AM, Daniel P. Berrange wrote:
On Sat, Dec 17, 2011 at 11:49:56AM -0500, Kevin O'Connor wrote:
On Sat, Dec 17, 2011 at 10:24:07AM -0600, Anthony Liguori wrote:
On 12/17/2011 09:25 AM, Richard W.M. Jones wrote:
On Sat, Dec 17, 2011 at 09:22:45AM -0600, Anthony Liguori wrote:
I've narrowed it down even further, to the presence or absence of '-vga cirrus'. If you add '-vga cirrus' to the above command line, the guest will boot successfully.
Confirmed: Adding -vga cirrus to the command line cures it too.
That's a strange one :-)
vga sticks out a bit because it's one of the few places where we treat device memory as ram as a performance optimization.
The only time vga has been touched in between v0.15 and v1.0 was during the introduction of the memory API.
It's this commit:
commit d67c3f2cd92aed2247bfa8a9da61a902b7b2ff09 Author: Gerd Hoffmann <kraxel@redhat.com> Date: Wed Aug 10 17:34:13 2011 +0200
seabios: update to master
This looks like the same issue reported at:
http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00029.html
The SeaBIOS fix for this was in rel-1.6.3.1 - but that didn't make QEMU 1.0. Does the problem go away if you upgrade to the newer SeaBIOS version?
Sigh, we really need to be better about updating SeaBIOS in QEMU before release. We had plenty of time to pull in a newer SeaBIOS before 1.0 that would have fixed this :-(
1.6.3.1 was released on Nov 24th, which was actually after the soft feature freeze. We could have pulled 1.6.3 which was Oct 4th but updating the BIOS always results in some interesting things happening so it's not something I like to do unless we have to.
I'd rather have known that this functionality was broken before that commit even went in to begin with than allow it to remain broken until we happened to update past the bug.
We've had multiple releases now where functionality is broken due to QEMU shipping with an older SeaBIOS release than is available upstream.
I think the real issue here is testing. -nodefconfig -nodefaults is used by both libguestfs and libvirt, but I'd wager that almost no one tests it in QEMU.
I thought about it quite a bit and what I came to was that we need to do a better job of making it easy to test these things, hence qemu-test that I just announced.
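As a rough illustration of the kind of coverage meant here (a hypothetical sketch, not how qemu-test works), a test matrix could build the same command line with and without a VGA device. The helper names `build_qemu_argv` and `build_test_matrix` and their parameters are made up for illustration; the flags themselves are copied from the reproducer earlier in this thread.

```python
from typing import List, Optional


def build_qemu_argv(disk_image: str, kernel: str, initrd: str,
                    vga: Optional[str] = None) -> List[str]:
    """Build a QEMU command line resembling the reproducer in this thread.

    Hypothetical helper: the flags come from the thread's command line,
    the function itself is illustrative only.
    """
    argv = [
        "qemu-system-x86_64",
        "-nodefconfig", "-nodefaults",   # the combination libguestfs and libvirt use
        "-nographic", "-enable-kvm", "-m", "1G",
        "-no-reboot", "-no-hpet",
        "-drive", f"file={disk_image},if=none,snapshot=on,id=hd0",
        "-device", "virtio-balloon-pci,addr=03.0",
        "-device", "virtio-blk-pci,addr=04.0,drive=hd0",
        "-kernel", kernel,
        "-initrd", initrd,
        "-append", "console=ttyS0",
        "-serial", "stdio",
    ]
    if vga is not None:
        argv += ["-vga", vga]            # "-vga cirrus" masked the bug in this thread
    return argv


# One command line per display configuration worth booting.
VGA_VARIANTS = [None, "none", "cirrus"]


def build_test_matrix(disk_image: str, kernel: str, initrd: str) -> List[List[str]]:
    """Return one argv per VGA variant so each can be boot-tested separately."""
    return [build_qemu_argv(disk_image, kernel, initrd, vga) for vga in VGA_VARIANTS]
```

Each argv could then be handed to something like `subprocess.run` with a timeout and the serial console output checked for a successful boot; that part is deliberately left out of the sketch.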
Regards,
Anthony Liguori
Regards, Daniel
On Mon, Dec 19, 2011 at 11:34:13AM -0600, Anthony Liguori wrote:
On 12/19/2011 04:31 AM, Daniel P. Berrange wrote:
On Sat, Dec 17, 2011 at 11:49:56AM -0500, Kevin O'Connor wrote:
On Sat, Dec 17, 2011 at 10:24:07AM -0600, Anthony Liguori wrote:
On 12/17/2011 09:25 AM, Richard W.M. Jones wrote:
On Sat, Dec 17, 2011 at 09:22:45AM -0600, Anthony Liguori wrote:
I've narrowed it down even further, to the presence or absence of '-vga cirrus'. If you add '-vga cirrus' to the above command line, the guest will boot successfully.
Confirmed: Adding -vga cirrus to the command line cures it too.
That's a strange one :-)
vga sticks out a bit because it's one of the few places where we treat device memory as ram as a performance optimization.
The only time vga has been touched in between v0.15 and v1.0 was during the introduction of the memory API.
It's this commit:
commit d67c3f2cd92aed2247bfa8a9da61a902b7b2ff09 Author: Gerd Hoffmann <kraxel@redhat.com> Date: Wed Aug 10 17:34:13 2011 +0200
seabios: update to master
This looks like the same issue reported at:
http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00029.html
The SeaBIOS fix for this was in rel-1.6.3.1 - but that didn't make QEMU 1.0. Does the problem go away if you upgrade to the newer SeaBIOS version?
Sigh, we really need to be better about updating SeaBIOS in QEMU before release. We had plenty of time to pull in a newer SeaBIOS before 1.0 that would have fixed this :-(
1.6.3.1 was released on Nov 24th, which was actually after the soft feature freeze. We could have pulled 1.6.3 which was Oct 4th but updating the BIOS always results in some interesting things happening so it's not something I like to do unless we have to.
I'd rather have known that this functionality was broken before that commit even went in to begin with than allow it to remain broken until we happened to update past the bug.
We've had multiple releases now where functionality is broken due to QEMU shipping with an older SeaBIOS release than is available upstream.
I think the real issue here is testing. -nodefconfig -nodefaults is used by both libguestfs and libvirt, but I'd wager that almost no one tests it in QEMU.
I had actually discovered & pointed out this flaw on qemu-devel back in September, and Kevin had the seabios fix by Oct
http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00029.html
I hadn't raised it again, because I had mistakenly assumed QEMU would automatically pull in the newer SeaBIOS release before 1.0 came out. I could have more aggressively bugged people on qemu-devel to update SeaBIOS, but given your point above about not wanting to rebase SeaBIOS, it's not clear that would have helped sort this out before 1.0.
Regards, Daniel
On 12/19/2011 11:43 AM, Daniel P. Berrange wrote:
On Mon, Dec 19, 2011 at 11:34:13AM -0600, Anthony Liguori wrote:
On 12/19/2011 04:31 AM, Daniel P. Berrange wrote:
Sigh, we really need to be better about updating SeaBIOS in QEMU before release. We had plenty of time to pull in a newer SeaBIOS before 1.0 that would have fixed this :-(
1.6.3.1 was released on Nov 24th, which was actually after the soft feature freeze. We could have pulled 1.6.3 which was Oct 4th but updating the BIOS always results in some interesting things happening so it's not something I like to do unless we have to.
I'd rather have known that this functionality was broken before that commit even went in to begin with than allow it to remain broken until we happened to update past the bug.
We've had multiple releases now where functionality is broken due to QEMU shipping with an older SeaBIOS release than is available upstream.
I think the real issue here is testing. -nodefconfig -nodefaults is used by both libguestfs and libvirt, but I'd wager that almost no one tests it in QEMU.
I had actually discovered & pointed out this flaw on qemu-devel back in September, and Kevin had the seabios fix by Oct
http://lists.nongnu.org/archive/html/qemu-devel/2011-10/msg00029.html
I hadn't raised it again, because I had mistakenly assumed QEMU would automatically pull in the newer SeaBIOS release before 1.0 came out. I could have more aggressively bugged people on qemu-devel to update SeaBIOS, but given your point above about not wanting to rebase SeaBIOS, it's not clear that would have helped sort this out before 1.0.
We really need to update SeaBIOS whenever there is a bug that we know requires an update. Things break down because of one or more of the following reasons:
1) User submits a patch to seabios@, Kevin applies it. But that doesn't necessarily trigger anything happening in QEMU.
Ideally, the above mentioned user would submit a submodule update once (1) happens.
2) Kevin fixes something on his own or someone else changes something in the broader SeaBIOS community. That may not even be visible in QEMU.
Syncing right before release isn't a good strategy either because that means we're pulling in something that hasn't been tested extensively at the very tail end of our release cycle.
I would like to point out that August -> October is a pretty long time period for a regression like this to exist. I think that really indicates that the primary problem is testing, not frequency of SeaBIOS updates.
Regards,
Anthony Liguori
Regards, Daniel
On Mon, Dec 19, 2011 at 12:02:59PM -0600, Anthony Liguori wrote:
I would like to point out that August -> October is a pretty long time period for a regression like this to exist. I think that really indicates that the primary problem is testing, not frequency of SeaBIOS updates.
Fair point.
My understanding is we're going to switch to having qemu.git in Fedora Rawhide, which means that libguestfs will always be testing the 'perfect storm' of qemu + kernel + glibc from git (once glibc get their act together anyhow, just qemu + kernel at first).
We usually do a build and a comprehensive test at least once a week, often a few times a week, so we would have picked this up much sooner.
Rich.
On Mon, Dec 19, 2011 at 07:04:54PM +0000, Richard W.M. Jones wrote:
On Mon, Dec 19, 2011 at 12:02:59PM -0600, Anthony Liguori wrote:
I would like to point out that August -> October is a pretty long time period for a regression like this to exist. I think that really indicates that the primary problem is testing, not frequency of SeaBIOS updates.
Fair point.
My understanding is we're going to switch to having qemu.git in Fedora Rawhide, which means that libguestfs will always be testing the 'perfect storm' of qemu + kernel + glibc from git (once glibc get their act together anyhow, just qemu + kernel at first).
We usually do a build and a comprehensive test at least once a week, often a few times a week, so we would have picked this up much sooner.
That wouldn't actually catch this problem, because when we build QEMU in Fedora, we never use the SeaBIOS that QEMU includes in GIT. Fedora always ships the newest SeaBIOS release available from upstream, regardless of what QEMU includes.
Daniel
On Mon, Dec 19, 2011 at 07:16:02PM +0000, Daniel P. Berrange wrote:
On Mon, Dec 19, 2011 at 07:04:54PM +0000, Richard W.M. Jones wrote:
On Mon, Dec 19, 2011 at 12:02:59PM -0600, Anthony Liguori wrote:
I would like to point out that August -> October is a pretty long time period for a regression like this to exist. I think that really indicates that the primary problem is testing, not frequency of SeaBIOS updates.
Fair point.
My understanding is we're going to switch to having qemu.git in Fedora Rawhide, which means that libguestfs will always be testing the 'perfect storm' of qemu + kernel + glibc from git (once glibc get their act together anyhow, just qemu + kernel at first).
We usually do a build and a comprehensive test at least once a week, often a few times a week, so we would have picked this up much sooner.
That wouldn't actually catch this problem, because when we build QEMU in Fedora, we never use the SeaBIOS that QEMU includes in GIT. Fedora always ships the newest SeaBIOS release available from upstream, regardless of what QEMU includes.
Ah yes indeed, I forgot about this.
Nevertheless, it'll at least improve other aspects of our qemu testing :-)
In reply to Anthony: the reason Fedora does this is because binary blobs aren't permitted, no matter what the origin. We have to build SeaBIOS from source, and the choice is made to build from the upstream SeaBIOS source, not from the source release indirectly pointed to by qemu.
Rich.
On Mon, Dec 19, 2011 at 07:40:05PM +0000, Richard W.M. Jones wrote:
In reply to Anthony: the reason Fedora does this is because binary blobs aren't permitted, no matter what the origin. We have to build SeaBIOS from source, and the choice is made to build from the upstream SeaBIOS source, not from the source release indirectly pointed to by qemu.
... as you know already.
Rich.
On 12/19/2011 01:40 PM, Richard W.M. Jones wrote:
On Mon, Dec 19, 2011 at 07:16:02PM +0000, Daniel P. Berrange wrote:
On Mon, Dec 19, 2011 at 07:04:54PM +0000, Richard W.M. Jones wrote:
On Mon, Dec 19, 2011 at 12:02:59PM -0600, Anthony Liguori wrote:
I would like to point out that August -> October is a pretty long time period for a regression like this to exist. I think that really indicates that the primary problem is testing, not frequency of SeaBIOS updates.
Fair point.
My understanding is we're going to switch to having qemu.git in Fedora Rawhide, which means that libguestfs will always be testing the 'perfect storm' of qemu + kernel + glibc from git (once glibc get their act together anyhow, just qemu + kernel at first).
We usually do a build and a comprehensive test at least once a week, often a few times a week, so we would have picked this up much sooner.
That wouldn't actually catch this problem, because when we build QEMU in Fedora, we never use the SeaBIOS that QEMU includes in GIT. Fedora always ships the newest SeaBIOS release available from upstream, regardless of what QEMU includes.
Ah yes indeed, I forgot about this.
Nevertheless, it'll at least improve other aspects of our qemu testing :-)
In reply to Anthony: the reason Fedora does this is because binary blobs aren't permitted, no matter what the origin. We have to build SeaBIOS from source, and the choice is made to build from the upstream SeaBIOS source, not from the source release indirectly pointed to by qemu.
FWIW, we ship SeaBIOS source directly in our release tarballs. There's nothing indirect about it.
Fedora could have a seabios-qemu RPM that was built from the qemu SRPM. Since it ultimately is going to live in /usr/share/qemu, that seems like a nicer thing to do AFAICT.
You could have an alternatives mechanism if people really wanted to use upstream SeaBIOS...
Regards,
Anthony Liguori
Rich.
On Mon, Dec 19, 2011 at 12:02:59PM -0600, Anthony Liguori wrote:
On 12/19/2011 11:43 AM, Daniel P. Berrange wrote:
I hadn't raised it again, because I had mistakenly assumed QEMU would automatically pull in the newer SeaBIOS release before 1.0 came out. I could have more aggressively bugged people on qemu-devel to update SeaBIOS, but given your point above about not wanting to rebase SeaBIOS, it's not clear that would have helped sort this out before 1.0.
We really need to update SeaBIOS whenever there is a bug that we know requires an update. Things break down because of one or more of the following reasons:
1) User submits a patch to seabios@, Kevin applies it. But that doesn't necessarily trigger anything happening in QEMU.
Ideally, the above mentioned user would submit a submodule update once (1) happens.
2) Kevin fixes something on his own or someone else changes something in the broader SeaBIOS community. That may not even be visible in QEMU.
Syncing right before release isn't a good strategy either because that means we're pulling in something that hasn't been tested extensively at the very tail end of our release cycle.
I would like to point out that August -> October is a pretty long time period for a regression like this to exist. I think that really indicates that the primary problem is testing, not frequency of SeaBIOS updates.
One complication is that a lot of us are not necessarily testing the SeaBIOS that is in QEMU GIT. Fedora rawhide includes qemu-kvm.git snapshots which are updated fairly frequently, but we don't use the SeaBIOS QEMU includes. Instead Fedora includes the latest SeaBIOS upstream release.
So Fedora 16/rawhide users would never have seen this particular bug for longer than a couple of weeks until the fixed SeaBIOS arrived.
Regards, Daniel
On 12/19/2011 01:19 PM, Daniel P. Berrange wrote:
On Mon, Dec 19, 2011 at 12:02:59PM -0600, Anthony Liguori wrote:
I would like to point out that August -> October is a pretty long time period for a regression like this to exist. I think that really indicates that the primary problem is testing, not frequency of SeaBIOS updates.
One complication is that a lot of us are not necessarily testing the SeaBIOS that is in QEMU GIT. Fedora rawhide includes qemu-kvm.git snapshots which are updated fairly frequently, but we don't use the SeaBIOS QEMU includes. Instead Fedora includes the latest SeaBIOS upstream release.
I understand why Debian and Fedora do this but it is unfortunate from a QA perspective.
Fedora is on a different release schedule than QEMU, so it's easy for it to do an "upstream freeze", and grab the latest SeaBIOS release and QEMU release.
But we don't treat SeaBIOS as a separate package supporting arbitrary combinations of SeaBIOS and QEMU.
Regards,
Anthony Liguori
So Fedora 16/rawhide users would never have seen this particular bug for longer than a couple of weeks until the fixed SeaBIOS arrived.
Regards, Daniel
On Mon, Dec 19, 2011 at 12:02:59PM -0600, Anthony Liguori wrote:
We really need to update SeaBIOS whenever there is a bug that we know requires an update. Things break down because of one or more of the following reasons:
1) User submits a patch to seabios@, Kevin applies it. But that doesn't necessarily trigger anything happening in QEMU.
Ideally, the above mentioned user would submit a submodule update once (1) happens.
2) Kevin fixes something on his own or someone else changes something in the broader SeaBIOS community. That may not even be visible in QEMU.
There is another complexity here - it's not always clear to me when a group pulls a particular revision of SeaBIOS. So, knowing who to notify is harder.
Syncing right before release isn't a good strategy either because that means we're pulling in something that hasn't been tested extensively at the very tail end of our release cycle.
Agreed. There has to be a balance here.
There are some USB drive booting fixes along with some ACPI and MPTable changes in SeaBIOS post v1.6.3.1. These changes are a bit large though, so I'm not sure QEMU would be best served by pulling them in if a release is pending.
That said, I'm glad to see users testing recent SeaBIOS revs as it helps greatly with shaking out issues. For example, had QEMU not pulled a revision of SeaBIOS in August, there's a good chance this particular bug would not have been found before the v1.6.3 release and we might still have ended up in the same situation.
I would like to point out that August -> October is a pretty long time period for a regression like this to exist. I think that really indicates that the primary problem is testing, not frequency of SeaBIOS updates.
If we can catch these types of things in test cases, that would be great. This particular bug had a complex set of triggers - it was in SeaBIOS code specific to QEMU (so non-QEMU/KVM users wouldn't find it), using QEMU's default Cirrus VGA driver masks the bug (it happens to have PCI prefmem), and it was an off-by-one in low-level alignment code (a code review wouldn't catch it).
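To make the "off-by-one in low-level alignment code" class of bug concrete, here is a minimal sketch. It is not the SeaBIOS code and does not reproduce the actual defect; `align_up`, `align_up_off_by_one` and `allocate_bars` are hypothetical names. The only assumptions are that BAR sizes are powers of two and that each BAR must be aligned to its own size, in which case placing the largest BARs first packs them without padding.

```python
from typing import List, Tuple


def align_up(addr: int, align: int) -> int:
    """Round addr up to the next multiple of align (align is a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def align_up_off_by_one(addr: int, align: int) -> int:
    """Illustrative buggy variant: the missing '- 1' skips a whole slot
    whenever addr is already aligned, and is correct otherwise."""
    return (addr + align) & ~(align - 1)


def allocate_bars(base: int, sizes: List[int]) -> List[Tuple[int, int]]:
    """Place BARs starting at base, largest first, each aligned to its size.

    Returns a list of (start_address, size) in allocation order.
    """
    placed = []
    addr = base
    for size in sorted(sizes, reverse=True):
        start = align_up(addr, size)
        placed.append((start, size))
        addr = start + size
    return placed


if __name__ == "__main__":
    for start, size in allocate_bars(0xE0000000, [0x1000, 0x20, 0x1000000]):
        print(f"BAR at {start:#010x}, size {size:#x}")
```

The buggy variant agrees with the correct one for every unaligned input and differs only when the address is already aligned, which is one reason this kind of mistake tends to survive review and shows up only with particular device combinations.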
-Kevin
On Mon, Dec 19, 2011 at 10:38:02PM -0500, Kevin O'Connor wrote:
On Mon, Dec 19, 2011 at 12:02:59PM -0600, Anthony Liguori wrote:
We really need to update SeaBIOS whenever there is a bug that we know requires an update. Things break down because of one or more of the following reasons:
1) User submits a patch to seabios@, Kevin applies it. But that doesn't necessarily trigger anything happening in QEMU.
Ideally, the above mentioned user would submit a submodule update once (1) happens.
2) Kevin fixes something on his own or someone else changes something in the broader SeaBIOS community. That may not even be visible in QEMU.
There is another complexity here - it's not always clear to me when a group pulls a particular revision of SeaBIOS. So, knowing who to notify is harder.
Why not just pull recent SeaBIOS into QEMU weekly (or as fast as SeaBIOS HEAD moves)? Before going into the QEMU rc stage, ask SeaBIOS for a formal release.
-- Gleb.