On Thu, Jan 14, 2016 at 10:36:07AM +0000, Li, Liang Z wrote:
A correction: the actual QEMU command line in our test case is: 'qemu-system-x86_64 -enable-kvm -smp 20 -m 2048 -no-acpi -monitor stdio -drive file=/mnt/centos6u6.qcow,if=none,id=foo -device virtio-blk-pci,drive=foo'
If there is no virtio-blk device: 'qemu-system-x86_64 -enable-kvm -smp 20 -m 2048 -no-acpi -monitor stdio -drive file=/mnt/centos6u6.qcow',
the guest boots successfully.
It seems something is wrong that makes virtio-blk work incorrectly.
Liang
Subject: about the 'mptable' issue
Hi Kevin,
We just found that when QEMU is started with the '-smp 20 -no-acpi' options, a CentOS 6.6 or RHEL 7.2 guest fails to boot. By debugging, I found that your SeaBIOS patch, commit id '9ee2e26255661a', caused the failure.
I don't know what issue your patch was trying to fix. Assuming it is the right fix, should we add some kind of prompt to prevent users from using such an option?
Crashing the guest is not a good choice.
Liang
The problem is that the mptable can grow too large to be stored in the 0xf0000-0x100000 region of memory. SeaBIOS allocates 2048 bytes for bios tables and other critical internal storage in that range - it will not allow the mptable to consume more than 600 bytes of that storage (patch 9ee2e262). If the code allows the mptable to grow unbounded then other critical allocations can fail with undesirable results.
It is possible to allocate the mptable in memory above the first 1 MiB, where its size could be unbounded, but that causes Linux versions earlier than v2.6.30 to crash.
It is unlikely that virtio-blk has any impact on this issue - the additional device probably just caused the mptable to slightly exceed 600 bytes.
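To make the size argument concrete, here is a rough sketch of how the table grows. The per-entry sizes are the ones defined in the Intel MultiProcessor Specification v1.4; the 600 byte cap is the one described above. The function names and the bus, I/O APIC and interrupt entry counts are hypothetical and chosen only for illustration. This is not SeaBIOS's exact accounting and the numbers were not measured on the failing guest.

```python
# Entry sizes in bytes, per the Intel MultiProcessor Specification v1.4.
MP_CONFIG_HEADER = 44
MP_PROCESSOR_ENTRY = 20
MP_BUS_ENTRY = 8
MP_IOAPIC_ENTRY = 8
MP_IO_INTERRUPT_ENTRY = 8
MP_LOCAL_INTERRUPT_ENTRY = 8

# Cap on the mptable size described in this thread (patch 9ee2e262).
MPTABLE_LIMIT = 600


def estimate_mptable_size(cpus, buses, ioapics, io_intrs, local_intrs):
    """Rough size of an MP configuration table for the given entry counts."""
    return (MP_CONFIG_HEADER
            + cpus * MP_PROCESSOR_ENTRY
            + buses * MP_BUS_ENTRY
            + ioapics * MP_IOAPIC_ENTRY
            + io_intrs * MP_IO_INTERRUPT_ENTRY
            + local_intrs * MP_LOCAL_INTERRUPT_ENTRY)


def fits(size):
    """True if a table of this size stays within the cap."""
    return size <= MPTABLE_LIMIT


# Hypothetical entry counts (assumptions, not taken from the real guest).
without_dev = estimate_mptable_size(cpus=20, buses=2, ioapics=1,
                                    io_intrs=14, local_intrs=2)
# One extra device that needs one more I/O interrupt entry.
with_dev = estimate_mptable_size(cpus=20, buses=2, ioapics=1,
                                 io_intrs=15, local_intrs=2)

print(without_dev, fits(without_dev))
print(with_dev, fits(with_dev))
```

With 20 cpus the processor entries alone take 400 bytes, so under these assumed counts a single additional 8 byte interrupt entry is enough to move the table from just under the cap to just over it.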
It is unlikely that the lack of the mptable is directly causing a Linux crash - more likely is that with neither an mptable nor an acpi table there is insufficient information for Linux to enumerate the machine's hardware.
My initial reaction is to avoid -no-acpi on machines with 20+ cpus. Is there a reason to turn acpi off?
-Kevin
Thanks for your explanation. There is no strong reason for this. Our QA team reported this bug; you know, QA will test all the cases.
Do you think it's reasonable to prevent users from using '-no-acpi' with 20+ CPUs?
Liang