On Tue, Aug 02, 2016 at 04:18:30AM +0000, Xulei (Stone) wrote:
On Fri, Jul 29, 2016 at 04:04:59AM +0000, Xulei (Stone) wrote:
After one day, the vm is stuck. Judging from the following seabios log, seabios seems to stop at "PCI: Using 00:02.0 for primary VGA" and never gets to execute handle_smp(). What might be the reason?
More debugging info would be necessary to find this problem. You could try reproducing and attaching gdb ( http://www.seabios.org/Debugging#Debugging_with_gdb_on_QEMU ). Alternatively, a kvm trace log may help.
The kvm trace (which seems useful) indicates that cpu 0 keeps accessing
ioport 0x00b3.
0x00b3 is PORT_SMI_STATUS, so I guess my bios is stuck in the function
smm_relocate_and_restore
{
    ...
    /* wait until SMM code executed */
    while (inb(PORT_SMI_STATUS) != 0x00)
    ...
}
I'd try adding dprintf() statements around all the code at the top of smm_relocate_and_restore() and enable the dprintf() at the top of handle_smi().
It would also be useful if you can extract the log from the last two working reboots to compare it to the failed case.
Following your suggestion, I'm now sure it is caused by a missing SMI. I added dprintf() calls like this:
--- a/roms/seabios/src/fw/smm.c
+++ b/roms/seabios/src/fw/smm.c
@@ -65,7 +65,8 @@ handle_smi(u16 cs)
     u8 cmd = inb(PORT_SMI_CMD);
     struct smm_layout *smm = MAKE_FLATPTR(cs, 0);
     u32 rev = smm->cpu.i32.smm_rev & SMM_REV_MASK;
-    dprintf(DEBUG_HDL_smi, "handle_smi cmd=%x smbase=%p\n", cmd, smm);
+    if(cmd == 0x00) {
+        dprintf(1, "handle_smi cmd=%x smbase=%p\n", cmd, smm);
+    }
 
     if (smm == (void*)BUILD_SMM_INIT_ADDR) {
         // relocate SMBASE to 0xa0000
@@ -147,14 +148,14 @@ smm_relocate_and_restore(void)
 {
     /* init APM status port */
     outb(0x01, PORT_SMI_STATUS);
+    dprintf(1,"before SMI====\n");
 
     /* raise an SMI interrupt */
     outb(0x00, PORT_SMI_CMD);
+    dprintf(1,"after SMI=====\n");
 
     /* wait until SMM code executed */
     while (inb(PORT_SMI_STATUS) != 0x00)
         ;
+    dprintf(1,"smm code executes complete====\n");
And the log output in the failed case looks like this:
2016-08-03 16:23:15PCI: Using 00:02.0 for primary VGA
2016-08-03 16:23:15smm_device_setup start
2016-08-03 16:23:15init smm
2016-08-03 16:23:15before SMI====
2016-08-03 16:23:15after SMI=====
So it is obvious that after the SMI is raised with outb(0x00, PORT_SMI_CMD), the bios never runs handle_smi(), so PORT_SMI_STATUS stays at the 0x01 written by outb(0x01, PORT_SMI_STATUS). What's more, once this problem happens, rebooting the vm does not recover it. My vm is always stuck at the same place until I destroy it.
I have also already tried kernel commit c43203cab1e, which still does not solve this problem. Any ideas, Kevin and Paolo?
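
For readers following the thread, the handshake being debugged can be modelled in plain C outside of seabios. This is only a rough sketch, not seabios code: struct fake_chipset, fake_outb(), fake_inb(), fake_handle_smi() and relocate_with_bounded_wait() are hypothetical names invented for illustration, the 0x00b2 value for PORT_SMI_CMD is not stated in this thread, and the handler clearing the status port is an assumption based on the "wait until SMM code executed" comment. The poll loop is bounded here so that a lost SMI is reported instead of spinning forever.

```c
#include <assert.h>
#include <stdint.h>

/* 0x00b3 is PORT_SMI_STATUS as stated in the thread; 0x00b2 for
 * PORT_SMI_CMD is an assumption of this sketch. */
#define PORT_SMI_CMD    0x00b2
#define PORT_SMI_STATUS 0x00b3

/* Hypothetical stand-in for the chipset/hypervisor side. */
struct fake_chipset {
    uint8_t status;      /* value behind PORT_SMI_STATUS */
    int     deliver_smi; /* 0 models the failure: the SMI is lost */
    int     smi_count;   /* how many times the fake handler ran */
};

/* Assumed handler behaviour: acknowledge by clearing the status port. */
static void fake_handle_smi(struct fake_chipset *c, uint8_t cmd)
{
    c->smi_count++;
    if (cmd == 0x00)
        c->status = 0x00;
}

/* Writing the command port raises an SMI only if the model delivers it. */
static void fake_outb(struct fake_chipset *c, uint8_t val, uint16_t port)
{
    if (port == PORT_SMI_STATUS)
        c->status = val;
    else if (port == PORT_SMI_CMD && c->deliver_smi)
        fake_handle_smi(c, val);
}

static uint8_t fake_inb(struct fake_chipset *c, uint16_t port)
{
    return port == PORT_SMI_STATUS ? c->status : 0xff;
}

/* Same three steps as the quoted loop, but with a bounded wait.
 * Returns the number of extra polls needed, or -1 if the status
 * port never went back to 0x00 within max_polls reads. */
static int relocate_with_bounded_wait(struct fake_chipset *c, int max_polls)
{
    assert(max_polls > 0);
    fake_outb(c, 0x01, PORT_SMI_STATUS); /* init status port */
    fake_outb(c, 0x00, PORT_SMI_CMD);    /* raise the SMI */
    for (int i = 0; i < max_polls; i++)
        if (fake_inb(c, PORT_SMI_STATUS) == 0x00)
            return i;
    return -1;                           /* SMI was never handled */
}
```

With deliver_smi set to 1 the function returns on the first poll; with deliver_smi set to 0 the status port stays at 0x01 and the function returns -1, which corresponds to the hang seen after "after SMI=====" in the log above.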
-Kevin