Hi,
on a 4-CPU K8 system I get the following output before the system hangs.
Is the machine drunk or are CPU1-3 just running away?
LinuxBIOS-1.1.7.0-fallback Tue Nov 2 20:34:14 CET 2004 starting...
setting up resource map....done.
Enabling routing table for node 00 done.
Enabling SMP settings
setup_remote_node: done
Renaming current temporary node to 01 done.
anabling routing table for node 01 dConlee.
tRiennamgin g mcutrrrenrt emporary node to 02 done.
Enabling routing table for node 02 dConle.e
mRernaiminngg c urmretntr tre porary node to 03 done.
Enabling routing table for node 03 dColnee.a
iz04n ngod esm itnirtiral ed.
coherent_ht_finalize done
PCI_DEV=(0,0x18,0) 0xb4=00030000
bus=00 id =74501022
bus=00 id =74601022
PCI_DEV=(0,0x19,0) 0xd4=000a0400
Stefan
Stefan Reinauer stepan@openbios.org writes:
Hi,
on a 4-CPU K8 system I get the following output before the system hangs.
Hmm. It looks like you have non-bootstrap cpus running through the code.
Usually it is this hunk of code that traps errors:

    if (!boot_cpu()) {
        stop_this_cpu();
    }
Although I can see something very weird happening if you have a BIST failure on your secondary CPUs. But that is quite unlikely.
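As a rough illustration of what a check like boot_cpu() can look at: on x86 the APIC base MSR (0x1B) carries a BSP flag in bit 8, which is set only on the bootstrap processor. The sketch below assumes that this is the test being made. is_bsp_from_apic_base() is a hypothetical helper, not the LinuxBIOS implementation, and it takes the low 32 bits of the MSR as a plain argument so that it can run in user space.

```c
#include <stdint.h>

/* Bit 8 of the APIC base MSR (0x1B) is the BSP flag on x86. */
#define APIC_BASE_BSP (1u << 8)

/*
 * Hypothetical helper: returns 1 if the given low half of the APIC base
 * MSR marks the bootstrap processor, 0 for an application processor.
 * Real firmware would read the value with rdmsr; here it is a parameter.
 */
int is_bsp_from_apic_base(uint32_t msr_lo)
{
    return (msr_lo & APIC_BASE_BSP) != 0;
}
```

With a value such as 0xFEE00900 (APIC enabled, BSP flag set) the helper returns 1; with 0xFEE00800 (APIC enabled, BSP flag clear) it returns 0, which is the case where a guard like the one quoted above would be expected to call stop_this_cpu().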
Is the machine drunk or are CPU1-3 just running away?
It looks like the CPUs are just running away :( Usually I would expect the interlaced characters to form a message I recognize. But I don't see anything I recognize here.
Eric
LinuxBIOS-1.1.7.0-fallback Tue Nov 2 20:34:14 CET 2004 starting...
setting up resource map....done.
Enabling routing table for node 00 done.
Enabling SMP settings
setup_remote_node: done
Renaming current temporary node to 01 done.
anabling routing table for node 01 dConlee.
tRiennamgin g mcutrrrenrt emporary node to 02 done.
Enabling routing table for node 02 dConle.e
mRernaiminngg c urmretntr tre porary node to 03 done.
Enabling routing table for node 03 dColnee.a
iz04n ngod esm itnirtiral ed.
coherent_ht_finalize done
PCI_DEV=(0,0x18,0) 0xb4=00030000
bus=00 id =74501022
bus=00 id =74601022
PCI_DEV=(0,0x19,0) 0xd4=000a0400
* Eric W. Biederman ebiederman@lnxi.com [041102 22:05]:
Hmm. It looks like you have non-bootstrap cpus running through the code.
Usually it is this hunk of code that traps errors:

    if (!boot_cpu()) {
        stop_this_cpu();
    }
Although I can see something very weird happening if you have a BIST failure on your secondary CPUs. But that is quite unlikely.
But that is something that would be reported, somehow, before setting the resource map, no?
Is the machine drunk or are CPU1-3 just running away?
It looks like the CPUs are just running away :( Usually I would expect the interlaced characters to form a message I recognize. But I don't see anything I recognize here.
Looking closer, it almost looks like Clearing mtrr
Renaming current temporary node to 01 done. anabling routing table for node 01 dConlee.
C l e
tRiennamgin g mcutrrrenrt
t i n g g m tr r
as in print_spew("Clearing mtrr\r\n"); of do_early_mtrr_init.c
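The decoding above can be checked mechanically. What follows is a minimal sketch, assuming the garbled fragment is simply two messages interleaved on the serial line with no characters lost: strip the message you recognize out of the fragment as a subsequence and look at what remains. leftover_after() is a hypothetical helper written for this illustration, and the greedy left-to-right matching is a simplification (a different valid split could exist).

```c
#include <stddef.h>

/*
 * Remove `known` from `garbled` as a subsequence (greedy, left to right)
 * and copy every character that was not consumed into `out`.
 * Returns 1 if all of `known` was found, 0 otherwise.
 * `out` must have room for strlen(garbled) + 1 bytes.
 */
int leftover_after(const char *garbled, const char *known, char *out)
{
    size_t k = 0; /* next character of `known` still to be matched */
    size_t o = 0; /* next free slot in `out` */

    for (size_t i = 0; garbled[i] != '\0'; i++) {
        if (known[k] != '\0' && garbled[i] == known[k])
            k++;                    /* belongs to the known message */
        else
            out[o++] = garbled[i];  /* belongs to the other writer */
    }
    out[o] = '\0';
    return known[k] == '\0';
}
```

Applied to the fragments from the log, removing "done." from "dConlee." leaves "Cle", and removing "Renaming current" from "tRiennamgin g mcutrrrenrt" leaves "ting mtrr", which is consistent with pieces of "Clearing mtrr" being written by another CPU at the same time.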
Ha, it looks like no one has had a chance to compile with spew messages in quite a while ;-) The MTRR setup is called long before stop_cpu, so the messages seem to be unrelated to the hang.
emporary node to 02 done.
Enabling routing table for node 02 dConle.e
mRernaiminngg c urmretntr tre porary node to 03 done.
Enabling routing table for node 03 dColnee.a
iz04n ngod esm itnirtiral ed.
coherent_ht_finalize done
PCI_DEV=(0,0x18,0) 0xb4=00030000
bus=00 id =74501022
bus=00 id =74601022
PCI_DEV=(0,0x19,0) 0xd4=000a0400
Stefan Reinauer stepan@openbios.org writes:
* Eric W. Biederman ebiederman@lnxi.com [041102 22:05]:
Hmm. It looks like you have non-bootstrap cpus running through the code.
Usually it is this hunk of code that traps errors:

    if (!boot_cpu()) {
        stop_this_cpu();
    }
Although I can see something very weird happening if you have a BIST failure on your secondary CPUs. But that is quite unlikely.
But that is something that would be reported, somehow, before setting the resource map, no?
Not on the second cpu...
Is the machine drunk or are CPU1-3 just running away?
It looks like the CPUs are just running away :( Usually I would expect the interlaced characters to form a message I recognize. But I don't see anything I recognize here.
Looking closer, it almost looks like Clearing mtrr
Renaming current temporary node to 01 done. anabling routing table for node 01 dConlee.
C l e
tRiennamgin g mcutrrrenrt
t i n g g m tr r
as in print_spew("Clearing mtrr\r\n"); of do_early_mtrr_init.c
Ha, it looks like no one has had a chance to compile with spew messages in quite a while ;-) The MTRR setup is called long before stop_cpu, so the messages seem to be unrelated to the hang.
Cool. I wonder if we want to kill those messages. In the normal course of affairs I can't see how they would help...
Eric