I've got linuxbios somewhat working with the SuperMicro X5DPE-G2, but I have discovered one problem.
If I run lm_sensors and initialize the chips, the node will not reboot. It gives me a no memory error at boot.
Has anyone seen this? How do I start debugging?
Denis
On 18 Apr 2003, Denis Pilon wrote:
If I run lm_sensors and initialize the chips, the node will not reboot. It gives me a no memory error at boot.
consider getting rid of lm_sensors first.
Has anyone seen this? How do I start debugging?
lm_sensors is messing up the SPD bus for linuxbios. I think we're going to need to fix linuxbios so that it will always undo whatever lm_sensors is doing.
See if you can see what lm_sensors is doing to the southbridge to change the I2C bus around.
ron
Hello ron,
Friday, April 18, 2003, 11:42:29 PM, you wrote:
rm> On 18 Apr 2003, Denis Pilon wrote:
If I run lm_sensors and initialize the chips, the node will not reboot. It gives me a no memory error at boot.
rm> consider getting rid of lm_sensors first.
Personally, I'd prefer LinuxBIOS to be compatible with lm_sensors. It's a great pity to miss out on such a nice hardware feature as monitoring. What's the problem with supporting it?
With best regards, Alexander mailto:spirit@reactor.ru
On Sat, 19 Apr 2003, Alexander Amelkin wrote:
Personally, I'd prefer LinuxBIOS to be compatible with lm_sensors. It's a great pity to miss out on such a nice hardware feature as monitoring. What's the problem with supporting it?
monitoring is good. Especially if it is good monitoring. Lm_sensors is not good enough for us. I think the supermon framework for hardware monitors is lots better.
linuxbios is compatible with lm_sensors, but on this one platform, lm_sensors is breaking linuxbios somehow. It will get fixed.
ron
Hello ron,
Saturday, April 19, 2003, 1:04:20 AM, you wrote:
rm> On Sat, 19 Apr 2003, Alexander Amelkin wrote:
Personally, I'd prefer LinuxBIOS to be compatible with lm_sensors. It's a great pity to miss out on such a nice hardware feature as monitoring. What's the problem with supporting it?
rm> monitoring is good. Especially if it is good monitoring. Lm_sensors is not good enough for us. I think the supermon framework for hardware monitors is lots better.
Thanks for the hint. I'll try the supermon.
rm> linuxbios is compatible with lm_sensors, but on this one platform, lm_sensors is breaking linuxbios somehow. It will get fixed.
Well... on sis630 (pcchips 787cl+) lm_sensors doesn't show anything with linuxbios, while with AMIBIOS it does show lots of things.
A little off-topic... I was quite amazed when I found that on the EPIA ESP5000 there is a monitoring chip detected by lm_sensors, but no actual sensors present. :) Does anyone know whether that is really so, or is it just a bug in lm_sensors?
With best regards, Alexander mailto:spirit@reactor.ru
On Sat, 19 Apr 2003, Alexander Amelkin wrote:
Well... on sis630 (pcchips 787cl+) lm_sensors doesn't show anything with linuxbios, while with AMIBIOS it does show lots of things.
that's interesting. we need to figure this one out.
A little off-topic... I was quite amazed when I found that on the EPIA ESP5000 there is a monitoring chip detected by lm_sensors, but no actual sensors present. :) Does anyone know whether that is really so, or is it just a bug in lm_sensors?
we've seen our share of bugs with lm_sensors.
ron
Greetings,
The sensors in the sis950 (superio) have to be turned on. Apparently, the AMIBIOS does that, so lm_sensors doesn't try to initialize it. The standalone supermon driver I wrote for that does the initialization itself.
G'day, sjames
Linuxbios mailing list Linuxbios@clustermatic.org http://www.clustermatic.org/mailman/listinfo/linuxbios
Hello steven,
Saturday, April 19, 2003, 4:20:32 PM, you wrote:
sj> Greetings,
sj> The sensors in the sis950 (superio) have to be turned on. Apparently, the AMIBIOS does that, so lm_sensors doesn't try to initialize it. The standalone supermon driver I wrote for that does the initialization itself.
Supermon is described as a 'high-speed cluster monitoring' thing. What if I don't run any clusters, just a standalone embedded device? The word 'cluster' scares me a little. :) Is supermon small enough to fit in my 16Mb Linux configuration instead of lm_sensors? I don't think I need anything 'high-speed'. I just want to check temperature, fan and voltage parameters from time to time.
With best regards, Alexander mailto:spirit@reactor.ru
Greetings,
Supermon was developed for clustering, but really is fairly general purpose (I'm looking into using it on a variety of servers).
It consists of a few parts. Supermon itself and mon are daemons. mon gathers data from the kernel and passes it up to supermon. supermon simply aggregates the data and serves it to various clients.
The other two parts are supermon_proc and a sensor module. The sensor module just presents the sensor data in /proc/sys/supermon_sensors in the form of an S-expression (a Lisp-like expression, easily parsable and human-readable as well).
G'day, sjames
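For readers who have not worked with S-expressions, the following is a minimal sketch of how such output could be parsed into nested lists. The sample string, the sensor names, and all function names here are hypothetical; the thread does not show what /proc/sys/supermon_sensors actually contains, so treat the layout as an assumption.

```python
def tokenize(text):
    """Split an S-expression string into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def atom(token):
    """Convert numeric atoms to int or float; leave everything else as a string."""
    for convert in (int, float):
        try:
            return convert(token)
        except ValueError:
            pass
    return token


def parse(tokens):
    """Recursively build nested lists from a token list (consumed in place)."""
    if not tokens:
        raise ValueError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise ValueError("missing closing parenthesis")
        tokens.pop(0)  # discard the closing ")"
        return items
    if token == ")":
        raise ValueError("unexpected closing parenthesis")
    return atom(token)


def read_sexpr(text):
    """Parse exactly one S-expression and reject trailing data."""
    tokens = tokenize(text)
    result = parse(tokens)
    if tokens:
        raise ValueError("trailing data after expression")
    return result


# Hypothetical sample; real supermon output may use different names and layout.
SAMPLE = "(sensors (temp1 41.5) (fan1 4821) (vcore 1.48))"

if __name__ == "__main__":
    tree = read_sexpr(SAMPLE)
    readings = {name: value for name, value in tree[1:]}
    print(readings)
```

On a machine with the sensor module loaded, the same read_sexpr call could presumably be fed the contents of the /proc file instead of SAMPLE.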
supermon drivers are smaller and far less complex than lm_sensors drivers, and they run lots faster.
ron
Hello ron,
Sunday, April 20, 2003, 12:06:57 AM, you wrote:
rm> supermon drivers are smaller and far less complex than lm_sensors drivers, and they run lots faster.
Right, but I found that they just won't create anything in the /proc filesystem of my router with a 2.4.20 kernel. Symdm won't create anything either. :( I think, though, that I should take this to the supermon list. Or just stick with lm_sensors, which works without problems for me.
With best regards, Alexander mailto:spirit@reactor.ru
On Sun, 20 Apr 2003, Alexander Amelkin wrote:
Right, but I found that they just won't create anything in the /proc filesystem of my router with a 2.4.20 kernel. Symdm won't create anything either. :( I think, though, that I should take this to the supermon list. Or just stick with lm_sensors, which works without problems for me.
ask the supermon list, although we have not seen this problem.
Or stick with lm_sensors :-)
use what works.
ron
Sounds very similar to a problem observed on the SuperMicro P4DPR/DPE which has the same winbond sensor chip. The solution was to tell lm_sensors to access the Winbond via ISA rather than I2C.
I think there is also some code in the p4dpr/p4dpe linuxbios port to notice the memory initialization problem and work around it (e.g. wait a while and reset?) The details have faded but you might check the code.
Jim
On Fri, 18 Apr 2003, Jim Garlick wrote:
Sounds very similar to a problem observed on the SuperMicro P4DPR/DPE which has the same winbond sensor chip. The solution was to tell lm_sensors to access the Winbond via ISA rather than I2C.
Or do what I did: chuck lm_sensors completely and write a simple little driver using supermon, with much help from Steve James. Way better than that lm_sensors stuff.
ron
Thanks Jim.
Removing the i2c prevented the problem.
But I will try to see where lm_sensors is messing up the bus.
Denis
Jim Garlick garlick@llnl.gov writes:
Sounds very similar to a problem observed on the SuperMicro P4DPR/DPE which has the same winbond sensor chip. The solution was to tell lm_sensors to access the Winbond via ISA rather than I2C.
I think there is also some code in the p4dpr/p4dpe linuxbios port to notice the memory initialization problem and work around it (e.g. wait a while and reset?) The details have faded but you might check the code.
So, the details. On Intel's ICH3 southbridge there is an SMBus/I2C controller that locks up when it sees invalid data on the bus. But it only sees invalid data when you are actually using the bus, so using lm_sensors just increases the probability that the SMBus controller will lock up.
At that point a reboot will fail because the SPD information cannot be read over the SMBus, leading LinuxBIOS to think you have no RAM.
To the best of my knowledge there is no fix. This is just something that needs to be avoided.
Eric
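To make the failure mode concrete, here is a small simulation of the decision a boot firmware faces when probing SPD. It is only an illustration of the reasoning in this message and of Jim's recollection of a wait-and-reset workaround; it is not the LinuxBIOS code. The callback read_byte and the function names are hypothetical. The addresses 0x50 to 0x57 and the use of SPD byte 2 for the memory type follow the usual SPD conventions.

```python
SPD_ADDRESSES = range(0x50, 0x58)  # conventional SMBus addresses for DIMM SPD EEPROMs
SPD_MEMORY_TYPE = 2                # SPD byte 2 holds the memory type code


def probe_dimms(read_byte):
    """Return {address: memory_type} for every SPD EEPROM that answers.

    read_byte(address, offset) is a hypothetical callback that returns an int,
    or raises IOError when the SMBus transaction fails.
    """
    found = {}
    for address in SPD_ADDRESSES:
        try:
            found[address] = read_byte(address, SPD_MEMORY_TYPE)
        except IOError:
            continue  # empty slot, or a controller that no longer responds
    return found


def boot_decision(read_byte):
    """Choose between continuing RAM setup and asking for a power cycle."""
    if probe_dimms(read_byte):
        return "init_ram"
    # No SPD device answered at all: possibly a wedged controller, not missing RAM.
    return "power_cycle"


def healthy_bus(address, offset):
    """Simulated bus with DIMMs in two slots."""
    if address in (0x50, 0x52):
        return 0x07  # memory type code for DDR SDRAM
    raise IOError("no device")


def wedged_bus(address, offset):
    """Simulated bus whose controller has locked up."""
    raise IOError("SMBus controller not responding")


if __name__ == "__main__":
    print(boot_decision(healthy_bus))
    print(boot_decision(wedged_bus))
```

The point of the sketch is that a locked-up controller and an empty board look identical from the firmware's side, which is why the symptom shows up as a "no memory" error.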
Greetings,
If you're using the linuxbios_reset (kernel linuxbios patch), you can have reset write 0x0e to 0xcf9. That will do a 3 second poweroff instead of a simple reset. That should avoid the issue at least.
G'day, sjames
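As a rough illustration of where the value 0x0e comes from, the sketch below composes it from the reset control bits as they are named in Intel's ICH datasheets (SYS_RST, RST_CPU, FULL_RST) and shows one way to write it from user space through /dev/port on Linux. This is an assumption-laden example, not the linuxbios_reset kernel patch itself; the helper names are made up, and the actual port write is left commented out because it resets the machine immediately.

```python
import os

RST_CNT_PORT = 0xCF9  # reset control port named in the message above

# Bit names follow Intel ICH datasheet terminology.
SYS_RST = 1 << 1   # select a hard reset rather than a soft one
RST_CPU = 1 << 2   # setting this bit triggers the reset
FULL_RST = 1 << 3  # power-cycle the board as part of the reset


def reset_value(full=True):
    """Compose the byte to write to the reset control port."""
    value = SYS_RST | RST_CPU
    if full:
        value |= FULL_RST
    return value


def write_port(port, value, device="/dev/port"):
    """Write one byte to an x86 I/O port through /dev/port (Linux, root only)."""
    fd = os.open(device, os.O_WRONLY)
    try:
        os.lseek(fd, port, os.SEEK_SET)
        os.write(fd, bytes([value]))
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(hex(reset_value(full=True)))
    # Uncommenting the next line resets the machine at once, without syncing disks.
    # write_port(RST_CNT_PORT, reset_value(full=True))
```

With full=False the value is 0x06, which requests a plain hard reset without the power-off interval.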
steven james pyro@linuxlabs.com writes:
Greetings,
If you're using the linuxbios_reset (kernel linuxbios patch), you can have reset write 0x0e to 0xcf9. That will do a 3 second poweroff instead of a simple reset. That should avoid the issue at least.
A) lm_sensors would still not be reliable. B) It would quite possibly undo the fix for the PCIH2 that causes the boards to lock up while scanning the PCI bus. And without a cluster like MCR to test on I don't feel comfortable doing that.
Eric
Greetings,
I've been doing that for some time now. I haven't had a great problem, though the sample size is smaller than MCR. How is the OS triggering a power off followed by a cold start going to screw up PCIH2?
G'day, sjames