Hello,
I have recently upgraded my dual MSI-6120 from 2xCeleron (Mendocino) 300@450MHz, with 2xMSI-6905 socket 370 to slot 1 adapters, to one Celeron2 1.3GHz (Tualatin) using a Slot-T adapter card. The plan is to equip the mobo with 2xPIII or preferably 2xVIA C3 Nehemiah (when SMP capable).
The problem is that the latest MSI BIOS (v2.0) for the mobo does not support Coppermine/Tualatin processors. The single-CPU GNU/Linux kernels boot without problems but _extremely_ slowly, slower by at least a factor of 10. hdparm -tT gives around 20/2 compared with 250/25 with the Mendocino CPU(s). For example, uncompressing the kernel takes minute(s) compared to seconds. After some trials and web searching I suspect that the problem is that the level 2 cache is not activated by the BIOS.
I see that you have code in LinuxBIOS for activating caches.
i) Does this code work for 440BX motherboards?
ii) Is it possible to extract this code and try it out after the kernel has booted (slowly), to verify my assumption?
iii) Is there some other tool available for cache activation?
iv) One interesting continuation would be to try to replace the MSI (AMI) BIOS with LinuxBIOS, but as a first step I think this would be a little risky.
Any ideas?
Thanks, Svante
On Tue, 30 Sep 2003, Svante Signell wrote:
i) Does this code work for 440BX motherboards?
it's processor-dependent, 440bx or not is not an issue.
ii) Is it possible to extract this code and try out after the kernel has booted (slowly), to verify my assumption?
yes, we tested it that way. You can try it.
iii) Is there some other tool available for cache activation?
Not sure.
iv) One interesting continuation would be to try to replace the MSI (AMI) BIOS with linuxbios, but as a first step I think this would be a little risky.
well, so far, given the track record of many of these BIOSes, I'm not sure how risky that is ...
ron
Ron,
Thank you for your reply. Maybe there is hope after all.
On Wed, 2003-10-01 at 01:34, ron minnich wrote:
On Tue, 30 Sep 2003, Svante Signell wrote:
i) Does this code work for 440BX motherboards?
it's processor-dependent, 440bx or not is not an issue.
Thanks for the info. To clarify, I'll split the question into three:
i) Does LinuxBIOS work for 440BX-based motherboards, single and dual? Downloading the code from CVS shows support for Intel L440GX+ and a patch for linux-2.4.13, not 440BX or kernels later than 2.4.13. Also, I did not find anything about MSI mainboards.
ii) Does the cache activation code work for Mendocino, Coppermine, Tualatin and newer Intel processors? Will it work for the VIA C3 Nehemiah?
iii) How much of the boot process in GNU/Linux is the BIOS responsible for? I thought that the kernel was only dependent on the BIOS for a few functions, such as different HW initialisations (CPU, memory, disks, etc.), compared to Windows 9x etc. Any pointers?
ii) Is it possible to extract this code and try out after the kernel has booted (slowly), to verify my assumption?
yes, we tested it that way. You can try it.
I will try. Which files do I need in addition to src/cpu/p6/l2_cache.c?
iii) Is there some other tool available for cache activation?
Not sure.
iv) One interesting continuation would be to try to replace the MSI (AMI) BIOS with linuxbios, but as a first step I think this would be a little risky.
well, so far, given the track record of many of these BIOSes, I'm not sure how risky that is ...
With risks I meant the chance of being left with a dead motherboard... I'm always nervous when flashing the BIOS that something will happen, for example a sudden power loss, regardless of where the BIOS originates from.
ron
BTW: Why is this work called LinuxBIOS (except maybe for historical reasons)? Will other OSes (e.g. GNU/Hurd) boot with LinuxBIOS now or in the future? Maybe then something like FreeBIOS should be used instead.
Thanks, Svante
On Wed, 1 Oct 2003, Svante Signell wrote:
i) Does LinuxBIOS work for 440BX-based mother-boards, single and dual? Downloading the code from CVS shows support for Intel L440GX+ and a patch for linux-2.4.13, not 440BX or kernels later than 2.4.13. Also, I did not find anything about MSI mainboards.
single are tested. Dual I don't know.
ii) Does the cache activation code work for Mendocino, Coppermine, Tualatin and newer Intel processors? Will it work for the VIA C3 Nehemiah?
It was only needed for PII. Coppermine and later -- "Just works". It is extremely cpu-dependent.
iii) How much of the boot process in GNU/Linux is the BIOS responsible for? I thought that the kernel was only dependent on the BIOS for a few functions, such as different HW initialisations: CPU, memory, disks, etc compared to Windows 9x etc. Any pointers?
that's about right.
I will try. Which files do I need in addition to src/cpu/p6/l2_cache.c?
none. You have to turn that back into a main() but it should be fine.
With risks I meant the chance of being left with a dead motherboard... I'm always nervous when flashing the BIOS that something will happen, for example a sudden power loss, regardless of where the BIOS originates from.
never do this kind of work without a spare bios part. Never.
BTW: Why is this work called LinuxBIOS (except maybe for historical reasons). Will other OSes (eg GNU/Hurd) boot with LinuxBIOS now or in the future? Maybe then something like FreeBIOS should be used instead.
It was called linuxbios for a simple reason: linux was going to be the bios. linux would be in flash, linux would boot the oses.
Small flashes have caused changes in course in some cases, but the name has stuck anyway. Now that vendors have joined in, changing the name would be hard.
ron
Hi,
Sorry for taking up this thread again but now I have made a test of the l2_cache activation code and have some further questions.
The files put together to make things build are l2_cache.c, printk.c, vsprintf.c, subr.c and the corresponding header files from the linuxbios CVS tree. For subr.c I had to add an include (#include <sys/io.h>) to get outb defined for linking. The result so far is a segfault in the cache_enable() inline assembly routine in l2_cache.c.
0. How to test this code after a _slow_ boot outside the BIOS? Is single user mode sufficient, i.e. init 1?
1. How are these printk statements supposed to work? Is the output directed to some system logfile, like kern.log? How to define this logfile etc. What to change if I want to log debug outputs to the standard out and/or standard err? I don't find any output when running the main program, neither in the system log files or on the screen.
2. Any special compiler and linker switches needed, like -nostdinc, -nostdlib, -nostartfiles, etc? Your build system is Python based, right, so I cannot easily look at Makefiles in the CVS tree.
3. I found where the program halts with gdb and compiling with debug set. One way to trace is single stepping in gdb etc. What is supposed to happen when the DEBUG is defined in l2_cache.c?
4. You state that the L2 cache stuff is only needed for P2 CPUs, not for P3 type CPUs, such as Coppermine or Tualatin. I'm testing with a Celeron 2 CPU (Tualatin), which is of P3 type. What if the BIOS does not recognise the CPU and disables the L2 cache? People claim that AMI BIOSes work this way. Is the enabling code sufficient to make things work?
5. If the slowness is not due to a disabled L2 cache (how to test this properly btw?), can the problems be solved by trying the mtrr or microcode update code?
6. Maybe the problem is still hardware related, like the on-board voltage regulator for the CPU is not working properly, even if there are no indications at all from the on board sensors. However, if the problems are software related and can be solved, do you think it is feasible to replace the AMI BIOS with LinuxBIOS? The probability of getting an updated BIOS from MSI supporting Coppermine and Tualatin processors is probably zero.
Thanks, Svante
Greetings,
To run that code inside linux, you need to add a call to iopl to allow direct hardware access like:
res = iopl(3); if(res) { report_error(); exit(-1); }
or something to that effect. G'day, sjames
Linuxbios mailing list Linuxbios@clustermatic.org http://www.clustermatic.org/mailman/listinfo/linuxbios
Steven,
Thanks for the tip, I'll try adding this in. Preliminary estimates with lmbench-2.0 suggest that the problems are probably due to the missing L2 cache. I'm currently compiling and running lmbench-3, but with an effective speed of 7MHz instead of 1300MHz, things take time...
BTW: If one is picky, shouldn't the test be like if(res == -1)? The man page of iopl says: On success, zero is returned. On error, -1 is returned, and errno is set appropriately. But of course, all values not equal to 0 means "true", right?
Greetings,
Yes, anything non-zero is true. Testing that way (or if(res<0) when the function returns a count) generally helps to catch weirdness: in the bad old days, some functions returned -errno or even errno on error but always 0 on success, and this test catches all of those cases.
G'day, sjames
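A minimal sketch of the iopl check discussed above, assuming an x86 Linux system with glibc's <sys/io.h>. The helper name acquire_io_privilege is made up for illustration and is not part of LinuxBIOS. It treats any non-zero return as failure, which covers the documented -1 as well as the older conventions mentioned, and reports errno.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#if defined(__i386__) || defined(__x86_64__)
#include <sys/io.h>   /* iopl() on x86 Linux with glibc */
#endif

/* Hypothetical helper: raise the I/O privilege level to 3.
 * Returns 0 on success, -1 on failure (a message goes to stderr). */
int acquire_io_privilege(void)
{
#if defined(__i386__) || defined(__x86_64__)
    int res = iopl(3);
    if (res) {   /* any non-zero value is treated as an error */
        fprintf(stderr, "iopl(3) failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
#else
    fprintf(stderr, "iopl() is only available on x86\n");
    return -1;
#endif
}
```

The call normally needs root. Note that IOPL only unlocks the port I/O instructions (in/out) and cli/sti for the process; it does not by itself make other privileged instructions usable from user space.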
I still get a segfault when trying to activate the L2 cache, in the cache_enable() inline assembly routine in l2_cache.c. Is anything else needed to run this program inside GNU/Linux?
On Thu, 6 Nov 2003, Svante Signell wrote:
Sorry for taking up this thread again but now I have made a test of the l2_cache activation code and have some further questions.
you don't need this code any more. The last processor it mattered for is long dead. I am not removing it but if you are having trouble then you have a PII; do you?
- You state that the L2 cache stuff is only needed for P2 CPUs, not for P3 type CPUs, such as Coppermine or Tualatin. I'm testing with a Celeron 2 CPU (Tualatin), which is of P3 type. What if the BIOS does not recognise the CPU and disables the L2 cache? People claim that AMI BIOSes work this way. Is the enabling code sufficient to make things work?
which BIOS?
- If the slowness is not due to a disabled L2 cache (how to test this properly btw?), can the problems be solved by trying the mtrr or microcode update code?
use lmbench to scope out your caches.
ron
On Thu, 2003-11-06 at 15:11, ron minnich wrote:
On Thu, 6 Nov 2003, Svante Signell wrote:
Sorry for taking up this thread again but now I have made a test of the l2_cache activation code and have some further questions.
you don't need this code any more. The last processor it mattered for is long dead. I am not removing it but if you are having trouble then you have a PII; do you?
No, I have a Celeron2 1.3GHz Tualatin processor that works properly on other 440BX-based main boards such as QDI BrillianX 1 and Compaq Presario 5670. I'm using a slot 1 to socket 370 converter, SLOT-T from Upgradeware, for this CPU.
BTW: What do you mean by PIIs are long dead? I think _many_ people are still using PIIs, especially with 440BX-based main boards. What I have done is to extend the life of my old computers and their main boards by exchanging the PIIs with PIIIs, specifically using Celeron2, with 1.4 GHz frequency (or 1.3GHz, since 1.4GHz versions are hard to find currently).
Intel is phasing out these Celeron2s today, but compatible CPUs with decent performance and clock frequencies are becoming available, e.g. the VIA C3 Nehemiah, today at 1.2GHz. Versions up to 2GHz are coming soon. Even SMP capable processors are in this year's roadmap. Now we are talking low power and cheap solutions (and maybe even fan-less). I plan to use SMP-able processors from VIA for my currently problematic motherboard when available. The first step is to make things run with one CPU.
The motherboard is a dual CPU MSI-6120 with built-in SCSI interfaces. It is currently equipped with two SMP-able PII-type Celerons (Mendocino) 300MHz@468MHz. This motherboard has been working properly for four years now and it would (in my opinion) be a big waste to throw it away. I am even considering replacing the problematic BIOS with LinuxBIOS if the problem is found not to be hardware related.
- You state that the L2 cache stuff is only needed for P2 CPUs, not for P3 type CPUs, such as Coppermine or Tualatin. I'm testing with a Celeron 2 CPU (Tualatin), which is of P3 type. What if the BIOS does not recognise the CPU and disables the L2 cache? People claim that AMI BIOSes work this way. Is the enabling code sufficient to make things work?
which BIOS?
The BIOS is from AMI with version 2.0 (a6120v20.exe) supplied by MSI.
- If the slowness is not due to a disabled L2 cache (how to test this properly btw?), can the problems be solved by trying the mtrr or microcode update code?
use lmbench to scope out your caches.
OK, I have now run lmbench on two boxes, one QDI box with a 1.4GHz Celeron2 and the MSI-6120 with a 1.3GHz Celeron2, both with SLOT-T adapters. How do I find out where the bottleneck is, and especially whether the L2 cache is enabled or not? A lot of data is written to the log files. I'm currently trying to find out what all the numbers mean. One quick observation though: the correctly working box is reported as: MHZ: 1409 MHz, 0.71 nanosec clock, while the problematic one is reported as: MHZ: 7 MHz, 142.86 nanosec clock. A factor of 200 in slowdown, welcome back to the old 8086 clock speeds ;-)
ron
Thanks, Svante
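The two lmbench figures quoted above are related by simple arithmetic: the clock period in nanoseconds is 1000 divided by the frequency in MHz. A small sketch that reproduces the numbers (the function names are illustrative only, they are not lmbench's):

```c
#include <stdio.h>

/* Clock period in nanoseconds for a frequency given in MHz. */
double period_ns(double mhz)
{
    return 1000.0 / mhz;
}

/* How many times slower the measured clock is than the reference. */
double slowdown(double ref_mhz, double measured_mhz)
{
    return ref_mhz / measured_mhz;
}

/* Print the comparison for the two boxes mentioned in the thread. */
void print_clock_comparison(void)
{
    printf("1409 MHz -> %.2f ns per clock\n", period_ns(1409.0));
    printf("   7 MHz -> %.2f ns per clock\n", period_ns(7.0));
    printf("slowdown: about %.0fx\n", slowdown(1409.0, 7.0));
}
```

The ratio 1409/7 is the "factor of 200" between the two boxes; against the nominal 1300MHz of the CPU on the slow board the ratio would be somewhat lower, about 186.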
Those lmbench memory tests, with the plots of memory access times, will show you the L1, L2, and memory boundaries.
ron
Hi,
I have now run the lmbench3-0-a3 tests. For the correctly working 1.4 GHz Tualatin CPU the latency numbers show jumps from 2ns to 6ns at 16k array size and from 6ns to 120ns at 256k array size. I assume this indicates correctly working level 1 and 2 caches.
For the erroneous motherboard with a 1.3GHz Tualatin CPU the numbers are around 400ns independent of array size. The only thing changing is that the latency numbers increase to 440-460ns for large values of the stride. My interpretation is that not even the L1 cache is working properly. All other tests indicate a _very_ slow CPU; around 7MHz is measured by lmbench (BTW how good is this value?) compared to the expected 1.3GHz. Two questions immediately arise.
1. Is this slowness reasonable if _no_ caches are working properly?
2. If there is a problem with the on-chip voltage regulator and the CPU clock speed is really 7MHz, as measured by lmbench, can the CPU operate properly at this low speed? I thought there was a _lower_ limit as well as an upper limit for the operating frequency.
Svante
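For readers who want to reproduce the kind of measurement described above without lmbench, here is a rough pointer-chasing sketch in the same spirit as its memory latency test. It is not lmbench's code; the function name and parameters are assumptions for illustration. Each load depends on the previous one, so the average time per load approximates the latency of whatever level of the memory hierarchy the array fits in.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <time.h>

static void *volatile chase_sink;   /* keeps the loop from being optimised away */

/* Average time per dependent load, in ns, over an array of `bytes` bytes
 * walked with the given stride. Returns a negative value on error. */
double chase_latency_ns(size_t bytes, size_t stride, long loads)
{
    size_t n = bytes / sizeof(void *);
    size_t step = stride / sizeof(void *);
    if (step == 0)
        step = 1;
    if (n < 2 * step || loads <= 0)
        return -1.0;

    void **buf = malloc(n * sizeof(void *));
    if (!buf)
        return -1.0;

    /* Build a circular chain: each slot points `step` slots ahead. */
    size_t last = 0;
    for (size_t i = 0; i + step < n; i += step) {
        buf[i] = &buf[i + step];
        last = i + step;
    }
    buf[last] = &buf[0];

    void **p = buf;
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < loads; i++)
        p = (void **)*p;            /* each load depends on the previous one */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    chase_sink = p;

    double elapsed = (double)(t1.tv_sec - t0.tv_sec) * 1e9
                   + (double)(t1.tv_nsec - t0.tv_nsec);
    free(buf);
    return elapsed / (double)loads;
}
```

Calling it for array sizes from a few kB up to a few MB gives a table comparable to the lmbench one. With working caches one would expect the value to step up near the L1 and L2 sizes; a flat, high value at every size is what the faulty board's numbers look like.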
On Fri, Nov 14, 2003 at 08:26:55AM +0100, Svante Signell wrote:
For the erroneous motherboard with a 1.3GHz Tualatin CPU the numbers are around 400ns independent of array size. The only thing changing is that the latency numbers increase to 440-460ns for large values of the stride. My interpretation is that not even the L1 cache is working properly. All other tests indicate a _very_ slow CPU, around 7MHz is measured by lmbench (BTW how good is this value?) compared to the expected 1.3GHz. Two questions immediately arise.
- Is this slowness reasonable if _no_ caches are working properly?
- If there is a problem with the on-chip voltage regulator and the CPU clock speed is really 7MHz, as measured by lmbench, can the CPU operate properly at this low speed. I thought there was a _lower_ limit as well as an upper limit for the operating frequency?
What do these commands say?
cat /proc/mtrr
cat /proc/cpuinfo
On Fri, 2003-11-14 at 08:41, Takeshi Sone wrote:
What do these commands say?
cat /proc/mtrr
cat /proc/cpuinfo
Normal output (1.4GHz Tualatin), cat /proc/mtrr:
reg00: base=0x00000000 ( 0MB), size= 256MB: write-back, count=1
reg01: base=0xe4000000 (3648MB), size= 8MB: write-combining, count=1
Faulty system:
# cat /proc/mtrr
cat: /proc/mtrr: No such file or directory
Faulty system:
$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 11
model name : Intel(R) Celeron(TM) CPU 1300MHz
stepping : 4
cpu MHz : 1340.197
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 2641.10
On Fri, Nov 14, 2003 at 10:46:00PM +0100, Svante Signell wrote:
Faulty system:
# cat /proc/mtrr
cat: /proc/mtrr: No such file or directory
I guess the BIOS does not initialize the MTRRs, and all RAM is uncached. (The MTRRs are the registers that tell the CPU which memory regions to cache.)
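To check this from a program rather than by eye, one could parse the /proc/mtrr lines and test whether a given physical address is covered by a write-back range. The following is only a sketch: the struct and function names are invented here, and it handles only lines with sizes given in MB, as in the output quoted in this thread (the kernel can also print other units).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one /proc/mtrr line. */
struct mtrr_entry {
    int reg;
    unsigned long long base;
    unsigned long size_mb;
    char type[32];
    int count;
};

/* Parse one line such as
 *   reg00: base=0x00000000 ( 0MB), size= 256MB: write-back, count=1
 * Returns 1 on success, 0 if the line does not match. */
int parse_mtrr_line(const char *line, struct mtrr_entry *e)
{
    int got = sscanf(line,
                     "reg%d: base=%llx (%*[^)]), size=%luMB: %31[^,], count=%d",
                     &e->reg, &e->base, &e->size_mb, e->type, &e->count);
    return got == 5;
}

/* True if addr falls inside the range and the range is write-back. */
int covers_write_back(const struct mtrr_entry *e, unsigned long long addr)
{
    unsigned long long size = (unsigned long long)e->size_mb << 20;
    return strcmp(e->type, "write-back") == 0 &&
           addr >= e->base && addr - e->base < size;
}
```

On the working box, reg00 would cover the first 256MB of RAM as write-back. If no entry covers RAM, or the file is missing altogether as on the faulty system, that is at least consistent with RAM being treated as uncached.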
On Sun, 16 Nov 2003, Takeshi Sone wrote:
On Fri, Nov 14, 2003 at 10:46:00PM +0100, Svante Signell wrote:
Faulty system:
# cat /proc/mtrr
cat: /proc/mtrr: No such file or directory
I guess the BIOS does not initialize the MTRRs, and all RAM is uncached. (The MTRRs are the registers that tell the CPU which memory regions to cache.)
no, even if bios does not set mtrr, those registers exist, and are readable. Something weird is going on here!
ron
I did boot another kernel and for that kernel there was one entry for mtrr, so this seems to work. However, now I have tried executing both the mtrr and cache activation code, and when coming to any inline assembly code the program exits with a segfault :( All commented out calls have been tried one after the other by single-stepping with gdb.
Below is the main program I used:
#include <mem.h>
main()
{
    struct mem_range mem;
    int res = iopl(3);
    if (res) { error(); exit(-1); }
    // cache_enable();
    // p6_configure_l2_cache();
    cache_on(mem);
}
Do I need to use gcc-2.95.x instead of gcc-3.3.2 to make the inline assembly run OK? Or is there something about 16bit mode versus 32bit mode?
On Tue, Nov 25, 2003 at 08:34:22PM +0100, Svante Signell wrote:
Do I need to use gcc-2.95.x instead of gcc-3.3.2 to make the inline assembly run OK? Or is there something about 16bit mode versus 32bit mode?
I bet it's not a compiler problem. I guess it's a kernel-mode vs. user-mode problem. I'm not very sure, but some instructions, like those for cache manipulation, need to be run in kernel mode even if IOPL=3. If such instructions are executed in user mode, the result is a segfault. So you have to run your cache activation code in the kernel, either by putting that code in the kernel itself or by making a kernel module to do it.
However, I don't know what exact code you need to enable the cache, given that MTRR is working..
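The privilege point can be made concrete. On x86, IOPL only governs I/O-sensitive instructions such as in, out, cli and sti; moves to and from CR0, wbinvd, rdmsr and wrmsr fault outside ring 0 regardless of IOPL. The sketch below therefore only computes the CR0 value that cache-enable code would write, without touching the register. The CD (bit 30) and NW (bit 29) positions are architectural; the function names are invented for illustration and are not taken from the LinuxBIOS sources.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_NW (UINT32_C(1) << 29)  /* Not Write-through */
#define CR0_CD (UINT32_C(1) << 30)  /* Cache Disable */

/* Caching is in its normal mode only when both CD and NW are clear. */
int cr0_cache_enabled(uint32_t cr0)
{
    return (cr0 & (CR0_CD | CR0_NW)) == 0;
}

/* Value that would be written back to CR0 to enable caching.
 * All other bits are left unchanged. */
uint32_t cr0_with_cache_enabled(uint32_t cr0)
{
    uint32_t out = cr0 & ~(CR0_CD | CR0_NW);
    assert(cr0_cache_enabled(out));
    return out;
}

/* Value that would be written to disable caching (CD set). Real code
 * also flushes the caches with wbinvd, which needs ring 0 as well. */
uint32_t cr0_with_cache_disabled(uint32_t cr0)
{
    return cr0 | CR0_CD;
}
```

Inside a kernel module the same masks would be applied to the value actually read from CR0; in user space only the arithmetic can be checked.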
what I did to test this code was to build a kernel module for my linux, with this code inside, and insmod the kernel module.
ron
Ron and Takeshi,
Thanks for the tip. I'll try that next. Any pointers how to create a kernel module? So far I have only been writing code for user space.
I have now made a small kernel module based on l2_cache.c giving the following output:
Nov 30 17:46:56 cl-dual kernel: Configuring L2 cache...CPU signature of 6b0 so no L2 cache configuration
Nov 30 17:46:56 cl-dual kernel: Enable Cache
Nov 30 17:46:56 cl-dual kernel: done.
Nov 30 17:46:56 cl-dual kernel: cache_on installed
No speed-up seen. Extremely slow as before. Any hints? mtrr is OK, I believe. Is it the microcode??
The processor is a 1.3GHz Celeron Tualatin, with CPUID 6b0. According to the code in l2_cache.c, CPUs from Coppermine (680) onward do not need the L2 setup code. Is this the case?
if (signature < 0x630 || signature >= 0x680) {
    printk_debug("CPU signature of %x so no L2 cache configuration\n", signature);
    goto done;
A few questions:
1. Does a kernel module have to be a standalone object without a linking stage?
2. How do I add libraries to link with if unresolved externals show up?
3. How do I create a kernel module consisting of more than one object file? For now I include the needed source files in the main one.
4. The cflags used are:
CFLAGS = -D__KERNEL__ -DMODULE -I ./include -I$/usr/src/linux/include -I /usr/src/kernel-headers-2.4.22-1 -O2 -Wall -g
5. My module code looks like:
cat cache_on.c

#include <linux/module.h>
#include <linux/kernel.h>
(#include source files and other header files)

#define PFX " cache_on "

int init_module(void)
{
    p6_configure_l2_cache();
    printk(KERN_INFO PFX " installed \n");
    return 0;
}

void cleanup_module(void)
{
    printk(KERN_INFO PFX " removed\n");
}
Hi Svante,
----- Original Message ----- From: "Svante Signell" svante.signell@telia.com To: "ron minnich" rminnich@lanl.gov Cc: "Takeshi Sone" ts1@tsn.or.jp; linuxbios@clustermatic.org Sent: Monday, December 01, 2003 10:14 AM Subject: Re: Level 2 cache activation code?
I have now made a small kernel module based on l2_cache.c giving the following output:
Nov 30 17:46:56 cl-dual kernel: Configuring L2 cache...CPU signature of 6b0 so no L2 cache configuration
Nov 30 17:46:56 cl-dual kernel: Enable Cache
Nov 30 17:46:56 cl-dual kernel: done.
Nov 30 17:46:56 cl-dual kernel: cache_on installed
No speed-up seen. Extremely slow as before. Any hints? mtrr is OK, I believe. Is it the microcode??
The processor is a 1.3GHz Celeron Tualatin, with CPUID 6b0. According to the code in l2_cache.c, CPUs from Coppermine (680) onward do not need the L2 setup code. Is this the case?
This was based on the assumption that all CPUs from the Coppermine onward have the cache integrated onto the CPU die. Is this the case with your CPU? Is it just a single large CPU on the slot 1 PCB, or do there appear to be cache chips mounted on the board as well?
if (signature < 0x630 || signature >= 0x680) {
    printk_debug("CPU signature of %x so no L2 cache configuration\n", signature);
    goto done;
You could always just drop this test and see what happens later. If the CPU does have external cache chips then this code might just work in initialising the cache.
A few questions:
- Does a kernel module have to be a standalone object without linking stage?
Yes just an object file compiled with the correct module flags.
- How to add libraries to link with, if unresolved externals show up.
They should not. It is possible to add libraries but generally it is simpler to keep all of the module code in the one file. Modules can reference other modules if required. The depmod program will then load all required modules.
- How to create a kernel module consisting of more than one object file. Now I include the needed source files into the main one.
Generally this is the easiest way to test this.
- The cflags used are:
CFLAGS = -D__KERNEL__ -DMODULE -I ./include -I$/usr/src/linux/include -I /usr/src/kernel-headers-2.4.22-1 -O2 -Wall -g
Have a look at a normal kernel compile and compare these arguments against what is used for the other modules.
- My module code looks like:
cat cache_on.c
#include <linux/module.h>
#include <linux/kernel.h>
(#include source files and other header files)

#define PFX " cache_on "

int init_module(void)
{
    p6_configure_l2_cache();
    printk(KERN_INFO PFX " installed \n");
    return 0;
}

void cleanup_module(void)
{
    printk(KERN_INFO PFX " removed\n");
}
Looks fine. Turn on as much debugging in the l2_cache code as possible and post to me and I will decode. Need to be able to see all of the printk_debug messages.
Regards, Denis
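The signature test discussed above is easy to check in isolation. The following is a minimal sketch, assuming the logged value 6b0 is the low 12 bits of CPUID leaf 1 EAX with the stepping masked off (the /proc/cpuinfo output earlier in the thread reports family 6, model 11, stepping 4). The function names are hypothetical; only the range comparison is taken from the quoted l2_cache.c fragment.

```c
#include <assert.h>

/* Family, model and stepping fields of a P6-era CPU signature. */
struct cpu_sig {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* Split a 12-bit signature such as 0x6b0 into its fields.
 * Extended family/model bits are ignored in this sketch. */
struct cpu_sig decode_signature(unsigned sig)
{
    struct cpu_sig s;
    s.family   = (sig >> 8) & 0xf;
    s.model    = (sig >> 4) & 0xf;
    s.stepping = sig & 0xf;
    return s;
}

/* Mirrors the quoted test: the L2 setup sequence is skipped when
 * signature < 0x630 or signature >= 0x680, so it is attempted only
 * for signatures in the range [0x630, 0x680). */
int l2_config_attempted(unsigned sig)
{
    assert(sig <= 0xfff);
    return !(sig < 0x630 || sig >= 0x680);
}
```

For 0x6b0 this decodes to family 6, model 11 and reports that the setup sequence is skipped, which matches the "so no L2 cache configuration" line in the first log.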
On Mon, 2003-12-01 at 00:47, Denis Dowling wrote:
Hi Svante,
----- Original Message ----- From: "Svante Signell" svante.signell@telia.com To: "ron minnich" rminnich@lanl.gov Cc: "Takeshi Sone" ts1@tsn.or.jp; linuxbios@clustermatic.org Sent: Monday, December 01, 2003 10:14 AM Subject: Re: Level 2 cache activation code?
The processor is a 1.3GHz Celeron Tualatin, with CPUID 6b0. According to the code in l2_cache.c, CPUs from Coppermine (680) onward do not need the L2 setup code. Is this the case?
This was based on the assumption that all CPUs from the Coppermine onward have the cache integrated onto the CPU die. Is this the case with your CPU? Is it just a single large CPU on the slot 1 PCB, or do there appear to be cache chips mounted on the board as well?
The CPU is placed on a socket 370 to slot 1 adapter (SLOT-T) from Upgradeware. No, there are no external cache chips on the MOBO. BTW, the MOBO is a dual CPU 82443BX board (MSI-6120). It runs perfectly well with dual Celerons (Mendocino). Also, the CPU placed on the SLOT-T adapter works perfectly well with other (single CPU) boards.
if (signature < 0x630 || signature >= 0x680) {
    printk_debug("CPU signature of %x so no L2 cache configuration\n", signature);
    goto done;
You could always just drop this test and see what happens later. If the CPU does have external cache chips then this code might just work in initialising the cache.
I have disabled the test, and the cache activation seems to work; see below. The slowness remains, however :-(
Dec 4 14:39:56 cl-dual kernel: Configuring L2 cache...Disable Cache
Dec 4 14:39:56 cl-dual kernel: rdmsr(0x17) = 0, 84320000
Dec 4 14:39:56 cl-dual kernel: L2 Cache latency is 1
Dec 4 14:39:56 cl-dual kernel: Sending 0 to set_l2_register4
Dec 4 14:39:56 cl-dual kernel: L2 ECC Checking is enabled
Dec 4 14:39:56 cl-dual kernel: L2 Physical Address Range is 512M
Dec 4 14:39:56 cl-dual kernel: Maximum cache mask is 2000
Dec 4 14:39:56 cl-dual kernel: L2 Cache Mask is 0
Dec 4 14:39:56 cl-dual kernel: read_l2(2) = 0
Dec 4 14:39:56 cl-dual kernel: write_l2(2) = 0
Dec 4 14:39:56 cl-dual kernel: Enable Cache
Dec 4 14:39:56 cl-dual kernel: L2 Cache size is 256K
Dec 4 14:39:56 cl-dual kernel: L2 Cache lines initialized
Dec 4 14:39:56 cl-dual kernel: Disable Cache
Dec 4 14:39:56 cl-dual kernel: Enable Cache
Dec 4 14:39:56 cl-dual kernel: done.
Dec 4 14:39:56 cl-dual kernel: cache_on installed
Looks fine. Turn on as much debugging in the l2_cache code as possible and post to me and I will decode. Need to be able to see all of the printk_debug messages.
Regards, Denis
What is wrong here:
- Not caches
- Not mtrr
- microcode??
- anything else??
- HW fault, i.e. the VRM does not work as expected, even though lm-sensors are reporting correct voltages.
- The BIOS is not supporting Coppermine and later CPUs. AMI BIOS V2.0 from (MSI)
Soon giving up...
On Mon, Dec 01, 2003 at 12:14:14AM +0100, Svante Signell wrote:
No speed-up seen. Extremely slow as before. Any hints? mtrr is OK, I believe. Is it the microcode??
You could try the microcode driver of Linux or code from LinuxBIOS.
- How to create a kernel module consisting of more than one object file. Now I include the needed source files into the main one.
ld -o module.o -r obj1.o obj2.o
On Fri, 14 Nov 2003, Svante Signell wrote:
I have now run the lmbench3-0-a3 tests. For the correctly working 1.4 GHz Tualatin CPU the latency numbers show jumps from 2ns to 6ns at 16k array size and from 6ns to 120ns at 265k array size. I assume this indicates correctly working level 1 and 2 caches.
yes.
For the erroneous motherboard with a 1.3GHz Tualatin CPU the numbers are around 400ns independent of array size. The only thing changing is that the latency numbers increase to 440-460ns for large values of the stride. My interpretation is that not even the L1 cache is working properly. All other tests indicate a _very_ slow CPU, around 7MHz is measured by lmbench (BTW how good is this value?) compared to the expected 1.3GHz. Two questions immediately arise.
weird. I have no idea what's going on here. Something is really wrong.
ron
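The latency figures quoted above come from timing chains of dependent loads over arrays of increasing size: while the array fits in a cache level the time per load stays flat, and it jumps when the array outgrows that level. A minimal sketch of the idea follows, assuming a simple fixed-stride chain measured in array elements; the function names are invented here, and a real tool such as lmbench's lat_mem_rd takes considerably more care over timing and access patterns.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Link the array into a cycle: element i points to (i + stride) % n.
 * If stride and n are coprime, the cycle visits every element. */
void build_chain(size_t *next, size_t n, size_t stride)
{
    assert(next != NULL && n > 0);
    for (size_t i = 0; i < n; i++)
        next[i] = (i + stride) % n;
}

/* Follow the chain for `steps` dependent loads, starting at index 0.
 * Each load needs the result of the previous one, so the loads
 * cannot overlap. Returns the final index. */
size_t chase(const size_t *next, size_t steps)
{
    size_t i = 0;
    while (steps--)
        i = next[i];
    return i;
}

/* Average CPU time per dependent load, in nanoseconds. */
double ns_per_load(const size_t *next, size_t steps)
{
    assert(steps > 0);
    clock_t t0 = clock();
    volatile size_t sink = chase(next, steps);  /* keep the loop alive */
    clock_t t1 = clock();
    (void)sink;
    return (double)(t1 - t0) / CLOCKS_PER_SEC * 1e9 / (double)steps;
}
```

Running `ns_per_load` with several million steps over arrays from a few KB up to a few MB should show plateaus on a healthy machine; a flat, high figure at every size, like the roughly 400ns reported for the faulty board, is what one would expect if no cache level were being hit.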
ron minnich wrote:
On Fri, 14 Nov 2003, Svante Signell wrote:
I have now run the lmbench3-0-a3 tests. For the correctly working 1.4 GHz Tualatin CPU the latency numbers show jumps from 2ns to 6ns at 16k array size and from 6ns to 120ns at 265k array size. I assume this indicates correctly working level 1 and 2 caches.
yes.
For the erroneous motherboard with a 1.3GHz Tualatin CPU the numbers are around 400ns independent of array size. The only thing changing is that the latency numbers increase to 440-460ns for large values of the stride. My interpretation is that not even the L1 cache is working properly. All other tests indicate a _very_ slow CPU, around 7MHz is measured by lmbench (BTW how good is this value?) compared to the expected 1.3GHz. Two questions immediately arise.
weird. I have no idea what's going on here. Something is really wrong.
I don't think I can comment with much precision here, but...
My early experience with disabled cache is that the system gets REALLY slow. PIIIs (I think) will read a full cache line for every word they need. That means that if you have a 32-byte cache line and read the entire line one 32-bit word at a time (8 accesses), the PIII will read that entire cache line 8 times, once for each word access. This may apply only to code fetches.
It gets really ridiculous when this is happening while executing code over the ISA bus (from ROM).
Cheers! Ty
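The arithmetic in Ty's description can be written out directly. This is only a worked example of the behaviour as described there (offered in that message as a recollection, possibly limited to code fetches), not a statement about what any particular CPU does; the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes moved over the bus to consume one cache line's worth of data
 * word by word, if every word access re-fetches the whole line. */
size_t bytes_fetched_uncached(size_t line_bytes, size_t word_bytes)
{
    assert(word_bytes > 0 && line_bytes % word_bytes == 0);
    size_t accesses = line_bytes / word_bytes;  /* 32 / 4 = 8 */
    return accesses * line_bytes;               /* 8 * 32 = 256 */
}

/* How many times more traffic that is than a single line fill. */
size_t amplification(size_t line_bytes, size_t word_bytes)
{
    return bytes_fetched_uncached(line_bytes, word_bytes) / line_bytes;
}
```

With a 32-byte line and 4-byte words this gives 256 bytes of traffic for 32 bytes of useful data, an eightfold increase in bus traffic on top of losing the cache hits themselves.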