Eric,
I have checked the latest code again for Opteron MB. The MB still needs to reset three times to get it done.
1. Hyper transport scan link: 0 max: 1
   PCI: 01:01.0 [1022/7450] enabled next_unitid: 0003
   PCI: 01:03.0 [1022/7460] enabled next_unitid: 0007
   HyperT reset needed
2. Initializing devices...
   PCI: 00:18.3 init
   NB: Function 3 Misc Control.. resetting cpu
3. Initializing devices...
   PCI: 00:18.3 init
   NB: Function 3 Misc Control.. done.
   PCI: 00:19.3 init
   NB: Function 3 Misc Control.. resetting cpu
Can we reset the MB only one time to make the HT work at the needed speed/width?
I mean only enable the reset in PCI: 00:19.3 init NB: Function 3 Misc Control.. resetting cpu
Regards
YH
* YhLu YhLu@tyan.com [031118 22:50]:
> I have checked the latest code again for Opteron MB. The MB still needs to reset three times to get it done.
> Can we reset the MB only one time to make the HT work at the needed speed/width?
It should be enough to set a global variable "reset_needed" (maybe even in CMOS) and check this, probably after all device drivers have been executed and thus had the chance to set that flag.
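A minimal sketch of that delayed-reset idea, assuming a simplified driver model. The names `struct device`, `init_all_devices`, `hard_reset`, and the three `init_*` callbacks are hypothetical and are not taken from the LinuxBIOS tree; `hard_reset` only counts calls so the behaviour can be checked on a host machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device record; not the LinuxBIOS device structure. */
struct device {
    const char *name;
    /* Returns true if the new settings only take effect after a reset. */
    bool (*init)(void);
};

static bool reset_needed = false; /* the global flag from the proposal */
static int resets_issued = 0;     /* stand-in for a real hard reset */

static void hard_reset(void)
{
    resets_issued++;
}

/* Run every driver, let each one request a reset, then reset at most once. */
static void init_all_devices(struct device *devs, size_t count)
{
    assert(devs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (devs[i].init())
            reset_needed = true;
    }
    if (reset_needed) {
        reset_needed = false;
        hard_reset();
    }
}

/* Illustrative drivers: two request a reset, one does not. */
static bool init_ht_link(void)      { return true; }
static bool init_misc_control(void) { return true; }
static bool init_no_reset(void)     { return false; }
```

With three drivers that each ask for a reset, `init_all_devices` still resets only once. On real hardware a RAM variable does not survive the reset, so after rebooting the drivers would have to find their settings already in place (or read a flag kept in CMOS) to avoid asking again.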
> I mean only enable the reset in PCI: 00:19.3 init NB: Function 3 Misc Control.. resetting cpu
When changing the parameters on the Solo, I could always get it as far as one of the reboots. My best result was reaching the Misc Control reset before it would hang hard.
Stefan
Stefan Reinauer stepan@suse.de writes:
> * YhLu YhLu@tyan.com [031118 22:50]:
> > I have checked the latest code again for Opteron MB. The MB still needs to reset three times to get it done.
> > Can we reset the MB only one time to make the HT work at the needed speed/width?
> It should be enough to set a global variable "reset_needed" (maybe even in CMOS) and check this, probably after all device drivers have been executed and thus had the chance to set that flag.
> > I mean only enable the reset in PCI: 00:19.3 init NB: Function 3 Misc Control.. resetting cpu
> When changing the parameters on Solo I always got it to one of the reboots - My best result was to get it to the Misc Control reset before it would hang hard.
Reducing the number of reboots definitely needs to happen. I only coded it the way it is currently because that generates obviously correct code.
Right now I am looking at how to reduce problems when a board has a noisy smbus. So far I have not seen a single board without one. With a noisy smbus, the more traffic you have, the more chances there are that you will have problems because of it.
So it might make sense to set everything up in a very early pass before memory reset. Reset the system, and then the existing resets will not trigger.
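The reason the later resets would stop triggering can be sketched as follows, assuming a simplified model in which programmed link settings only become active on reset. `struct ht_link`, `ht_program`, and `ht_reset` are hypothetical names for illustration, not LinuxBIOS functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one HT link: programmed values latch on reset. */
struct ht_link {
    unsigned active_width, pending_width; /* bits */
    unsigned active_freq, pending_freq;   /* MHz */
};

/* Program the wanted values and report whether a reset is still needed. */
static bool ht_program(struct ht_link *l, unsigned width, unsigned freq)
{
    assert(l != NULL);
    l->pending_width = width;
    l->pending_freq = freq;
    return l->pending_width != l->active_width ||
           l->pending_freq != l->active_freq;
}

/* Stand-in for a reset: the pending values become the active ones. */
static void ht_reset(struct ht_link *l)
{
    l->active_width = l->pending_width;
    l->active_freq = l->pending_freq;
}
```

In this model the early pass calls `ht_program` and resets once. When the existing code later programs the same values, `ht_program` reports that nothing changed, so no further reset is requested.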
Either that or Stefan's delayed scheme. At this point I am not certain which will be easier to maintain. If we do implement the delayed reset I want to move the memory clear code up into the generic framework so we don't have to clear memory twice. But if we can get the resets over with before we initialize memory we are in better shape. That plus something like the kernel's quirk interface to handle the various known bits of buggy hardware and we should be ok.
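A quirk interface of that kind could look roughly like the table-driven sketch below. It is loosely modelled on the idea of the kernel's PCI quirks, not on its actual API; `struct quirk`, `quirk_table`, `apply_quirks`, and `fixup_example` are hypothetical names. The 1022/7450 ID is borrowed from the log at the top of the thread purely as a placeholder and is not meant to say that the device is buggy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pci_id {
    uint16_t vendor;
    uint16_t device;
};

/* One table entry: which device it matches and what to do about it. */
struct quirk {
    struct pci_id id;
    void (*fixup)(struct pci_id id);
};

static int fixups_applied = 0;

/* Placeholder fixup; a real one would reprogram the affected device. */
static void fixup_example(struct pci_id id)
{
    (void)id;
    fixups_applied++;
}

static const struct quirk quirk_table[] = {
    { { 0x1022, 0x7450 }, fixup_example },
};

/* Run every fixup registered for this device; return how many matched. */
static int apply_quirks(struct pci_id id)
{
    int matched = 0;
    for (size_t i = 0; i < sizeof(quirk_table) / sizeof(quirk_table[0]); i++) {
        const struct quirk *q = &quirk_table[i];
        if (q->id.vendor == id.vendor && q->id.device == id.device) {
            assert(q->fixup != NULL);
            q->fixup(id);
            matched++;
        }
    }
    return matched;
}
```

Keeping the workarounds in one table means the generic init path stays the same and each known-bad part gets a single, easy to find entry.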
Eric
* Eric W. Biederman ebiederman@lnxi.com [031119 00:16]:
> So it might make sense to set everything up in a very early pass before memory reset. Reset the system, and then the existing resets will not trigger.
> Either that or Stefan's delayed scheme. At this point I am not certain which will be easier to maintain. If we do implement the delayed reset I want to move the memory clear code up into the generic framework so we don't have to clear memory twice. But if we can get the resets over with before we initialize memory we are in better shape. That plus something like the kernel's quirk interface to handle the various known bits of buggy hardware and we should be ok.
Would cache-as-ram be an alternative for AMD64 cpus to move the pretty complex ht code to a point as early as possible? Romcc really does a great job, but IIRC the current ht speed code is where it is now because it is really hard to do before there is memory.
I also suspect AMD is doing a better job of keeping cache-as-ram initialization code the same across different cpu steppings than Intel managed with their P-IV.
Stefan
I think we should ask for a working cache-as-ram solution for both future K8 chips and for the (rumored) K9.
ron
ron minnich rminnich@lanl.gov writes:
> I think we should ask for a working cache-as-ram solution for both future K8 chips and for the (rumored) K9
Go right ahead. If you need leverage, Intel has added a PAL opcode for it to the Itanium, and the latest Itaniums support it. And the PEI stage of EFI will require it or something equivalent.
Eric
Stefan Reinauer stepan@suse.de writes:
> * Eric W. Biederman ebiederman@lnxi.com [031119 00:16]:
> > So it might make sense to set everything up in a very early pass before memory reset. Reset the system, and then the existing resets will not trigger.
> > Either that or Stefan's delayed scheme. At this point I am not certain which will be easier to maintain. If we do implement the delayed reset I want to move the memory clear code up into the generic framework so we don't have to clear memory twice. But if we can get the resets over with before we initialize memory we are in better shape. That plus something like the kernel's quirk interface to handle the various known bits of buggy hardware and we should be ok.
> Would cache-as-ram be an alternative for AMD64 cpus to move the pretty complex ht code to a point as early as possible?
If I can get a commitment from a cpu vendor to support it.
> Romcc really does a great job, but IIRC the current ht speed code is where it is now because it is really hard to do before there is memory.
In part at the time I was just using the 8 general purpose registers. So we have a little more room. It is also where it is because that is a very good general purpose place to put it. I have found an instruction that will allow me to extract or insert a bit field into one of the xmm registers. Which may also help.
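The instruction is not named here. Assuming it is something like the SSE2 `pinsrw`/`pextrw` pair, the idea of using an xmm register as extra storage can be sketched with the corresponding compiler intrinsics. The helper names `stash_word3`, `fetch_word3`, and `check_roundtrip` are hypothetical, and the example runs as ordinary host code on an x86 machine rather than as romcc output.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics; x86 only */
#include <stdint.h>

/* Store a 16-bit value in lane 3 of an xmm register (pinsrw). */
static __m128i stash_word3(__m128i reg, uint16_t value)
{
    return _mm_insert_epi16(reg, value, 3);
}

/* Read the 16-bit value back out of lane 3 (pextrw). */
static uint16_t fetch_word3(__m128i reg)
{
    return (uint16_t)_mm_extract_epi16(reg, 3);
}

/* Round-trip check: the value written to lane 3 reads back unchanged. */
static void check_roundtrip(void)
{
    __m128i reg = _mm_setzero_si128();
    reg = stash_word3(reg, 0x1234);
    assert(fetch_word3(reg) == 0x1234);
}
```

Each xmm register holds eight such 16-bit lanes, which is where the extra room beyond the general-purpose registers would come from.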
> I also suspect AMD is doing a better job of keeping cache-as-ram initialization code the same across different cpu steppings than Intel managed with their P-IV.
Well I wrote that code not Intel. Which was essentially the problem.
It is a support nightmare to have someone plug in a new processor and have it totally fail because the cache acts differently.
I have never tried hard on the AMD processors but the limited attempt I made had problems because of the additional complexity of TOP_MEM and the io range registers.
I like the idea. I just rest easier at night with romcc because I know some new thing won't cause me to rewrite everything.
Eric