Ron,
Please check in the Tyan s2850/2880/2881/2882/4880 updates into the CVS Tree.
1. northbridge/amd/amdk8/raminit.h: change uint8_t to uint16_t
2. southbridge/amd/amd8111/amd8111_early_smbus.c: update smbus_write_byte
3. southbridge/amd/amd8131/amd8131_bridge.c: update ioapic_anable to enable PCI-X MASTER Mode.
4. other in /src/mainboard/tyan/ and /targets/tyan
Stefan, with updates 1 and 2 you can get rid of FAKE_SPD_ROM. You need to change some lines in auto.c for quartet:
1. I2C HUB address: 0x30 --> 0x18
2. RC0 -> (1<<1)<<8, RC1 -> (1<<2)<<8, RC2 -> (1<<3)<<8, RC3 -> (1<<4)<<8
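For reference, the RC values above work out to 0x0200, 0x0400, 0x0800 and 0x1000, none of which fits in 8 bits. The sketch below assumes (this is not stated in the message) that the channel select is OR-ed with an SPD device address into a single field, which would explain why update 1 widens uint8_t to uint16_t. SPD_ADDR and the helper names are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Channel-select values as given in the message. */
#define RC0 ((1 << 1) << 8)   /* 0x0200 */
#define RC1 ((1 << 2) << 8)   /* 0x0400 */
#define RC2 ((1 << 3) << 8)   /* 0x0800 */
#define RC3 ((1 << 4) << 8)   /* 0x1000 */

/* Hypothetical SPD device address, for illustration only. */
#define SPD_ADDR 0x50

/* Assumed encoding: channel select in the high byte, address in the low byte. */
uint16_t spd_device(unsigned channel, unsigned addr)
{
	return (uint16_t)(channel | addr);
}

/* The same value stored in 8 bits silently loses the channel select. */
uint8_t spd_device_truncated(unsigned channel, unsigned addr)
{
	return (uint8_t)(channel | addr);
}

/* Small self-check of the arithmetic. */
void spd_device_selfcheck(void)
{
	assert(spd_device(RC0, SPD_ADDR) == 0x0250);
	assert(spd_device_truncated(RC0, SPD_ADDR) == 0x50);
}
```

With a uint8_t field every channel would collapse to the same address, so all four RC values would look identical to the code reading the field.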
Regards
YH.
stefan and eric, we will look at these changes tomorrow. If you get a chance can you scan them too to make sure no obvious problems exist?
thanks
ron

p.s. Ollie, Greg, and I will be combing linuxbios source this week, looking at issues and working on integrating the newest LNXI code in as well as this code.
ron minnich rminnich@lanl.gov writes:
stefan and eric, we will look at these changes tomorrow. If you get a chance can you scan them too to make sure no obvious problems exist?
thanks
ron

p.s. Ollie, Greg, and I will be combing linuxbios source this week, looking at issues and working on integrating the newest LNXI code in as well as this code.
I currently have a very odd issue. I am getting hang ups with B3 stepping cpus with the latest version of the LinuxBIOS tree that I have.
I'm not certain what the problem is, but this will create a delay in getting the last of my bug fixes out.
Eric
I committed all this except for one thing:
diff -uNr ./freebios2/src/southbridge/amd/amd8111/amd8111_early_smbus.c ../freebios2/src/southbridge/amd/amd8111/amd8111_early_smbus.c
--- ./freebios2/src/southbridge/amd/amd8111/amd8111_early_smbus.c  2003-10-13 06:01:13.000000000 -0400
+++ ../freebios2/src/southbridge/amd/amd8111/amd8111_early_smbus.c  2003-12-01 16:14:18.000000000 -0500
@@ -120,24 +120,19 @@
                 return;
         }
 
-        /* setup transaction */
-        /* disable interrupts */
-        outw(inw(SMBUS_IO_BASE + SMBGCTL) & ~((1<<10)|(1<<9)|(1<<8)|(1<<4)),
-                SMBUS_IO_BASE + SMBGCTL);
+//By LYH Begin
+        outb(0x37,SMBUS_IO_BASE + SMBGSTATUS);
         /* set the device I'm talking too */
-        outw(((device & 0x7f) << 1) | 1, SMBUS_IO_BASE + SMBHSTADDR);
-        outb(address & 0xFF, SMBUS_IO_BASE + SMBHSTCMD);
-        /* set up for a byte data write */ /* FIXME */
-        outw((inw(SMBUS_IO_BASE + SMBGCTL) & ~7) | (0x1), SMBUS_IO_BASE + SMBGCTL);
-        /* clear any lingering errors, so the transaction will run */
-        /* Do I need to write the bits to a 1 to clear an error? */
-        outw(inw(SMBUS_IO_BASE + SMBGSTATUS), SMBUS_IO_BASE + SMBGSTATUS);
+        outw(((device & 0x7f) << 1) | 0, SMBUS_IO_BASE + SMBHSTADDR);
+
+        /* data to send */
+        outb(val, SMBUS_IO_BASE + SMBHSTDAT);
 
-        /* clear the data word...*/
-        outw(val, SMBUS_IO_BASE + SMBHSTDAT);
+        outb(address & 0xFF, SMBUS_IO_BASE + SMBHSTCMD);
 
         /* start the command */
-        outw((inw(SMBUS_IO_BASE + SMBGCTL) | (1 << 3)), SMBUS_IO_BASE + SMBGCTL);
+        outb(0xa, SMBUS_IO_BASE + SMBGCTL);
+//By LYH END
 
         /* poll for transaction completion */
         smbus_wait_until_done();
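One change in this diff is easy to miss: the low bit of the value written to SMBHSTADDR goes from 1 to 0. In SMBus, as in I2C, the low bit of the address byte selects the direction (1 for read, 0 for write). Assuming SMBHSTADDR follows that convention, a minimal sketch of the encoding looks like this; the SMBUS_READ and SMBUS_WRITE names and the helper function are hypothetical and not taken from the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Direction bit in the SMBus address byte (hypothetical names). */
#define SMBUS_WRITE 0u
#define SMBUS_READ  1u

/* Build the address value: 7-bit device address in bits 7..1, direction in bit 0. */
uint16_t smbus_addr_word(uint8_t device, unsigned rw)
{
	return (uint16_t)(((device & 0x7fu) << 1) | (rw & 1u));
}

/* Small self-check using the 7-bit address 0x18 as an example. */
void smbus_addr_selfcheck(void)
{
	assert(smbus_addr_word(0x18, SMBUS_WRITE) == 0x30);
	assert(smbus_addr_word(0x18, SMBUS_READ) == 0x31);
}
```

Under that assumption, the old `| 1` marked the transfer as a read even though the function is a byte write, and the new `| 0` marks it as a write.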
I'm a little worried about removing some of those lines, such as:
/* disable interrupts */
outw(inw(SMBUS_IO_BASE + SMBGCTL) & ~((1<<10)|(1<<9)|(1<<8)|(1<<4)),
        SMBUS_IO_BASE + SMBGCTL);
does it hurt to leave this in? OR:
- /* clear any lingering errors, so the transaction will run */
- /* Do I need to write the bits to a 1 to clear an error? */
- outw(inw(SMBUS_IO_BASE + SMBGSTATUS), SMBUS_IO_BASE + SMBGSTATUS);
why remove this?
comments?
ron
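To make the question about the removed status write concrete: if SMBGSTATUS is a write-1-to-clear register (the original comment itself is unsure, so treat this as an assumption), then reading the register and writing the value back clears every bit that is currently set, while the patch's fixed outb(0x37, ...) clears only the bits covered by 0x37. The sketch below simulates such a register in plain C; the variable, function names and bit positions are hypothetical and do not touch real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated status register standing in for SMBUS_IO_BASE + SMBGSTATUS. */
static uint16_t fake_status;

/* Test helper: pretend the hardware latched some status bits. */
void status_set(uint16_t bits)
{
	fake_status |= bits;
}

uint16_t status_read(void)
{
	return fake_status;
}

/* Write-1-to-clear: each 1 bit written clears the matching status bit. */
void status_write(uint16_t val)
{
	fake_status &= (uint16_t)~val;
}

/* Old code path: write back whatever is set, clearing all of it. */
void clear_by_readback(void)
{
	status_write(status_read());
}

/* New code path: clear a fixed set of low bits only. */
void clear_by_mask(void)
{
	status_write(0x37);
}
```

In this model the two approaches differ only when a bit outside 0x37 is set, for example one in the upper byte, which the byte-wide write cannot reach.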
ron minnich rminnich@lanl.gov writes:
I committed all this except for one thing:
This feels like something brought on by excess register pressure.
Ron one thing you did note was the changing of word accesses to byte accesses. With romcc that does not help in the case of register pressure. Usually doing the wrong size register accesses is a bad thing.
Eric
On 1 Dec 2003, Eric W. Biederman wrote:
Ron one thing you did note was the changing of word accesses to byte accesses. With romcc that does not help in the case of register pressure.
I would think it would hurt since x86 lets you use those little sub-registers (puddle arithmetic), so using bigger registers reduces the number of registers available.
ron
* ron minnich rminnich@lanl.gov [031202 17:13]:
On 1 Dec 2003, Eric W. Biederman wrote:
Ron one thing you did note was the changing of word accesses to byte accesses. With romcc that does not help in the case of register pressure.
I would think it would hurt since x86 lets you use those little sub-registers (puddle arithmetic), so using bigger registers reduces the number of registers available.
Yes, being able to use this from romcc would severely lower register pressure I assume. Neither romcc nor the code compiled with it takes care of this at the moment though.
Stefan
Stefan Reinauer stepan@suse.de writes:
* ron minnich rminnich@lanl.gov [031202 17:13]:
On 1 Dec 2003, Eric W. Biederman wrote:
Ron one thing you did note was the changing of word accesses to byte accesses. With romcc that does not help in the case of register pressure.
I would think it would hurt since x86 lets you use those little sub-registers (puddle arithmetic), so using bigger registers reduces the number of registers available.
Yes, being able to use this from romcc would severely lower register pressure I assume. Neither romcc nor the code compiled with it takes care of this at the moment though.
I tried this at one point. The problem is that there is not an instruction sequence to move to/from the byte registers from a normal 32bit register, which negates most of the benefit of the extra registers. 64bit mode on the Opteron gets byte registers right, but it no longer has more than one byte register per general purpose register.
Getting in support for mmx and sse registers was much more beneficial. 16 more instead of just 4.
A more general purpose technique is to use bit-fields. I am close to having bit-fields implemented in my backburner version of romcc. I have some really oddball ideas about bit-fields in 128 bit sse registers :) But who knows when I will get that done.
Bit-fields still share with the x86 byte registers the property of increasing the register pressure when you modify their values or read/write them (because the field needs a register of its own to be modified). But when they are just passed around they can nicely reduce the register pressure. In addition, they are under programmer control, so you know it is a trade-off between register pressure when using the value and register pressure when passing the values.
You can roll bit-fields by hand at the moment if you want though.
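A minimal sketch of what rolling bit-fields by hand can look like: several small values packed into one 32-bit word with shifts and masks, so that only one register is needed while the values are merely being passed around. The field layout and all names here are hypothetical, picked only to illustrate the technique, and are not taken from romcc or the LinuxBIOS tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: three 8-bit fields in one 32-bit word. */
#define FIELD_MASK  0xffu
#define DEV_SHIFT   0    /* bits  7..0  */
#define ADDR_SHIFT  8    /* bits 15..8  */
#define VAL_SHIFT   16   /* bits 23..16 */

/* Pack three byte-sized values into a single word. */
uint32_t pack(uint32_t dev, uint32_t addr, uint32_t val)
{
	return ((dev  & FIELD_MASK) << DEV_SHIFT)  |
	       ((addr & FIELD_MASK) << ADDR_SHIFT) |
	       ((val  & FIELD_MASK) << VAL_SHIFT);
}

/* Extract one field; this is where a second register becomes necessary. */
uint32_t field_get(uint32_t word, unsigned shift)
{
	return (word >> shift) & FIELD_MASK;
}

/* Replace one field, leaving the others untouched. */
uint32_t field_set(uint32_t word, unsigned shift, uint32_t v)
{
	return (word & ~(FIELD_MASK << shift)) | ((v & FIELD_MASK) << shift);
}

/* Small self-check of the packing arithmetic. */
void bitfield_selfcheck(void)
{
	uint32_t w = pack(0x50, 0x02, 0xab);
	assert(w == 0x00ab0250u);
	assert(field_get(w, ADDR_SHIFT) == 0x02);
}
```

The trade-off described above shows up directly: passing `w` costs one register, but each field_get or field_set needs a temporary for the extracted or inserted value.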
What I find most disturbing is that, last I looked, size crt0.o lists it at about 33K (after lowering spurious debugging messages from debug to spew), and linuxbios_payload.nrvb at about 24K. crt0.o from the p4dpr is at about 10K. So romcc is giving me a 3X code bloat... I am pretty certain it is code bloat caused by inlining everything.
Ron, you complained earlier about compile speed and I think romcc is the big culprit there. Its register allocator is currently using an O(N^2) data structure, so the more code it compiles the slower it gets... I think I saw another version of basically the same algorithm that uses a different data structure, which would make it much faster.
Right now the speed is tolerable when I remember to set #define DEBUG_CONSISTENCY 1 instead of 2, which I committed accidentally the other day. DEBUG_CONSISTENCY 2 is only really useful when debugging the register allocator. With a perfect compiler DEBUG_CONSISTENCY is not needed at all, but romcc is still teething, so if there is not a performance hit it is useful.
Eric
On 4 Dec 2003, Eric W. Biederman wrote:
Ron, you complained earlier about compile speed and I think romcc is the big culprit there. Its register allocator is currently using an O(N^2) data structure, so the more code it compiles the slower it gets... I think I saw another version of basically the same algorithm that uses a different data structure, which would make it much faster.
I don't mind the romcc compile speed at all, I just don't want to see us return to the days of tons of perl code in the Makefiles. We made deliberate changes to the new config tool to eliminate the need for that stuff. When I first cut over to the new config tool I was quite surprised at the difference in build time for linuxbios -- all those Perl invocations are expensive.
Romcc is fine by me, I don't care how fast it goes, it saves our necks.
ron
ron minnich wrote:
I just don't want to see us return to the days of tons of perl code in the Makefiles. We made deliberate changes to the new config tool to eliminate the need for that stuff. When I first cut over to the new config tool I was quite surprised at the difference in build time for linuxbios -- all those Perl invocations are expensive.
You realize you are talking about God's Language :-)). Anyway, I think as many sins could just as easily be committed in bash, I wouldn't necessarily hang it all on Perl. IMHO.
-Steve
On Fri, 5 Dec 2003, Steve Gehlbach wrote:
ron minnich wrote:
I just don't want to see us return to the days of tons of perl code in the Makefiles. We made deliberate changes to the new config tool to eliminate the need for that stuff. When I first cut over to the new config tool I was quite surprised at the difference in build time for linuxbios -- all those Perl invocations are expensive.
You realize you are talking about God's Language :-)). Anyway, I think as many sins could just as easily be committed in bash, I wouldn't necessarily hang it all on Perl. IMHO.
yes, we want to keep the bash invocations to a minimum too.
ron
On Thu, Dec 04, 2003 at 11:22:08PM -0700, Eric W. Biederman wrote:
I would think it would hurt since x86 lets you use those little sub-registers (puddle arithmetic), so using bigger registers reduces the number of registers available.
Yes, being able to use this from romcc would severely lower register pressure I assume. Neither romcc nor the code compiled with it takes care of this at the moment though.
I tried this at one point. The problem is that there is not an instruction sequence to move to/from the byte registers from a normal 32bit register.
Hmm, there's movzx and movsx for moving to 32 bit, but from 32 to 8 is worse for esi, edi and ebp. 32->16 works fine of course.
I could be missing something though. :)
//Peter
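For readers less familiar with the instructions mentioned here, the C conversions below are the source-level operations involved. On x86 a compiler will typically use movzx for the unsigned widening and movsx for the signed one, though the exact instruction choice is up to the compiler. The narrowing direction is the awkward one in 32-bit mode, because only eax, ebx, ecx and edx have low-byte names (al, bl, cl, dl), so a value held in esi, edi or ebp has no directly addressable low byte. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 8 -> 32 bits, zero-extended (typically movzx on x86). */
uint32_t widen_zero(uint8_t b)
{
	return b;
}

/* 8 -> 32 bits, sign-extended (typically movsx on x86). */
int32_t widen_sign(int8_t b)
{
	return b;
}

/* 32 -> 8 bits: keep only the low byte of the wider value. */
uint8_t narrow(uint32_t w)
{
	return (uint8_t)w;
}

/* Small self-check of the conversions. */
void widen_selfcheck(void)
{
	assert(widen_zero(0x80) == 0x80u);
	assert(widen_sign(-128) == -128);
	assert(narrow(0x12345680u) == 0x80);
}
```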
YhLu YhLu@tyan.com writes:
Ron,
Please check in the Tyan s2850/2880/2881/2882/4880 updates into the CVS Tree.
- northbridge/amd/amdk8/raminit.h: change uint8_t to uint16_t
This is quite reasonable.
- southbridge/amd/amd8111/amd8111_early_smbus.c: update smbus_write_byte
- southbridge/amd/amd8131/amd8131_bridge.c: update ioapic_anable to enable
PCI-X MASTER Mode.
Why is this necessary? Previously it has been the policy to not set the master enables any more than is necessary for proper operation of the devices. Unless I am mistaken not setting the master enables is very much in spec for pci devices.
- other in /src/mainboard/tyan/ and /targets/tyan
Stefan, with updates 1 and 2 you can get rid of FAKE_SPD_ROM. You need to change some lines in auto.c for quartet:
1. I2C HUB address: 0x30 --> 0x18
2. RC0 -> (1<<1)<<8, RC1 -> (1<<2)<<8, RC2 -> (1<<3)<<8, RC3 -> (1<<4)<<8
Looking at the diffstat output you also touched the quartet/Config.lb and removed the FAKE_SPD define. Has this been tested?
There is a context diff included in tyan/s2881/lyh.txt that does not look like it needs to be there.
And of course there were all kinds of binary roms that the diff did not include.
Eric
* YhLu YhLu@tyan.com [031202 02:05]:
With updates 1 and 2 you can get rid of FAKE_SPD_ROM. You need to change some lines in auto.c for quartet.
- I2C HUB address: 0x30 --> 0x18,
- RC0-> (1<<1)<<8, RC1-> (1<<2)<<8, RC2-> (1<<3)<<8, RC3-> (1<<4)<<8.
Tried this, doesn't work. I end up with 0KB memory detected on each CPU. Is it only my Quartets that behave like this?
Stefan