Marc Jones marcj303 at gmail.com
Mon Nov 30 23:17:20 CET 2009

On Fri, Nov 27, 2009 at 2:05 AM, Nathan Williams <nathan at traverse.com.au> wrote:
> Nathan Williams wrote:
>> Marc Jones wrote:
>>> On Tue, Nov 24, 2009 at 1:09 AM, Nathan Williams <nathan at traverse.com.au> wrote:
>>>> Marc Jones wrote:
>>>>> On Mon, Nov 23, 2009 at 12:27 AM, Nathan Williams
>>>>> <nathan at traverse.com.au> wrote:
>>>>>> I managed to get the commercial BIOS to boot on my board and diffed it with coreboot:
>>>>>> http://coreboot.pastebin.com/m39b22c21
>>>>>> The only differences I can see are related to interrupts, which shouldn't matter in relation to
>>>>>> my RAM problems.
>>>>>> I have also run a memtest86 with the commercial BIOS (from bootable CDROM) and as a payload in coreboot.
>>>>>> The commercial BIOS didn't have any errors, but my coreboot did.  So the hardware can't be too bad.
>>>>> That looks like just the southbridge cs5536 target. The memory
>>>>> differences would be in the processor geodelx target. Can you send
>>>>> those results?
>>>>> Marc
>>>> I did some new MSR dumps.
>>>> Diff:
>>>> ./msrtool -t geodelx -t cs5536 -d amd_ref_bios
>>>> http://coreboot.pastebin.com/m5e487f87
>>>> AMD NAS reference BIOS:
>>>> ./msrtool -t geodelx -t cs5536 -l -s amd_ref_bios
>>>> http://coreboot.pastebin.com/madc04ac
>>>> My Coreboot:
>>>> ./msrtool -t geodelx -t cs5536 -l -s nathan_bios
>>>> http://coreboot.pastebin.com/m7f35d855
>>>> The diffs I did today show some differences with GLCP_DELAY_CONTROLS.
>>>> Last time I added some code to force it to match the commercial BIOS
>>>> GLCP_DELAY_CONTROLS MSR, but it didn't seem to make any difference.
>>>> I also tested all the SODIMMS I have here (about 10) with the commercial BIOS.
>>>> Each time I did a msrtool diff to one I saved on disk.
>>>> Most are 333MHz, but 2 are 400MHz.  There weren't any changes to the MSRs.
>>>> Could there be an issue with the initialisation sequence that reading MSRs
>>>> after booting won't show?  Also, quite a few MSRs aren't defined in geodelx.c yet.
>>>> Are there any obvious ones that should be added in?
>>> --- AMD NAS reference BIOS
>>> +++ Nathan's coreboot v3
>>> #
>>> #
>>> -0x4c00000f 0x83f1_00aa_5696_0404
>>> +0x4c00000f 0x8271_005a_5696_0404
>>> It looks like coreboot and the ref bios detect different dimm
>>> configuration. This timing setup could be part of the instability (I
>>> don't think it explains the reset problem). Look at the code here:
>>> SetDelayControl(void) and anywhere else that GLCP_DELAY_CONTROLS gets
>>> set to see what might be happening. Make sure that MTest is disabled
>>> in the ref bios setup. This setting is based on the number of devices
>>> (load) on the DIMM.
>>> I didn't realize that so few registers were in the msr tool for
>>> geodelx. You should add these:
>>> 20000018h R/W Refresh and SDRAM Program (MC_CF07_DATA) 10071007_00000040h Page 227
>>> 20000019h R/W Timing and Mode Program (MC_CF8F_DATA) 18000008_287337A3h Page 229
>>> 2000001Ah R/W Feature Enables (MC_CF1017_DATA) 00000000_11080001h Page 231
>>> 2000001Bh RO Performance Counters (MC_CFPERF_CNT1) 00000000_00000000h Page 232
>>> 2000001Ch R/W Counter and CAS Control (MC_PERCNT2) 00000000_00FF00FFh Page 233
>>> 2000001Dh R/W Clocking and Debug (MC_CFCLK_DBUG) 00000000_00001300h Page 233
>>> 4C00000Fh R/W GLCP I/O Delay Controls (GLCP_DELAY_CONTROLS) 00000000_00000000h Page 549
>>> 4C000014h R/W GLCP System Reset and PLL Control (GLCP_SYS_RSTPLL) Bootstrap specific Page 554
>>> Marc
>> I've now added the MSRs and uploaded to pastebin:
>> http://coreboot.pastebin.com/m53aed60b
>> My coreboot:
>> http://coreboot.pastebin.com/md23bc6a
>> ./msrtool -d AMD_NAS:
>> http://coreboot.pastebin.com/m77663de5
>> Tomorrow I'll try the tests on the NAS hardware, instead of our own motherboards
>> just in case there are some hidden hardware issues.
>> Regards,
>> Nathan
> On the NAS reference board I got the following diff between coreboot
> and the commercial BIOS:
> http://coreboot.pastebin.com/m1353db1a
> As you can see there are a lot of latency differences.
> Unfortunately, I only realised later that the differences are because the
> bootstraps are set to bypass, which means coreboot uses 266 as the speed,
> whereas the commercial BIOS uses 333. So when I repeat the same on our
> boards, the only difference in the geodelx MSRs is:
> -0x2000001d 0x0000000000000000
> +0x2000001d 0x0000000000001000
> #    12 TRISTATE_DIS TRI-STATE Disable
> -0: Tri-stating enabled
> +1: Tri-stating disabled


I don't think the tri-state disable bit explains the problems you have
seen. Since the memory has the same settings, the problem must be
somewhere else. You will need to go back to the reboot path to
investigate. It seems like something in the reset isn't doing a
complete reset, which causes a problem with the cache disable.
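
For anyone comparing raw dumps by hand, a minimal sketch of how to see exactly which bits differ between two 64-bit MSR values follows. It is not part of msrtool; the helper names parse_msr and differing_bits are made up for illustration, and the values are copied from the diffs quoted in this thread.

```python
# Hypothetical helpers for comparing two MSR values bit by bit.
# Not part of msrtool; the names are illustrative only.

def parse_msr(text):
    """Parse a 64-bit MSR value as printed in the dumps,
    e.g. '0x83f1_00aa_5696_0404' or '0x0000000000001000'."""
    return int(text.replace("_", "").replace(" ", ""), 16)


def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [bit for bit in range(64) if (diff >> bit) & 1]


if __name__ == "__main__":
    # (register address, reference BIOS value, coreboot value) from the thread
    dumps = [
        ("0x2000001d", "0x0000000000000000", "0x0000000000001000"),
        ("0x4c00000f", "0x83f1_00aa_5696_0404", "0x8271_005a_5696_0404"),
    ]
    for addr, ref, cb in dumps:
        bits = differing_bits(parse_msr(ref), parse_msr(cb))
        print(addr, "differs in bits", bits)
```

For 0x2000001d this reports only bit 12, which matches the TRISTATE_DIS field shown in the msrtool diff.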



