Re: [coreboot] serengeti_cheetah_fam10: Erratas triple fault in SimNow

20 Mar 2008


Hi Bernhard,
As you can see we are still working on the Barcelona code. There is a 
lot to debug. I have not tested it against SimNow, only real hardware. I 
will work on adding SimNow to our (my) testing. The public version may 
not have some of these MSRs implemented. I will have to look into it 
more. I put some comments inline.
Bernhard Kaindl wrote:
...
Hi,
   Are you saying SimNow itself segfaults on you or is it coreboot
which triple-faults inside SimNow?
Maybe this is something different:
I recently investigated why
src/mainboard/amd/serengeti_cheetah_fam10/cache_as_ram_auto.c
causes triple-faults inside the publicly available SimNow
(the non-NDA version) and here is one of the causes:
    /* FIXME: Check CPU revision to apply correct erratas */
    /* Rev B errata */
    /* Errata #169 - supercedes errata #131 */
    msr = rdmsr(0xC001001F);
    msr.hi |= 1 << (32 - 32);
    wrmsr(0xC001101F, msr);


This sets a different bit in a different MSR than the one indicated
in Errata #169. To apply the errata as indicated in the public
document, a change like this is needed:
     msr = rdmsr(0xC001001F);
-    msr.hi |= 1 << (32 - 32);
+    msr.hi |= 1 << 32;
-    wrmsr(0xC001101F, msr);
+    wrmsr(0xC001001F, msr);


msr.hi is bits 63:32, msr.lo is 31:0. Your shift of 32 pushes the bit 
off the end so 0 is being or'd onto msr.hi.
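A minimal sketch of the hi/lo split described above, for illustration only. The struct is modeled on coreboot's msr_t (two 32-bit halves); the helper msr_set_bit is hypothetical and not part of coreboot. It shows that bit 32 of the 64-bit MSR is bit 0 of msr.hi, which is what the original `1 << (32 - 32)` expresses.

```c
#include <assert.h>
#include <stdint.h>

/* Modeled on coreboot's msr_t: a 64-bit MSR held as two 32-bit halves. */
typedef struct {
    uint32_t lo;  /* MSR bits 31:0  */
    uint32_t hi;  /* MSR bits 63:32 */
} msr_t;

/*
 * Hypothetical helper (not a coreboot function): set bit `bit` (0..63)
 * of the 64-bit MSR value, picking the correct half and adjusting the
 * shift count so it never reaches the width of the 32-bit field.
 */
static msr_t msr_set_bit(msr_t msr, unsigned int bit)
{
    assert(bit < 64);
    if (bit < 32)
        msr.lo |= (uint32_t)1 << bit;
    else
        msr.hi |= (uint32_t)1 << (bit - 32);
    return msr;
}
```

With such a helper, msr_set_bit(msr, 32) sets bit 0 of msr.hi and leaves msr.lo alone, so the bit number from the errata document can be used directly.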
...
The current code reads the correct MSR, sets a different bit
(bit 0 instead of 32), and writes the changed value to a private,
undocumented or even non-existing MSR, or maybe it's a typo.
Sadly, bit 32 of 0xC001001F is also undocumented AFAICS, but
Errata #169 says that it should be set. However, that errata
was later updated to suggest that a register in the northbridge
must also be changed, and I didn't find that part of the errata
in coreboot yet.
So, errata #169 has been removed from the latest document, so it will be 
coming out of the code anyway.
...
With that change (I guess it's a fix) SimNow executes this code
but triple-faults on the next errata implementation:
    /* Errata #202 [DIS_PIGGY_BACK_SCRUB]=1 */
    msr = rdmsr(0xC0011022);
    msr.hi |= 1 << 24;
    wrmsr(0xC0010022, msr);


Again, this applies the changed MSR value to a different MSR
which is also undocumented or even non-existing (or a typo). I also did
not manage to find any information on Errata #202, so I guess
it applies to AMD engineering samples only?
I have no suggestion on how to fix that part as I could not
find any documentation on it.
...
This errata has also been removed so we will remove it from coreboot.
...
I also think that errata which are not essential in the very
earliest boot stage should not necessarily reside inside the
mainboard-specific cache_as_ram_auto.c but should be moved to
a place in the compressed coreboot code where different boards
can share errata implementations for the CPUs which they support.
It would be ideal if this could be in the compressed code, but it is very 
difficult to tell if an errata will be hit early in initialization. 
Also, many errata require a soft reset to take effect. As the comments 
note, I really didn't want that code in cache_as_ram_auto.c and I am 
working on moving it to the generic CPU code.
...
When everything is set up, exceptions from wrmsr could also be
handled better (I guess) than causing triple faults. Linux has
wrmsr() functions with exception handling in include/asm-x86/msr.h
which give a proper return code and do not crash the code.
There really isn't any exception handling in coreboot but it would be 
something to consider.
...
NetBSD has a very nice structure for that in place, in which you
can enter errata simply by adding an entry to a table that
specifies which errata shall be applied for which CPU:
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/x86/x86/errata.c?rev=1.13&a...
Yes, the new code I am working on is table and revision driven.
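A rough sketch of what a table- and revision-driven errata pass could look like, offered only as an illustration and not as the actual coreboot or NetBSD code. Everything here is an assumption made for the example: the revision masks, the errata numbers 9001/9002, the MSR indexes 0x1000/0x1001, and the in-memory mock rdmsr/wrmsr that stands in for the real instructions so the sketch can run in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t lo, hi; } msr_t;

/* Hypothetical CPU revision masks, one bit per revision. */
#define REV_A (1u << 0)
#define REV_B (1u << 1)
#define REV_C (1u << 2)

/* Mock MSR storage (made-up indexes) replacing the real instructions. */
#define MOCK_MSR_COUNT 2
static const uint32_t mock_index[MOCK_MSR_COUNT] = { 0x1000, 0x1001 };
static msr_t mock_value[MOCK_MSR_COUNT];

static msr_t *mock_lookup(uint32_t index)
{
    for (size_t i = 0; i < MOCK_MSR_COUNT; i++)
        if (mock_index[i] == index)
            return &mock_value[i];
    return NULL;
}

static msr_t rdmsr(uint32_t index)
{
    msr_t *m = mock_lookup(index);
    assert(m != NULL);  /* real hardware would fault on an unknown MSR */
    return *m;
}

static void wrmsr(uint32_t index, msr_t value)
{
    msr_t *m = mock_lookup(index);
    assert(m != NULL);
    *m = value;
}

/* One table entry: which revisions need the fix and which bits to set. */
struct errata_fixup {
    unsigned int id;   /* errata number (placeholder values below) */
    uint32_t revs;     /* mask of affected revisions */
    uint32_t msr;      /* MSR index, used for both read and write */
    uint32_t or_lo;    /* bits to set in MSR bits 31:0  */
    uint32_t or_hi;    /* bits to set in MSR bits 63:32 */
};

static const struct errata_fixup fixups[] = {
    { 9001, REV_B,         0x1000, 0x00000000, 0x00000001 },
    { 9002, REV_B | REV_C, 0x1001, 0x01000000, 0x00000000 },
};

/* Apply every fixup matching cpu_rev; returns how many were applied. */
static unsigned int apply_errata(uint32_t cpu_rev)
{
    unsigned int applied = 0;

    for (size_t i = 0; i < sizeof(fixups) / sizeof(fixups[0]); i++) {
        const struct errata_fixup *f = &fixups[i];

        if (!(f->revs & cpu_rev))
            continue;
        msr_t msr = rdmsr(f->msr);
        msr.lo |= f->or_lo;
        msr.hi |= f->or_hi;
        wrmsr(f->msr, msr);
        applied++;
    }
    return applied;
}
```

Because each entry carries a single MSR index that is used for both the read and the write, the kind of read-one/write-another mismatch discussed above cannot be expressed in the table at all.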
Thanks,
Marc
...
Bernhard
On Thu, 20 Mar 2008, Marc Karasek wrote:
...
I have gotten some cycles and they removed our proxy server
(hurrah!!!).  So I recompiled the BIOS for SimNow using buildrom on my
test machine.
When I went to run SimNow, it seg faults.  I tried it with the default
BIOS image for the Cheetah BSD, and I also tried one of the other BSDs.  All
of them seg fault. :-(
I made the mistake of updating Fedora8_64 with the latest RPMs.   Lesson
learned, if it ain't broke don't fix it...
Does anyone have any idea what, I am guessing, package could be causing
this?  I have tried with both kernels that are on the machine, with no
success.  It did work at one point,  before the update.  I can nuke the
box and reinstall f8_64, but would rather not.
--

Marc Karasek
MTS
Sun Microsystems
mailto:marc.karasek@sun.com
ph:770.360.6415

-- 
Marc Jones
Senior Firmware Engineer
(970) 226-9684 Office
mailto:Marc.Jones@amd.com
http://www.amd.com/embeddedprocessors
