unstable AMD Fam10h boot


Ralf Grosse Boerger

1 Sep 2009

9:15 p.m.

Hi,

this is a reply to the following message: http://www.coreboot.org/pipermail/coreboot/2009-August/051629.html [I am not subscribed to this list.]

The sporadic boot problems ("FIXME! CPU Version unknown or not supported!") are caused by a race condition in Get_NB32().

This function performs a read operation to the PCI configuration space via port CF8/CFC.

    u32 Get_NB32(u32 dev, u32 reg)
    {
        u32 addr;

        addr = (dev>>4) | (reg & 0xFF) | ((reg & 0xf00)<<16);
        outl((1<<31) | (addr & ~3), 0xcf8);

        return inl(0xcfc);
    }

As ports CF8/CFC are shared across cores (maybe even sockets?) concurrent accesses from different cores may yield random results.

This race condition is not limited to mctGetLogicalCPUID(), but should affect any PCI configuration space access. A real bugfix requires some sort of mutex, but mutexes are difficult to implement (this was already discussed on this list).
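To make the failure mode concrete, here is a small host-side sketch that replays one bad interleaving of the two-step "write CF8, then read CFC" sequence. Nothing here touches real I/O ports: `sim_write_cf8()`, `sim_read_cfc()` and `racy_read_seen_by_core_a()` are hypothetical stand-ins, and the fake device simply returns the latched address so that a misdirected read is easy to spot.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated index register behind port 0xCF8. The point of the sketch is
 * that there is one such register, not one per core. */
static uint32_t sim_cf8;

/* Stand-in for outl(value, 0xcf8): latch the config address. */
static void sim_write_cf8(uint32_t value)
{
    assert(value & 0x80000000u); /* enable bit must be set for a config cycle */
    sim_cf8 = value;
}

/* Stand-in for inl(0xcfc): the fake device returns the latched address
 * itself, so a read that hits the wrong register is easy to detect. */
static uint32_t sim_read_cfc(void)
{
    return sim_cf8;
}

/* Replay one bad interleaving of two cores that each do
 * "write CF8, read CFC". Returns the value core A ends up reading. */
static uint32_t racy_read_seen_by_core_a(uint32_t addr_a, uint32_t addr_b)
{
    sim_write_cf8(addr_a);  /* core A selects its register           */
    sim_write_cf8(addr_b);  /* core B selects another before A reads */
    return sim_read_cfc();  /* core A now reads core B's register    */
}
```

In this replay core A gets back the contents of the register core B selected, which is the kind of wrong answer that could make a CPU ID lookup fail sporadically.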

If the cores are started sequentially, concurrent PCI accesses can be avoided. I'll post example code as soon as I find some time...

Best Regards Ralf


ron minnich

4 Sep

4:29 p.m.

On Tue, Sep 1, 2009 at 2:15 PM, Ralf Grosse Boerger ralfgb@gmail.com wrote:

...

Hi,

this is a reply to the following message: http://www.coreboot.org/pipermail/coreboot/2009-August/051629.html [I am not subscribed to this list.]

The sporadic boot problems ("FIXME! CPU Version unknown or not supported!") are caused by a race condition in Get_NB32().

This function performs a read operation to the PCI configuration space via port CF8/CFC.

    u32 Get_NB32(u32 dev, u32 reg)
    {
        u32 addr;

        addr = (dev>>4) | (reg & 0xFF) | ((reg & 0xf00)<<16);
        outl((1<<31) | (addr & ~3), 0xcf8);

        return inl(0xcfc);
    }

As ports CF8/CFC are shared across cores (maybe even sockets?) concurrent accesses from different cores may yield random results.

I would be surprised were they shared across sockets but ... I'm realizing I have no clue how config cycles work on Opteron. I just assumed this cf8/cfc cycle was magically converted inside the cpu into an HT cycle of some sort, and that cycle was routed via the config space maps in the NB. But ... can someone inform me on how this really works? Is my picture even close?

Is there some better way on fam10 to do config cycles that is more multi-core friendly? It seems odd that we are still locked into this cf8/cfc stuff.

ron

Myles Watson

4:57 p.m.

...

...
As ports CF8/CFC are shared across cores (maybe even sockets?) concurrent accesses from different cores may yield random results.

I would be surprised were they shared across sockets but ... I'm realizing I have no clue how config cycles work on Opteron. I just assumed this cf8/cfc cycle was magically converted inside the cpu into an HT cycle of some sort, and that cycle was routed via the config space maps in the NB. But ... can someone inform me on how this really works? Is my picture even close?

I don't know how the conversion works exactly, or where it takes place, but the HT packet is a read or a write to 0xFD.FE00.0000 + an offset for the UnitID (PCI device number). So, for device 7 on the bus and config register 0x14, you get 0xFDFE003814. You don't have to worry about bus numbers because they get taken care of based on the HT chain to which the packet is routed.
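The arithmetic behind that example can be written out as a short sketch. The base address and the device shift of 11 bits follow from the numbers in the message (7 << 11 is 0x3800, plus 0x14 gives 0x3814). The function number in bits 10:8 is an assumption based on the usual PCI type 0 layout, and `ht_config_addr()` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Base of the address range reserved for config cycles, as quoted in the
 * message above (0xFD.FE00.0000). */
#define HT_CFG_BASE 0xFDFE000000ULL

/* Build the address for a config access to a device on the chain.
 * dev: device number (UnitID), fn: function (assumed bits 10:8),
 * reg: register offset within the 256-byte config space. */
static uint64_t ht_config_addr(unsigned dev, unsigned fn, unsigned reg)
{
    assert(dev < 32 && fn < 8 && reg < 0x100);
    return HT_CFG_BASE | ((uint64_t)dev << 11) | ((uint64_t)fn << 8) | reg;
}
```

Calling `ht_config_addr(7, 0, 0x14)` reproduces the 0xFDFE003814 figure from the message.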

Based on that, I would say it's not shared across sockets, but it definitely could be shared across cores.

...

Is there some better way on fam10 to do config cycles that is more multi-core friendly? It seems odd that we are still locked into this cf8/cfc stuff.

Isn't there a way to do MMCONF cycles from the NB?

Myles

ron minnich

5:06 p.m.

On Fri, Sep 4, 2009 at 9:57 AM, Myles Watson mylesgw@gmail.com wrote:

...

...
...
As ports CF8/CFC are shared across cores (maybe even sockets?) concurrent accesses from different cores may yield random results.

I would be surprised were they shared across sockets but ... I'm realizing I have no clue how config cycles work on Opteron. I just assumed this cf8/cfc cycle was magically converted inside the cpu into an HT cycle of some sort, and that cycle was routed via the config space maps in the NB. But ... can someone inform me on how this really works? Is my picture even close?

I don't know how the conversion works exactly, or where it takes place, but the HT packet is a read or a write to 0xFD.FE00.0000 + an offset for the UnitID(pci device number). So, for device 7 on the bus and config register 0x14, you get 0xFDFE003814. You don't have to worry about bus numbers because they get taken care of based on the HT chain the to which the packet is routed.

This is a memory read/write?

...

Based on that, I would say it's not shared across sockets, but it definitely could be shared across cores.

...

Isn't there a way to do MMCONF cycles from the NB?

Marc? If there were, it would make sense to convert the code to use these, instead of trying to make cf8/cfc SMP-safe.

ron

Myles Watson

5:10 p.m.

...

...
I don't know how the conversion works exactly, or where it takes place, but the HT packet is a read or a write to 0xFD.FE00.0000 + an offset for the UnitID (PCI device number). So, for device 7 on the bus and config register 0x14, you get 0xFDFE003814. You don't have to worry about bus numbers because they get taken care of based on the HT chain to which the packet is routed.

This is a memory read/write?

Yes. It's just to the reserved addresses for config space.

Myles

Rudolf Marek

5:07 p.m.

...

Isn't there a way to do MMCONF cycles from the NB?

No, I think only newer fam10h cpus can do that.

http://lkml.org/lkml/2007/12/21/134

So, the K8 internal devices are not accessible with the mmconf aperture which is NB dependent anyway. That post says that it may work for newer fam10h cpus. I think this is related: F1x[EC:E0] Configuration Map Registers and DisCohLdtCfg: disable coherent link configuration accesses.

I think Linux has a test that compares which devices can be accessed through type 1 with what it sees on MMCONF.

http://www.x86-64.org/pipermail/discuss/2005-December/007371.html

I think the only proper way is to do type1 and maybe some locking is necessary.

Rudolf

Rudolf Marek

5:20 p.m.

Hi all,

I think I just found an answer:

2.11 Configuration Space

This BKDG chapter suggests that there is an MSR which can be used to do MMIO accesses.

MSRC001_0058 MMIO Configuration Base Address Register

Note that all cores should have this addr. Also it seems that this is what should be used in ACPI instead of the NB MCFG aperture.
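For readers unfamiliar with memory-mapped config access, the sketch below shows how an address inside such an aperture is usually formed. It assumes the standard PCI Express enhanced configuration layout (4 KB per function); the base value passed in is whatever the MSR was programmed with, and `mmconf_addr()` is a hypothetical helper, not a coreboot function.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the memory address of a config register inside an MMIO
 * configuration aperture, assuming the standard enhanced layout:
 * bus in bits 27:20, device in 19:15, function in 14:12, register in 11:0. */
static uint64_t mmconf_addr(uint64_t base, unsigned bus, unsigned dev,
                            unsigned fn, unsigned reg)
{
    assert(bus < 256 && dev < 32 && fn < 8 && reg < 0x1000);
    return base + (((uint64_t)bus << 20) | ((uint64_t)dev << 15) |
                   ((uint64_t)fn << 12) | reg);
}
```

Because each access is a single load or store to a distinct address, there is no shared index register for two cores to fight over, which is the property the thread is after.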

Rudolf

ron minnich

5:20 p.m.

On Fri, Sep 4, 2009 at 10:07 AM, Rudolf Marek r.marek@assembler.cz wrote:

...

I think the only proper way is to do type1 and maybe some locking is necessary.

but our thread subject is "unstable fam10h". Given that fam10h is the problem, and that it supports MMCONF, why not make new versions of the functions for processors that have MMCONF and use them on those processors?

We've never seen this kind of problem on K8 AFAIK. We can continue to use the old functions on those old CPUs.

So, what I'm trying to say:
- we have a problem on fam10h
- it seems to be a non-smp-safe function doing a config cycle
- there are two ways to eliminate the problem
  o write a fam10 version of that function that will use MMCONF (will work on all later CPUs)
  o modify old function by adding a lock (i.e. stick with legacy mechanism for older CPUs)

I just can't see a good reason to stick with the type 1 access when the fam10h and, presumably all later families, will support MMCONF. The cf8/cfc is a 15-year-old idea (at least) that predates smp and multicore. We should be trying to eliminate that old mechanism whenever we can (at least it seems that way to me). It is the cf8/cfc mechanism that is the problem, not the lack of locking.
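For comparison, the second option (keeping cf8/cfc and adding a lock) could look roughly like the host-side sketch below. It uses a C11 `atomic_flag` as a spinlock and simulated ports; `sim_index`, `sim_data()` and `locked_config_read()` are hypothetical names, and whether a lock like this can even be placed somewhere usable while running in CAR is exactly the difficulty raised earlier in the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static atomic_flag cfg_lock = ATOMIC_FLAG_INIT;
static uint32_t sim_index;              /* stand-in for port 0xCF8 */

/* Stand-in for inl(0xcfc): fake register contents derived from the index. */
static uint32_t sim_data(void)
{
    return ~sim_index;
}

/* Hold the lock across BOTH steps so that no other core can change the
 * index between the address write and the data read. */
static uint32_t locked_config_read(uint32_t addr)
{
    uint32_t value;

    assert((addr & 3u) == 0);           /* dword-aligned register offset */
    while (atomic_flag_test_and_set_explicit(&cfg_lock, memory_order_acquire))
        ;                               /* spin until the other core is done */
    sim_index = 0x80000000u | addr;
    value = sim_data();
    atomic_flag_clear_explicit(&cfg_lock, memory_order_release);
    return value;
}
```

The lock has to cover the whole index-plus-data pair, which is why every caller of the legacy mechanism would need to go through the same wrapper for this to help.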

ron

Stefan Reinauer

5 Sep

5:36 p.m.

ron minnich wrote:

...

So, what I'm trying to say:
- we have a problem on fam10h
- it seems to be a non-smp-safe function doing a config cycle
- there are two ways to eliminate the problem
  o write a fam10 version of that function that will use MMCONF (will work on all later CPUs)
  o modify old function by adding a lock (i.e. stick with legacy mechanism for older CPUs)

Another idea would be to get rid of SMP setup in CAR stage. It sounds highly funky to me anyways.

- Why are we doing this anyways?
  o Is there a reason?
  o No other SMP system except K10 does this.
- How many ms do we benefit from that? (Honest question). Any at all?

Stefan

ron minnich

5:46 p.m.

On Sat, Sep 5, 2009 at 10:36 AM, Stefan Reinauer stepan@coresystems.de wrote:

...

Another idea would be to get rid of SMP setup in CAR stage. It sounds highly funky to me anyways.

Why are we doing this anyways?

o Is there a reason? o No other SMP system except K10 does this.

How many ms do we benefit from that? (Honest question). Any at all?

This may fix one problem, but it does not fix the general problem: using cf8/cfc is not going to be safe on multiple cores, from my understanding.

ron

Ward Vandewege

6:33 p.m.

On Sat, Sep 05, 2009 at 10:46:57AM -0700, ron minnich wrote:

...

On Sat, Sep 5, 2009 at 10:36 AM, Stefan Reinauerstepan@coresystems.de wrote:

...
Another idea would be to get rid of SMP setup in CAR stage. It sounds highly funky to me anyways.

Why are we doing this anyways?

o Is there a reason? o No other SMP system except K10 does this.

How many ms do we benefit from that? (Honest question). Any at all?

This may fix one problem, but it does not fix the general problem: using cf8/cfc is not going to be safe on multiple cores, from my understanding.

Not to complicate matters even further, but since we are talking about locking - will any of this improve the 'many cores talking to serial at once' problem?

Thanks, Ward.

-- Ward Vandewege ward@fsf.org Free Software Foundation - Senior Systems Administrator

Stefan Reinauer

7:13 p.m.

Ward Vandewege wrote:

...

On Sat, Sep 05, 2009 at 10:46:57AM -0700, ron minnich wrote:

...
On Sat, Sep 5, 2009 at 10:36 AM, Stefan Reinauerstepan@coresystems.de wrote:

...
Another idea would be to get rid of SMP setup in CAR stage. It sounds highly funky to me anyways.

Why are we doing this anyways? o Is there a reason? o No other SMP system except K10 does this.

How many ms do we benefit from that? (Honest question). Any at all?

This may fix one problem, but it does not fix the general problem: using cf8/cfc is not going to be safe on multiple cores, from my understanding.

Not to complicate matters even further, but since we are talking about locking - will any of this improve the 'many cores talking to serial at once' problem?

Yes, going non-parallel in CAR would. Finding a way to do locking via a (memory mapped) chipset register, would make it possible to fix that, too. With a lot of work. Just going MMCONF would not fix the printk thing.

ron minnich

6 Sep

5 a.m.

there's at least three problems here :-)
1. do we want to do SMP of any kind in CAR? I think not, but I'd like to hear Marc's opinion
2. There are techniques from the oldest days of PCI (cf8/cfc) that we use that don't work well in a multicore world. MMCONF, it seems, can help here. Using MMCONF for config cycles may only work for newer CPUs, but we really ought to move to something that doesn't require a shared resource such as cf8/cfc for more modern CPUs.
3. we need smp style locking code for printk.

Any more :-)

ron

Stefan Reinauer

8:58 a.m.

ron minnich wrote:

...

There are techniques from the oldest days of PCI (cf8/cfc) that we use that don't work well in a multicore world.

Since there is only one set of PCI devices, I wonder what the benefit would be to "penetrate" them from all CPU cores or what would even cause that as a requirement.

...

we need smp style locking code for printk.

We have. But not in raminit.

Stefan

Peter Stuge

12:39 p.m.

Stefan Reinauer wrote:

...

Since there is only one set of PCI devices, I wonder what the benefit would be to "penetrate" them from all CPU cores or what would even cause that as a requirement.

RAM init working concurrently with multiple memory controllers - they're on PCI, right?

//Peter

Stefan Reinauer

7:40 p.m.

Peter Stuge wrote:

...

Stefan Reinauer wrote:

...
Since there is only one set of PCI devices, I wonder what the benefit would be to "penetrate" them from all CPU cores or what would even cause that as a requirement.

RAM init working concurrently with multiple memory controllers - they're on PCI, right?

Yes.

But why would it not be completely sufficient to set up all ram controllers in the system from the BSP?

Or, put differently.

We're smart and we fix the PCI problem. Then we suddenly notice that that is not enough, because a PCI operation is by no means an atomic operation. We're going to have to add another layer of locking on top of that, for example for SMBUS access, which might involve PCI access.

I'm saying we're opening a can of worms here, and unless we really like to go fishing we should close it again and walk in the dry.

Stefan

Peter Stuge

8:13 p.m.

Stefan Reinauer wrote:

...

...
RAM init working concurrently with multiple memory controllers - they're on PCI, right?

Yes.

But why would it not be completely sufficient to set up all ram controllers in the system from the BSP?

The big coreboot SMP win is with ECC scrubbing, right? Does that involve some PCI config space accesses to the memory controllers? Is it simpler to create locking for PCI accesses, or to split out the part of RAM init which sets up MCs into the BSP (as opposed to keeping the code for each MC running on that same core)?

...

I'm saying we're opening a can of worms here, and unless we really like to go fishing we should close it again and walk in the dry.

I'm thinking we're already hooked on that tasty fish? (By fish I mean SMP memory init.)

//Peter

ron minnich

10:32 p.m.

The way I see it the memory setup and SMP support in CAR are two very different issues.

BSP can do its own memory. Once that memory is set up the APs can use it. Thus, the APs can have working memory when they do their RAM setup. In other words:
- BSP does RAM setup in CAR
- APs can do RAM setup with working RAM -- they just use the BSP RAM, which is working.

The K8 code hints of this model, and, when I did my trial code for V3, this is how I set it up to work.

Hence, we can do SMP memory setup as long as the BSP sets up its own memory before it starts up the APs. We are really talking about SMP in CAR, which seems like a much harder issue.

Make sense? Something I'm missing?

ron

ebiederm＠xmission.com

8 Sep

1 p.m.

ron minnich rminnich@gmail.com writes:

...

The way I see it the memory setup and SMP support in CAR are two very different issues.

BSP can do its own memory. Once that memory is set up the APs can use it. Thus, the APs can have working memory when they do their RAM setup. In other words, BSP does RAM setup in CAR APs can do RAM setup with working RAM -- they just use the BSP RAM, which is working.

The K8 code hints of this model, and, when I did my trial code for V3, this is how I set it up to work.

Hence, we can do SMP memory setup as long as the BSP sets up its own memory before it starts up the APs. We are really talking about SMP in CAR, which seems like a much harder issue.

Make sense? Something I'm missing?

Long ago and far away. When I did the K8 code here is what I recall of my reasoning.

The only operation that benefited from being parallel was clearing ECC memory so it had consistent ECC bits. Everything else works just dandy from the BSP, and in fact, because of the way the K8 memory layout works, you have to do all of the heavy lifting on a single cpu so that you can place all of memory into one nice area for the mtrrs and the like.

If the K10 has gotten as far as true cpu hotplug support things may be more decoupled now, but I would be surprised if that mattered in any real configuration.

The only thing that I ever had the other cpus starting earlier for, and this was pretty fundamental, was to assign them their local apic ids and put them to sleep. After making that code work I never put a print statement in there or did anything fancy. There is just nothing in there to make parallelism any more than a nuisance.

A big chunk of what has to happen very early is setting up hypertransport and enabling routing between the cpus. As I recall, at some point at the end of setting up hypertransport routing the secondary cpus all come online.

With the K7 AMD actually had a model where both of the cpus started booting at once and you read a northbridge register to see which one should be primary: the first read of that register returned 0 and all subsequent reads returned 1 (or vice versa). If you didn't read that register first you got to sleep. The K8 had a very similar model except only one processor was ever connected up as a bootstrap processor in practice, and if you aren't connected up as a bootstrap processor you sleep until the bootstrap processor sets up your hypertransport.
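The read-to-claim scheme described for the K7 can be modelled in a few lines. This is only a simulation of the idea: `sim_read_claim_register()` and `i_am_bootstrap()` are hypothetical names, and the polarity (0 for the winner) is an assumption, since the message itself allows that it may have been the other way round.

```c
#include <assert.h>
#include <stdbool.h>

/* State of the simulated northbridge register: has anyone read it yet? */
static bool claim_taken;

/* The read itself has a side effect: the first reader gets 0, every later
 * reader gets 1. On real hardware the chipset would arbitrate this. */
static unsigned sim_read_claim_register(void)
{
    unsigned value = claim_taken ? 1u : 0u;

    claim_taken = true;
    assert(value <= 1u);
    return value;
}

/* Each CPU calls this once early in boot; exactly one caller wins and
 * becomes the bootstrap processor, the others go to sleep. */
static bool i_am_bootstrap(void)
{
    return sim_read_claim_register() == 0u;
}
```

The election needs no software lock because the arbitration lives in the register, which is also why it only answers one question: who goes first.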

SMP in coreboot (except where required) is a bad idea. There are no performance wins (unless you need to initialize memory with writes) and it is an unnecessary complication. So konk the other cpus on the head as quickly as you can and go single processor.

Eric

Marc Jones

5:02 p.m.

On Sun, Sep 6, 2009 at 4:32 PM, ron minnich rminnich@gmail.com wrote:

...

The way I see it the memory setup and SMP support in CAR are two very different issues.

This bug is totally my fault...

Yes, memory setup and SMP CAR are two different issues. The SMP setup that happens during CAR is there to set up microcode, HT and FIDVID prior to the PLL reset and memory setup.

All the SMP PCI config space access should be MMIO. It is the first thing that is enabled in CPU init in set_pci_mmio_conf_reg().

The bug is that I mixed a mem setup function in with SMP setup by using mctGetLogicalCPUID(), which uses Get_NB32(). As pointed out, Get_NB32() is a cf8/cfc function. The mct code ported from AGESA assumes that it is running on the BSP only and uses cf8/cfc... (historical k8 bug I think)

I think that I should change the mct PCI config functions to call the coreboot pci_read_config32 functions that handle MMIO vs cfc/cf8 nicely. This should future-proof mct functions in CAR and is a step toward SMP memory setup.

Some of that mct PCI config space code is a little funny (ok, a lot funny), so it will take a little care. I should be able to work up a patch in a couple of days.

Marc

-- http://marcjonesconsulting.com

Marc Jones

14 Sep

12:46 a.m.

On Tue, Sep 8, 2009 at 11:02 AM, Marc Jones marcj303@gmail.com wrote:

...

On Sun, Sep 6, 2009 at 4:32 PM, ron minnichrminnich@gmail.com wrote:

...
The way I see it the memory setup and SMP support in CAR are two very different issues.

This bug is totally my fault...

Yes, Memory setup and SMP CAR are two different issues. The SMP setup happens during CAR is to setup microcode, HT and FIDVID prior to the PLL reset and memory setup.

All the SMP PCI config space access should be MMIO. It is the first thing that is enabled in CPU init in set_pci_mmio_conf_reg().

The bug is that I mixed a mem setup function in with SMP setup by using mctGetLogicalCPUID() which uses Get_NB32. As pointed out, the GET_NB32 is a cf8/cfc function. The mct code ported from AGESA assumes that it is running on the BSP only and uses cf8/cfc..... (historical k8 bug I think)

I think that I should change the mct PCI config functions to call the coreboot pci_read_config32 functions that handle MMIO vs cfc/cf8 nicely. This should future proof mct functions in CAR and a step toward SMP memory setup.

Some of that mct code PCI config space code is a little funny (ok, a lot funny), so it will take a little care. I should be able work patch in a couple of days.

Here is a patch that fixes the cf8 config access. Not complicated like I initially recalled. Thanks to Ralf for pointing out the bug.

This needs testing. Anyone?

Signed-off-by: Marc Jones marcj303@gmail.com

Thanks, Marc

-- http://marcjonesconsulting.com

Ward Vandewege

2:13 a.m.

On Sun, Sep 13, 2009 at 06:46:38PM -0600, Marc Jones wrote:

...

Here is a patch that fixes the cf8 config access. Not complicated like I initially recalled. Thanks to Ralf for pointing out the bug.

This needs testing. Anyone?

We've got a couple H8DME/fam10 boxes coming this week, so I should be able to test this in a couple days. Will do as soon as we get the hardware.

Thanks, Ward.

-- Ward Vandewege ward@fsf.org Free Software Foundation - Senior Systems Administrator

Marc Jones

4:45 p.m.

I am not sure what I was thinking last night. This is really simple... There is no address manipulation to be done before calling the coreboot pci functions. Thanks Patrick....
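The patch itself is not reproduced in this archive. As a rough guess at the shape being described (Get_NB32() passing its arguments straight through to the coreboot PCI read function, with no address arithmetic of its own), a self-contained sketch might look like this. `stub_pci_read_config32()` is a hypothetical stand-in for coreboot's pci_read_config32() so that the example compiles on its own, and the value it returns is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for coreboot's pci_read_config32(). The real
 * function chooses between MMIO and cf8/cfc access; this stub only
 * returns a deterministic value so the wrapper can be exercised. */
static uint32_t stub_pci_read_config32(uint32_t dev, uint32_t where)
{
    assert(where < 0x1000u);     /* extended config space is 4 KB */
    return (dev << 12) ^ where;  /* arbitrary fake register contents */
}

/* The wrapper keeps the AGESA-style name but no longer builds a CF8
 * address word itself: dev and reg are handed over unchanged. */
static uint32_t Get_NB32(uint32_t dev, uint32_t reg)
{
    return stub_pci_read_config32(dev, reg);
}
```

With the wrapper reduced to a pass-through, the access method is decided in one place rather than inside the mct code.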

Marc

-- http://marcjonesconsulting.com

Myles Watson

4:49 p.m.

On Mon, Sep 14, 2009 at 10:45 AM, Marc Jones marcj303@gmail.com wrote:

...

I am not sure what I was thinking last night. This is really simple... There is no address manipulation to be done before calling the coreboot pci functions. Thanks Patrick....

It looks like GetNB and SetNB should die. Is there a purpose for having the extra indirection?

Thanks, Myles

Peter Stuge

4:52 p.m.

Myles Watson wrote:

...

It looks like GetNB and SetNB should die.

+1

//Peter

Marc Jones

4:58 p.m.

On Mon, Sep 14, 2009 at 10:52 AM, Peter Stuge peter@stuge.se wrote:

...

Myles Watson wrote:

...
It looks like GetNB and SetNB should die.

+1

I guess it should. The only reason to keep it is for easily diffing against the AMD AGESA reference code, but I am not certain if or when that will happen again.

Marc

-- http://marcjonesconsulting.com

Patrick Georgi

4:50 p.m.

Am Montag, den 14.09.2009, 10:45 -0600 schrieb Marc Jones:

...

I am not sure what I was thinking last night. This is really simple... There is no address manipulation to be done before calling the coreboot pci functions. Thanks Patrick....

The board boots much more reliably now, thank you!

Acked-by: Patrick Georgi patrick.georgi@coresystems.de

Marc Jones

5 p.m.

On Mon, Sep 14, 2009 at 10:50 AM, Patrick Georgi patrick@georgi-clan.de wrote:

...

Am Montag, den 14.09.2009, 10:45 -0600 schrieb Marc Jones:

...
I am not sure what I was thinking last night. This is really simple... There is no address manipulation to be done before calling the coreboot pci functions. Thanks Patrick....

The board boots much more reliable now, thank you!

Acked-by: Patrick Georgi patrick.georgi@coresystems.de

Thanks Patrick.

r4633

-- http://marcjonesconsulting.com

Bao, Zheng

15 Sep

1:51 a.m.

Are you sure the pci functions will cover the case that the address is more than 0x100?

Zheng

-----Original Message-----
From: coreboot-bounces@coreboot.org [mailto:coreboot-bounces@coreboot.org] On Behalf Of Marc Jones
Sent: Tuesday, September 15, 2009 12:45 AM
To: ron minnich
Cc: coreboot@coreboot.org
Subject: Re: [coreboot] unstable AMD Fam10h boot

I am not sure what I was thinking last night. This is really simple... There is no address manipulation to be done before calling the coreboot pci functions. Thanks Patrick....

Marc

-- http://marcjonesconsulting.com

Marc Jones

4:10 p.m.

On Mon, Sep 14, 2009 at 7:51 PM, Bao, Zheng Zheng.Bao@amd.com wrote:

...

Are you sure the pci functions will cover the case that the address is more than 0x100?

It should, unless you know something I don't (bug?). Using the MMIO config access is the preferred method since it enforces the ordering. See 2.11 Configuration Space in the BKDG.
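The concern about offsets of 0x100 and above comes down to field width. A plain type 1 CF8 address word has an 8-bit register field, while a memory-mapped access carries a 12-bit offset. The two helpers below are hypothetical and only mask an offset the way each field would hold it; the Get_NB32() quoted at the top of the thread works around the narrow field by placing reg bits 11:8 into bits 27:24 of the CF8 word, which this sketch deliberately leaves out.

```c
#include <assert.h>
#include <stdint.h>

/* Register bits that fit in the plain type 1 (CF8) register field:
 * bits 7:2, dword aligned. Anything at 0x100 or above is lost. */
static uint32_t legacy_cf8_reg_field(uint32_t reg)
{
    return reg & 0xFCu;
}

/* Register bits that fit in a memory-mapped config access:
 * a 12-bit offset, dword aligned here for comparison. */
static uint32_t mmconf_reg_field(uint32_t reg)
{
    assert(reg < 0x1000u);
    return reg & 0xFFCu;
}
```

For an offset such as 0x110, the plain CF8 field would alias to 0x10, while the memory-mapped form keeps the full value.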

Marc

-- http://marcjonesconsulting.com

Stefan Reinauer

8 Sep

5:29 p.m.

Peter Stuge wrote:

...

Stefan Reinauer wrote:

...
...
RAM init working concurrently with multiple memory controllers - they're on PCI, right?

Yes.

But why would it not be completely sufficient to set up all ram controllers in the system from the BSP?

The big coreboot SMP win is with ECC scrubbing, right?

Yes, but that does not happen until we're in stage2. It's not really part of memory init.

...

Does that involve some PCI config space accesses to the memory controllers?

I don't think so.

ron minnich

6 Sep

7:52 p.m.

On Tue, Sep 1, 2009 at 2:15 PM, Ralf Grosse Boerger ralfgb@gmail.com wrote:

...

Hi,

this is a reply to the following message: http://www.coreboot.org/pipermail/coreboot/2009-August/051629.html [I am not subscribed to this list.]

The sporadic boot problems ("FIXME! CPU Version unknown or not supported!") are caused by a race condition in Get_NB32().

This function performs a read operation to the PCI configuration space via port CF8/CFC.

    u32 Get_NB32(u32 dev, u32 reg)
    {
        u32 addr;

        addr = (dev>>4) | (reg & 0xFF) | ((reg & 0xf00)<<16);
        outl((1<<31) | (addr & ~3), 0xcf8);

        return inl(0xcfc);
    }

As ports CF8/CFC are shared across cores (maybe even sockets?) concurrent accesses from different cores may yield random results.

OK, let's start this discussion again.

Can we at least answer this question. Ports CF8/CFC are shared across
- sockets
- cores

I am betting they are not shared across sockets, and would be surprised if they are shared across cores but am willing to believe it.

Anybody?

ron

participants (11)

Bao, Zheng
ebiederm＠xmission.com
Marc Jones
Myles Watson
Patrick Georgi
Peter Stuge
Ralf Grosse Boerger
ron minnich
Rudolf Marek
Stefan Reinauer
Ward Vandewege