[PATCH] acpi: hide 64-bit PCI hole for Windows XP

List overview All Threads
Download

newer

older

Re: [SeaBIOS] [Qemu-devel] Test...

[PATCH v4] apm: fix shutdown

Igor Mammedov

30 Jul 2013 30 Jul '13

9 a.m.

Windows XP (32 and 64 bit) crashes if 64-bit PCI hole is present. Use _OSI ACPI method to blacklist it and hide 64-bit PCI hole.

_OSI strings reported by XP are taken from here: http://msdn.microsoft.com/library/windows/hardware/gg463275

Signed-off-by: Igor Mammedov imammedo@redhat.com --- src/acpi-dsdt.dsl | 25 +++++++++++++++++++++++++ 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/src/acpi-dsdt.dsl b/src/acpi-dsdt.dsl index 158f6b4..7b55636 100644 --- a/src/acpi-dsdt.dsl +++ b/src/acpi-dsdt.dsl @@ -33,6 +33,31 @@ DefinitionBlock (

/**************************************************************** + * OS detection + ****************************************************************/ + + Scope(_SB) { + External(\P1V, IntObj) + + Method(_INI) { + If (CondRefOf (_OSI, Local0)) { + /* disable 64-bit PCI window for Windows XP and its variants */ + If (_OSI ("Windows 2001")) { + Store (Zero, P1V) + } + + If (_OSI ("Windows 2001 SP1")) { + Store (Zero, P1V) + } + + If (_OSI ("Windows 2001 SP2")) { + Store (Zero, P1V) + } + } + } + } + +/**************************************************************** * PCI Bus definition ****************************************************************/

-- 1.7.1

Show replies by date

Michael S. Tsirkin

30 Jul 30 Jul

1:35 p.m.

On Tue, Jul 30, 2013 at 09:00:38AM +0200, Igor Mammedov wrote:

...

Windows XP (32 and 64 bit) crashes if 64-bit PCI hole is present. Use _OSI ACPI method to blacklist it and hide 64-bit PCI hole.

_OSI strings reported by XP are taken from here: http://msdn.microsoft.com/library/windows/hardware/gg463275

Signed-off-by: Igor Mammedov imammedo@redhat.com

Hmm did you test this with other OS-es? OSI matches multiple strings so this will often disable the 64 bit memory for newer windows or e.g. for linux.

See e.g. ftp://ftp.suse.com/pub/people/trenn/ACPI_BIOS_on_Linux_guide/acpi_guideline_for_vendors.pdf

Could we use _OS instead? That's a single string, so it will only match a specific OS.

...

src/acpi-dsdt.dsl | 25 +++++++++++++++++++++++++ 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/src/acpi-dsdt.dsl b/src/acpi-dsdt.dsl index 158f6b4..7b55636 100644 --- a/src/acpi-dsdt.dsl +++ b/src/acpi-dsdt.dsl @@ -33,6 +33,31 @@ DefinitionBlock (

/****************************************************************
OS detection

****************************************************************/

Scope(_SB) {
   External(\P1V, IntObj)
   Method(_INI) {
       If (CondRefOf (\_OSI, Local0)) {
           /* disable 64-bit PCI window for Windows XP and its variants */
           If (\_OSI ("Windows 2001")) {
               Store (Zero, P1V)
           }
           If (\_OSI ("Windows 2001 SP1")) {
               Store (Zero, P1V)
           }
           If (\_OSI ("Windows 2001 SP2")) {
               Store (Zero, P1V)
           }
       }
   }
}
+/****************************************************************

PCI Bus definition

****************************************************************/

-- 1.7.1

Igor Mammedov

3:10 p.m.

On Tue, 30 Jul 2013 14:35:25 +0300 "Michael S. Tsirkin" mst@redhat.com wrote:

...

On Tue, Jul 30, 2013 at 09:00:38AM +0200, Igor Mammedov wrote:

...
Windows XP (32 and 64 bit) crashes if 64-bit PCI hole is present. Use _OSI ACPI method to blacklist it and hide 64-bit PCI hole.

_OSI strings reported by XP are taken from here: http://msdn.microsoft.com/library/windows/hardware/gg463275

Signed-off-by: Igor Mammedov imammedo@redhat.com

Hmm did you test this with other OS-es? OSI matches multiple strings so this will often disable the 64 bit memory for newer windows or e.g. for linux.

See e.g. ftp://ftp.suse.com/pub/people/trenn/ACPI_BIOS_on_Linux_guide/acpi_guideline_for_vendors.pdf

according to "3.2 How OSI is implemented on Linux" _OSI is broken on linux, but I guess we should accommodate this buggy behavior for compatibility reasons.

_OS could be better way to do it, but I haven't found any MS document specifying _OS values for Windows flavors.

So far from testing of Windows OSes only (32|64bit) Windows XP and Windows Server 2003 [R2] crash when 64-bit PCI hole is present, the rest installs/boots just fine.

...

Could we use _OS instead? That's a single string, so it will only match a specific OS.

...
src/acpi-dsdt.dsl | 25 +++++++++++++++++++++++++ 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/src/acpi-dsdt.dsl b/src/acpi-dsdt.dsl index 158f6b4..7b55636 100644 --- a/src/acpi-dsdt.dsl +++ b/src/acpi-dsdt.dsl @@ -33,6 +33,31 @@ DefinitionBlock (

/****************************************************************
OS detection

****************************************************************/

Scope(_SB) {
   External(\P1V, IntObj)
   Method(_INI) {
       If (CondRefOf (\_OSI, Local0)) {
           /* disable 64-bit PCI window for Windows XP and its variants */
           If (\_OSI ("Windows 2001")) {
               Store (Zero, P1V)
           }
           If (\_OSI ("Windows 2001 SP1")) {
               Store (Zero, P1V)
           }
           If (\_OSI ("Windows 2001 SP2")) {
               Store (Zero, P1V)
           }
       }
   }
}
+/****************************************************************

PCI Bus definition

****************************************************************/

-- 1.7.1

Michael S. Tsirkin

3:18 p.m.

On Tue, Jul 30, 2013 at 03:10:02PM +0200, Igor Mammedov wrote:

...

On Tue, 30 Jul 2013 14:35:25 +0300 "Michael S. Tsirkin" mst@redhat.com wrote:

...
On Tue, Jul 30, 2013 at 09:00:38AM +0200, Igor Mammedov wrote:

...
Windows XP (32 and 64 bit) crashes if 64-bit PCI hole is present. Use _OSI ACPI method to blacklist it and hide 64-bit PCI hole.

_OSI strings reported by XP are taken from here: http://msdn.microsoft.com/library/windows/hardware/gg463275

Signed-off-by: Igor Mammedov imammedo@redhat.com

Hmm did you test this with other OS-es? OSI matches multiple strings so this will often disable the 64 bit memory for newer windows or e.g. for linux.

See e.g. ftp://ftp.suse.com/pub/people/trenn/ACPI_BIOS_on_Linux_guide/acpi_guideline_for_vendors.pdf

according to "3.2 How OSI is implemented on Linux" _OSI is broken on linux, but I guess we should accommodate this buggy behavior for compatibility reasons.

_OS could be better way to do it, but I haven't found any MS document specifying _OS values for Windows flavors.

Yes but we can work it out.

...

So far from testing of Windows OSes only (32|64bit) Windows XP and Windows Server 2003 [R2] crash when 64-bit PCI hole is present, the rest installs/boots just fine.

Right. That's why I'm thinking _OS might be better.

...

...
Could we use _OS instead? That's a single string, so it will only match a specific OS.

...
src/acpi-dsdt.dsl | 25 +++++++++++++++++++++++++ 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/src/acpi-dsdt.dsl b/src/acpi-dsdt.dsl index 158f6b4..7b55636 100644 --- a/src/acpi-dsdt.dsl +++ b/src/acpi-dsdt.dsl @@ -33,6 +33,31 @@ DefinitionBlock (

/****************************************************************
OS detection

****************************************************************/

Scope(_SB) {
   External(\P1V, IntObj)
   Method(_INI) {
       If (CondRefOf (\_OSI, Local0)) {
           /* disable 64-bit PCI window for Windows XP and its variants */
           If (\_OSI ("Windows 2001")) {
               Store (Zero, P1V)
           }
           If (\_OSI ("Windows 2001 SP1")) {
               Store (Zero, P1V)
           }
           If (\_OSI ("Windows 2001 SP2")) {
               Store (Zero, P1V)
           }
       }
   }
}
+/****************************************************************

PCI Bus definition

****************************************************************/

-- 1.7.1

Igor Mammedov

5:26 p.m.

On Tue, 30 Jul 2013 16:18:29 +0300 "Michael S. Tsirkin" mst@redhat.com wrote:

...

On Tue, Jul 30, 2013 at 03:10:02PM +0200, Igor Mammedov wrote:

...
On Tue, 30 Jul 2013 14:35:25 +0300 "Michael S. Tsirkin" mst@redhat.com wrote:

...
On Tue, Jul 30, 2013 at 09:00:38AM +0200, Igor Mammedov wrote:

...
Windows XP (32 and 64 bit) crashes if 64-bit PCI hole is present. Use _OSI ACPI method to blacklist it and hide 64-bit PCI hole.

_OSI strings reported by XP are taken from here: http://msdn.microsoft.com/library/windows/hardware/gg463275

Signed-off-by: Igor Mammedov imammedo@redhat.com

Hmm did you test this with other OS-es? OSI matches multiple strings so this will often disable the 64 bit memory for newer windows or e.g. for linux.

See e.g. ftp://ftp.suse.com/pub/people/trenn/ACPI_BIOS_on_Linux_guide/acpi_guideline_for_vendors.pdf

according to "3.2 How OSI is implemented on Linux" _OSI is broken on linux, but I guess we should accommodate this buggy behavior for compatibility reasons.

_OS could be better way to do it, but I haven't found any MS document specifying _OS values for Windows flavors.

Yes but we can work it out.

...
So far from testing of Windows OSes only (32|64bit) Windows XP and Windows Server 2003 [R2] crash when 64-bit PCI hole is present, the rest installs/boots just fine.

Right. That's why I'm thinking _OS might be better.

_OS is even worse it returns "Microsoft Windows NT" on linux, XP, WS2012R2 ... so it's not usable either.

...

...
...
Could we use _OS instead? That's a single string, so it will only match a specific OS.

...
src/acpi-dsdt.dsl | 25 +++++++++++++++++++++++++ 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/src/acpi-dsdt.dsl b/src/acpi-dsdt.dsl index 158f6b4..7b55636 100644 --- a/src/acpi-dsdt.dsl +++ b/src/acpi-dsdt.dsl @@ -33,6 +33,31 @@ DefinitionBlock (

/****************************************************************
OS detection

****************************************************************/

Scope(_SB) {
   External(\P1V, IntObj)
   Method(_INI) {
       If (CondRefOf (\_OSI, Local0)) {
           /* disable 64-bit PCI window for Windows XP and its variants */
           If (\_OSI ("Windows 2001")) {
               Store (Zero, P1V)
           }
           If (\_OSI ("Windows 2001 SP1")) {
               Store (Zero, P1V)
           }
           If (\_OSI ("Windows 2001 SP2")) {
               Store (Zero, P1V)
           }
       }
   }
}
+/****************************************************************

PCI Bus definition

****************************************************************/

-- 1.7.1

Michael S. Tsirkin

31 Jul 31 Jul

7:59 a.m.

On Tue, Jul 30, 2013 at 09:00:38AM +0200, Igor Mammedov wrote:

...

Windows XP (32 and 64 bit) crashes if 64-bit PCI hole is present. Use _OSI ACPI method to blacklist it and hide 64-bit PCI hole.

_OSI strings reported by XP are taken from here: http://msdn.microsoft.com/library/windows/hardware/gg463275

Signed-off-by: Igor Mammedov imammedo@redhat.com

So to expand on what I said, just reading this document:

" Windows operating systems return 0xFFFFFFFF if the argument to the _OSI method specifies an earlier version of Windows. For example, Windows 7 returns 0xFFFFFFFF for both “Windows 2009” (Windows 7) and “Windows 2006” (Windows Vista®)."

This would imply that all moder windows versions return 0xFFFFFFFF in response to querying Windows 2001.

To see what a modern Linux supports, see drivers/acpi/acpica/utosi.c

Looks like the result of this patch is to always disable the 64 bit window?

...

src/acpi-dsdt.dsl | 25 +++++++++++++++++++++++++ 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/src/acpi-dsdt.dsl b/src/acpi-dsdt.dsl index 158f6b4..7b55636 100644 --- a/src/acpi-dsdt.dsl +++ b/src/acpi-dsdt.dsl @@ -33,6 +33,31 @@ DefinitionBlock (

/****************************************************************
OS detection

****************************************************************/

Scope(_SB) {
   External(\P1V, IntObj)
   Method(_INI) {
       If (CondRefOf (\_OSI, Local0)) {
           /* disable 64-bit PCI window for Windows XP and its variants */
           If (\_OSI ("Windows 2001")) {
               Store (Zero, P1V)
           }
           If (\_OSI ("Windows 2001 SP1")) {
               Store (Zero, P1V)
           }
           If (\_OSI ("Windows 2001 SP2")) {
               Store (Zero, P1V)
           }
       }
   }
}
+/****************************************************************

PCI Bus definition

****************************************************************/

-- 1.7.1

Igor Mammedov

8:14 a.m.

On Wed, 31 Jul 2013 08:59:59 +0300 "Michael S. Tsirkin" mst@redhat.com wrote:

...

On Tue, Jul 30, 2013 at 09:00:38AM +0200, Igor Mammedov wrote:

...
Windows XP (32 and 64 bit) crashes if 64-bit PCI hole is present. Use _OSI ACPI method to blacklist it and hide 64-bit PCI hole.

_OSI strings reported by XP are taken from here: http://msdn.microsoft.com/library/windows/hardware/gg463275

Signed-off-by: Igor Mammedov imammedo@redhat.com

So to expand on what I said, just reading this document:

" Windows operating systems return 0xFFFFFFFF if the argument to the _OSI method specifies an earlier version of Windows. For example, Windows 7 returns 0xFFFFFFFF for both “Windows 2009” (Windows 7) and “Windows 2006” (Windows Vista®)."

This would imply that all moder windows versions return 0xFFFFFFFF in response to querying Windows 2001.

To see what a modern Linux supports, see drivers/acpi/acpica/utosi.c

Looks like the result of this patch is to always disable the 64 bit window?

Yep, patch is not valid.

...

...
src/acpi-dsdt.dsl | 25 +++++++++++++++++++++++++ 1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/src/acpi-dsdt.dsl b/src/acpi-dsdt.dsl index 158f6b4..7b55636 100644 --- a/src/acpi-dsdt.dsl +++ b/src/acpi-dsdt.dsl @@ -33,6 +33,31 @@ DefinitionBlock (

/****************************************************************
OS detection

****************************************************************/

Scope(_SB) {
   External(\P1V, IntObj)
   Method(_INI) {
       If (CondRefOf (\_OSI, Local0)) {
           /* disable 64-bit PCI window for Windows XP and its variants */
           If (\_OSI ("Windows 2001")) {
               Store (Zero, P1V)
           }
           If (\_OSI ("Windows 2001 SP1")) {
               Store (Zero, P1V)
           }
           If (\_OSI ("Windows 2001 SP2")) {
               Store (Zero, P1V)
           }
       }
   }
}
+/****************************************************************

PCI Bus definition

****************************************************************/

-- 1.7.1

Paolo Bonzini

5 Aug 5 Aug

6:04 p.m.

On 07/31/2013 08:14 AM, Igor Mammedov wrote:

...

...
" Windows operating systems return 0xFFFFFFFF if the argument to the _OSI method specifies an earlier version of Windows. For example, Windows 7 returns 0xFFFFFFFF for both “Windows 2009” (Windows 7) and “Windows 2006” (Windows Vista®)."

This would imply that all moder windows versions return 0xFFFFFFFF in response to querying Windows 2001.

To see what a modern Linux supports, see drivers/acpi/acpica/utosi.c

Looks like the result of this patch is to always disable the 64 bit window?

Yep, patch is not valid.

Did you look further into this? It would be useful for pvpanic as well.

From my quick research, it looks like "Windows 2006" || "Windows 2006.1" || "Linux" would work, but I have not tested it.

Paolo

Michael S. Tsirkin

8:16 p.m.

On Mon, Aug 05, 2013 at 06:04:04PM +0200, Paolo Bonzini wrote:

...

On 07/31/2013 08:14 AM, Igor Mammedov wrote:

...
...
" Windows operating systems return 0xFFFFFFFF if the argument to the _OSI method specifies an earlier version of Windows. For example, Windows 7 returns 0xFFFFFFFF for both “Windows 2009” (Windows 7) and “Windows 2006” (Windows Vista®)."

This would imply that all moder windows versions return 0xFFFFFFFF in response to querying Windows 2001.

To see what a modern Linux supports, see drivers/acpi/acpica/utosi.c

Looks like the result of this patch is to always disable the 64 bit window?

Yep, patch is not valid.

Did you look further into this? It would be useful for pvpanic as well.

From my quick research, it looks like "Windows 2006" || "Windows 2006.1" || "Linux" would work, but I have not tested it.

Paolo

This doesn't work in that it matches linux. I'm looking into fixing this, yes.

ATM it looks like we should test "Windows 2000" || "Windows 2001" || "Windows 2001 SP1" || "Windows 2001.1 SP1"

&& !( "Windows 2006" || "Windows 2006.1" || "Windows 2006 SP1" || "Windows 2006 SP2" || "Windows 2009" || "Windows 2012" || "Linux" || "FreeBSD" ) && _OS == "Microsoft Windows NT" && _REV == 0x1

This should match XP and 2003 as tightly as possible. Please note "Linux" is there just in case, modern Linux OSPM does not identify itself as "Linux". Same for "FreeBSD".

However, any query of _OSI can change OSPM behaviour in unpredictable ways. So this is a risky change, and far from reliable.

For these reasons, in my opinion it's best to avoid relying on _OSI as a sole way to identify the OS, and allow management interface as a fallback. In particular, this applies to the pvpanic device - _OSI hacks without an override in management has a good chance to backfire.

-- MST

Paolo Bonzini

9:11 p.m.

...

...
From my quick research, it looks like "Windows 2006" || "Windows 2006.1" || "Linux" would work, but I have not tested it.

Paolo

This doesn't work in that it matches linux.

Note that the above was meant to be a condition for when to _show_ the PCI hole, i.e. negated compared to your example.

...

ATM it looks like we should test "Windows 2000" || "Windows 2001" || "Windows 2001 SP1" || "Windows 2001.1 SP1"

Including this may be too strict, what about 98/ME?

...

&& !( "Windows 2006" || "Windows 2006.1" ||

We know that these are all implied by the following four:

...

    "Windows 2006 SP1" ||
    "Windows 2006 SP2" ||
    "Windows 2009" ||
    "Windows 2012" ||

So it is not necessary to test these four.

...

"Linux" || "FreeBSD" ) && _OS == "Microsoft Windows NT" && _REV == 0x1

Testing _OS and _REV is probably too strict.

...

This should match XP and 2003 as tightly as possible. Please note "Linux" is there just in case, modern Linux OSPM does not identify itself as "Linux".

Yeah, I know. I didn't know about FreeBSD, and I agree it is better to include it just in case.

Paolo

Fred .

10:34 p.m.

Maybe rename Windows 2001, 2006, and 2009 to their real names? Or use their NT kernel name instead, like NT 6.1, NT 6.2, etc. Or add a comment in the source code like: “Windows 2009” /* Windows 7 */ || "Windows 2012" /* Windows Server 2012 */ ||

On Mon, Aug 5, 2013 at 9:11 PM, Paolo Bonzini pbonzini@redhat.com wrote:

...

...
...
From my quick research, it looks like "Windows 2006" || "Windows 2006.1" || "Linux" would work, but I have not tested it.

Paolo

This doesn't work in that it matches linux.

Note that the above was meant to be a condition for when to _show_ the PCI hole, i.e. negated compared to your example.

...
ATM it looks like we should test "Windows 2000" || "Windows 2001" || "Windows 2001 SP1" || "Windows 2001.1 SP1"

Including this may be too strict, what about 98/ME?

...
  && !(
    "Windows 2006" ||
    "Windows 2006.1" ||
We know that these are all implied by the following four:

...
    "Windows 2006 SP1" ||
    "Windows 2006 SP2" ||
    "Windows 2009" ||
    "Windows 2012" ||
So it is not necessary to test these four.

...
  "Linux" ||
  "FreeBSD"
  ) &&
  _OS == "Microsoft Windows NT"
  &&
  _REV == 0x1
Testing _OS and _REV is probably too strict.

...
This should match XP and 2003 as tightly as possible. Please note "Linux" is there just in case, modern Linux OSPM does not identify itself as "Linux".

Yeah, I know. I didn't know about FreeBSD, and I agree it is better to include it just in case.

Paolo

SeaBIOS mailing list SeaBIOS@seabios.org http://www.seabios.org/mailman/listinfo/seabios

Michael S. Tsirkin

6 Aug 6 Aug

4:39 p.m.

On Mon, Aug 05, 2013 at 03:11:41PM -0400, Paolo Bonzini wrote:

...

...
...
From my quick research, it looks like "Windows 2006" || "Windows 2006.1" || "Linux" would work, but I have not tested it.

Paolo

This doesn't work in that it matches linux.

Note that the above was meant to be a condition for when to _show_ the PCI hole, i.e. negated compared to your example.

...
ATM it looks like we should test "Windows 2000" || "Windows 2001" || "Windows 2001 SP1" || "Windows 2001.1 SP1"

Including this may be too strict, what about 98/ME?

Isn't this past EOL? I think we shouldn't touch whatever we don't know about.

...

...
&& !( "Windows 2006" || "Windows 2006.1" ||

We know that these are all implied by the following four:

...
    "Windows 2006 SP1" ||
    "Windows 2006 SP2" ||
    "Windows 2009" ||
    "Windows 2012" ||
So it is not necessary to test these four.

True, but I don't see how this can harm us, and I'm trying to check as much as possible.

...

...
"Linux" || "FreeBSD" ) && _OS == "Microsoft Windows NT" && _REV == 0x1

Testing _OS and _REV is probably too strict.

Why too strict? We want to only affect very specific guests. whatever we don't know about, let's not touch it.

...

...
This should match XP and 2003 as tightly as possible. Please note "Linux" is there just in case, modern Linux OSPM does not identify itself as "Linux".

Yeah, I know. I didn't know about FreeBSD, and I agree it is better to include it just in case.

Paolo

Paolo Bonzini

6:43 p.m.

...

...
...
ATM it looks like we should test "Windows 2000" || "Windows 2001" || "Windows 2001 SP1" || "Windows 2001.1 SP1"

Including this may be too strict, what about 98/ME?

Isn't this past EOL?

So what? People try to use it with QEMU, and it's fair to assume it's worse than XP.

...

...
...
&& !( "Windows 2006" || "Windows 2006.1" ||

We know that these are all implied by the following four:

...
    "Windows 2006 SP1" ||
    "Windows 2006 SP2" ||
    "Windows 2009" ||
    "Windows 2012" ||
So it is not necessary to test these four.
True, but I don't see how this can harm us, and I'm trying to check as much as possible.

Fair enough.

...

...
...
"Linux" || "FreeBSD" ) && _OS == "Microsoft Windows NT" && _REV == 0x1

Testing _OS and _REV is probably too strict.

Why too strict? We want to only affect very specific guests. whatever we don't know about, let's not touch it.

In practice all OSes we care about will disguise themselves as Windows. I checked Solaris now and it follows Linux's lead: http://fxr.watson.org/fxr/source/intel/io/acpica/utilities/uteval.c?v=OPENSO...

For whatever we don't know about, why should we assume 64-bit BARs work? Especially considering it's likely to be pretty old guests.

Paolo

Michael S. Tsirkin

6:58 p.m.

On Tue, Aug 06, 2013 at 12:43:04PM -0400, Paolo Bonzini wrote:

...

...
...
...
ATM it looks like we should test "Windows 2000" || "Windows 2001" || "Windows 2001 SP1" || "Windows 2001.1 SP1"

Including this may be too strict, what about 98/ME?

Isn't this past EOL?

So what? People try to use it with QEMU, and it's fair to assume it's worse than XP.

I agree it's likely but I don't think we should presume anything until it's tested.

...

...
...
...
&& !( "Windows 2006" || "Windows 2006.1" ||

We know that these are all implied by the following four:

...
    "Windows 2006 SP1" ||
    "Windows 2006 SP2" ||
    "Windows 2009" ||
    "Windows 2012" ||
So it is not necessary to test these four.
True, but I don't see how this can harm us, and I'm trying to check as much as possible.
Fair enough.

...
...
...
"Linux" || "FreeBSD" ) && _OS == "Microsoft Windows NT" && _REV == 0x1

Testing _OS and _REV is probably too strict.

Why too strict? We want to only affect very specific guests. whatever we don't know about, let's not touch it.

In practice all OSes we care about will disguise themselves as Windows. I checked Solaris now and it follows Linux's lead: http://fxr.watson.org/fxr/source/intel/io/acpica/utilities/uteval.c?v=OPENSO...

I'm still worried about using _OSI. This makes it a risky change, guests do change behaviour depending on which _OSI are called. No way to tell what this will do without lots of testing.

I have an alternative idea: it looks like XP/2003 are the only OS-es which have _REV set to 0x1 which is the ancient API 1.0 spec.

Maybe it's enough to check _REV == 1 and _OS == windows ?

That's certainly lower risk from this POV, but need to check old Linux and other guests.

...

For whatever we don't know about, why should we assume 64-bit BARs work? Especially considering it's likely to be pretty old guests.

Paolo

There's no need to assume 64 bit BARs works. But I think we can assume guests don't crash. What windows does here is very unusual imho, guest should just say "I can't use this range so I won't" and allocate BARs somewhere else.

-- MST

Gerd Hoffmann

7 Aug 7 Aug

9:29 a.m.

Hi,

...

...
For whatever we don't know about, why should we assume 64-bit BARs work? Especially considering it's likely to be pretty old guests.

...

There's no need to assume 64 bit BARs works. But I think we can assume guests don't crash. What windows does here is very unusual imho, guest should just say "I can't use this range so I won't" and allocate BARs somewhere else.

Why windows crashes or what it should do doesn't matter. We have to deal with it.

Today we have a pretty simple logic in seabios: When we run out of space in the 32bit pci memory window we'll enable the 64bit window and try to move 64bit bars there. So with a usual configuration there is no 64bit window simply because it isn't needed. WinXP is happy. If you plug in pci devices with large 64bit bars the 64bit window shows up.

So, how about simply keeping that logic?

We can drop the etc/pci-info logic then.

We'll need some way for seabios to fill in the pci window information into the qemu-provided tables. Easiest way to do that would be to extend the COMMAND_ADD_POINTER bios linker script command.

We probably also need some knob asking seabios to enable the 64bit window unconditionally, so we have some address space for hotplugging pci devices with large bars.

Comments?

cheers, Gerd

Michael S. Tsirkin

11:50 a.m.

On Wed, Aug 07, 2013 at 09:29:39AM +0200, Gerd Hoffmann wrote:

...

Hi,

...
...
For whatever we don't know about, why should we assume 64-bit BARs work? Especially considering it's likely to be pretty old guests.

...
There's no need to assume 64 bit BARs works. But I think we can assume guests don't crash. What windows does here is very unusual imho, guest should just say "I can't use this range so I won't" and allocate BARs somewhere else.

Why windows crashes or what it should do doesn't matter. We have to deal with it.

Today we have a pretty simple logic in seabios: When we run out of space in the 32bit pci memory window we'll enable the 64bit window and try to move 64bit bars there. So with a usual configuration there is no 64bit window simply because it isn't needed. WinXP is happy. If you plug in pci devices with large 64bit bars the 64bit window shows up. So, how about simply keeping that logic? We can drop the etc/pci-info logic then.

I don't think this is a good idea.

First, windows XP currently crashes if you supply a device with a large BAR. That's not a friendly way to tell user their OS doesn't support the device, not enabling the device is better.

Second, current logic has *no chance* to support hotplug of devices with large BARs. We will put work arounds for XP in place to just disable 64 bit hole, but for example with windows 7, 32 and 64 bit versions both support a 64 bit hole but different max sizes, and there's no easy way to distinguish between them.

Third, there are good reasons for pci-info, and windows bugs don't outweight them. In particular, there's duplicated logic in bios and QEMU, if you change QEMU to be closer to real hardware, kaboom.

Last, it's a maintainance problem that we need drivers for all chipsets open-coded in BIOS. One suggestion we had is moving completely away and supplying some bytecode for what's left of chipset configuration (e.g. memcfg) to seabios.

...

We'll need some way for seabios to fill in the pci window information into the qemu-provided tables. Easiest way to do that would be to extend the COMMAND_ADD_POINTER bios linker script command.

We probably also need some knob asking seabios to enable the 64bit window unconditionally, so we have some address space for hotplugging pci devices with large bars.

I think you got it wrong. The common case is enabling the 64 bit window. The default should be to enable it, with a knob to disable.

We must also supply the 64 bit window size as the limits differ between windows guests. That's exactly the etc/pci-info logic.

...

Comments?

cheers, Gerd

Gerd Hoffmann

12:29 p.m.

On 08/07/13 11:50, Michael S. Tsirkin wrote:

...

On Wed, Aug 07, 2013 at 09:29:39AM +0200, Gerd Hoffmann wrote:

...
Hi,

...
...
For whatever we don't know about, why should we assume 64-bit BARs work? Especially considering it's likely to be pretty old guests.

...
There's no need to assume 64 bit BARs works. But I think we can assume guests don't crash. What windows does here is very unusual imho, guest should just say "I can't use this range so I won't" and allocate BARs somewhere else.

Why windows crashes or what it should do doesn't matter. We have to deal with it.

Today we have a pretty simple logic in seabios: When we run out of space in the 32bit pci memory window we'll enable the 64bit window and try to move 64bit bars there. So with a usual configuration there is no 64bit window simply because it isn't needed. WinXP is happy. If you plug in pci devices with large 64bit bars the 64bit window shows up. So, how about simply keeping that logic? We can drop the etc/pci-info logic then.

I don't think this is a good idea.

First, windows XP currently crashes if you supply a device with a large BAR. That's not a friendly way to tell user their OS doesn't support the device, not enabling the device is better.

That logic isn't there today. seabios will simply panic, before it prints anything on the screen, which is even more unfriendly than a blue screen.

And even when "not enabling the device" is implemented: If that device happens to be qxl (one of the few qemu-emulated devices which actually can have a large 64bit bar) the user would have a guest without display too.

...

Second, current logic has *no chance* to support hotplug of devices with large BARs. We will put work arounds for XP in place to just disable 64 bit hole, but for example with windows 7, 32 and 64 bit versions both support a 64 bit hole but different max sizes, and there's no easy way to distinguish between them.

--verbose please (but see also below).

...

Third, there are good reasons for pci-info, and windows bugs don't outweight them. In particular, there's duplicated logic in bios and QEMU, if you change QEMU to be closer to real hardware, kaboom.

--verbose please. Which logic is duplicated?

...

Last, it's a maintainance problem that we need drivers for all chipsets open-coded in BIOS. One suggestion we had is moving completely away and supplying some bytecode for what's left of chipset configuration (e.g. memcfg) to seabios.

Sorry, but I fail to see the problem. There isn't that much chipset-specific code in seabios, and it rarely needs changes. Adding a bytecode interpreter to be able to rip it out sounds like overkill to me.

...

...
We probably also need some knob asking seabios to enable the 64bit window unconditionally, so we have some address space for hotplugging pci devices with large bars.

I think you got it wrong. The common case is enabling the 64 bit window. The default should be to enable it, with a knob to disable.

Why enable by default? I would be conservative here. I have yet to see real hardware with a 64bit pci window. Things which are rare or don't exist on real hardware have a higher chance to trigger guest bugs.

...

We must also supply the 64 bit window size as the limits differ between windows guests. That's exactly the etc/pci-info logic.

pci-info does more than just the 64bit window size.

cheers, Gerd

Michael S. Tsirkin

1:10 p.m.

On Wed, Aug 07, 2013 at 12:29:32PM +0200, Gerd Hoffmann wrote:

...

On 08/07/13 11:50, Michael S. Tsirkin wrote:

...
On Wed, Aug 07, 2013 at 09:29:39AM +0200, Gerd Hoffmann wrote:

...
Hi,

...
...
For whatever we don't know about, why should we assume 64-bit BARs work? Especially considering it's likely to be pretty old guests.

...
There's no need to assume 64 bit BARs works. But I think we can assume guests don't crash. What windows does here is very unusual imho, guest should just say "I can't use this range so I won't" and allocate BARs somewhere else.

Why windows crashes or what it should do doesn't matter. We have to deal with it.

Today we have a pretty simple logic in seabios: When we run out of space in the 32bit pci memory window we'll enable the 64bit window and try to move 64bit bars there. So with a usual configuration there is no 64bit window simply because it isn't needed. WinXP is happy. If you plug in pci devices with large 64bit bars the 64bit window shows up. So, how about simply keeping that logic? We can drop the etc/pci-info logic then.

I don't think this is a good idea.

First, windows XP currently crashes if you supply a device with a large BAR. That's not a friendly way to tell user their OS doesn't support the device, not enabling the device is better.

That logic isn't there today. seabios will simply panic, before it prints anything on the screen, which is even more unfriendly than a blue screen. And even when "not enabling the device" is implemented: If that device happens to be qxl (one of the few qemu-emulated devices which actually can have a large 64bit bar) the user would have a guest without display too.

VFIO also comes to mind - you might need to boot from this device. bios can also often do the smart thing and enable boot devices, disabling others.

But anyway, with latest QEMU this is user's mistake: user specifies the values for pci-info and user also specified set of devices inconsistent with that. So management which knows exactly which version of windows is installed can create the correct configuration.

...

...
Second, current logic has *no chance* to support hotplug of devices with large BARs. We will put work arounds for XP in place to just disable 64 bit hole, but for example with windows 7, 32 and 64 bit versions both support a 64 bit hole but different max sizes, and there's no easy way to distinguish between them.

--verbose please (but see also below).

msdn.microsoft.com/en-us/library/windows/desktop/aa366778(v=vs.85).asp

Generally, windows tends to crash when CRS resources exceed the supported limits of physical memory (sometimes with weird off by one errors, e.g. win7 32 bit seems to survive with a 36 bit pci hole even though this means the max address is 2^37-1 which it can't address).

...

...
Third, there are good reasons for pci-info, and windows bugs don't outweight them. In particular, there's duplicated logic in bios and QEMU, if you change QEMU to be closer to real hardware, kaboom.

--verbose please. Which logic is duplicated?

The one setting the PCI hole addresses :)

...

...
Last, it's a maintainance problem that we need drivers for all chipsets open-coded in BIOS. One suggestion we had is moving completely away and supplying some bytecode for what's left of chipset configuration (e.g. memcfg) to seabios.

Sorry, but I fail to see the problem. There isn't that much chipset-specific code in seabios, and it rarely needs changes. Adding a bytecode interpreter to be able to rip it out sounds like overkill to me.

Possibly, though I shouldn't really have said bytecode - more a set of address/data pairs to write into config space (if you like, it can be a loader extension). But let's discuss this separately.

...

...
...
We probably also need some knob asking seabios to enable the 64bit window unconditionally, so we have some address space for hotplugging pci devices with large bars.

I think you got it wrong. The common case is enabling the 64 bit window. The default should be to enable it, with a knob to disable.

Why enable by default? I would be conservative here. I have yet to see real hardware with a 64bit pci window. Things which are rare or don't exist on real hardware have a higher chance to trigger guest bugs.

In fact, it's up to QEMU whether to enable it by default. Yes we could be even more aggressive, and tweak 398489018183d613306ab022653552247d93919f to make the default 0 and not 2G.

If we do this, we can take the time with designing the proper fix for winXP - this just shows that pci-info is the right approach.

...

...
We must also supply the 64 bit window size as the limits differ between windows guests. That's exactly the etc/pci-info logic.

pci-info does more than just the 64bit window size.

cheers, Gerd

Yes. It does exactly what's needed to make sure PCI windows match the addresses that hardware listens on. Selecting the size is just one aspect of it but it's covered too, and this way management can make the decision.

Gerd Hoffmann

2:15 p.m.

Hi,

...

But anyway, with latest QEMU this is user's mistake: user specifies the values for pci-info and user also specified set of devices inconsistent with that.

qemu doesn't error out though, so what is the point?

...

...
...
Third, there are good reasons for pci-info, and windows bugs don't outweight them. In particular, there's duplicated logic in bios and QEMU, if you change QEMU to be closer to real hardware, kaboom.

--verbose please. Which logic is duplicated?

The one setting the PCI hole addresses :)

That wasn't duplicated before you've added pci-info.

...

...
...
Last, it's a maintainance problem that we need drivers for all chipsets open-coded in BIOS. One suggestion we had is moving completely away and supplying some bytecode for what's left of chipset configuration (e.g. memcfg) to seabios.

Sorry, but I fail to see the problem. There isn't that much chipset-specific code in seabios, and it rarely needs changes. Adding a bytecode interpreter to be able to rip it out sounds like overkill to me.

Possibly, though I shouldn't really have said bytecode - more a set of address/data pairs to write into config space (if you like, it can be a loader extension). But let's discuss this separately.

Isn't going to fly for coreboot. Initialization of real hardware doesn't work that way, and on coreboot qemu support shares the piix4/ich9 setup code with real mainboards.

...

...
Why enable by default? I would be conservative here. I have yet to see real hardware with a 64bit pci window. Things which are rare or don't exist on real hardware have a higher chance to trigger guest bugs.

In fact, it's up to QEMU whether to enable it by default. Yes we could be even more aggressive, and tweak 398489018183d613306ab022653552247d93919f to make the default 0 and not 2G.

If we do this, we can take the time with designing the proper fix for winXP - this just shows that pci-info is the right approach.

This only proves that you can fix the bug introduced by adding pci-info.

Current seabios which completely ignores pci-info works equally well with WinXP.

...

...
...
We must also supply the 64 bit window size as the limits differ between windows guests. That's exactly the etc/pci-info logic.

pci-info does more than just the 64bit window size.

cheers, Gerd

Yes. It does exactly what's needed to make sure PCI windows match the addresses that hardware listens on.

seabios handles this just fine today.

The only missing bit is some way to request seabios add a 64bit window (of a certain size) unconditionally, so it is there even in case it isn't needed for the devices present at boot time. A simple etc/pci-64bit-size integer would do the job.

cheers, Gerd

Michael S. Tsirkin

2:35 p.m.

On Wed, Aug 07, 2013 at 02:15:14PM +0200, Gerd Hoffmann wrote:

...

Hi,

...
But anyway, with latest QEMU this is user's mistake: user specifies the values for pci-info and user also specified set of devices inconsistent with that.

qemu doesn't error out though, so what is the point?

The point is management can set the size according to the type of guest that will be installed. You can misconfigure qemu in a variety of ways and not all of them cause it to error out.

...

...
...
...
Third, there are good reasons for pci-info, and windows bugs don't outweight them. In particular, there's duplicated logic in bios and QEMU, if you change QEMU to be closer to real hardware, kaboom.

--verbose please. Which logic is duplicated?

The one setting the PCI hole addresses :)

That wasn't duplicated before you've added pci-info.

It was always duplicated. With pci-info it's *not* duplicated anymore: qemu is the single source of info.

...

...
...
...
Last, it's a maintainance problem that we need drivers for all chipsets open-coded in BIOS. One suggestion we had is moving completely away and supplying some bytecode for what's left of chipset configuration (e.g. memcfg) to seabios.

Sorry, but I fail to see the problem. There isn't that much chipset-specific code in seabios, and it rarely needs changes. Adding a bytecode interpreter to be able to rip it out sounds like overkill to me.

Possibly, though I shouldn't really have said bytecode - more a set of address/data pairs to write into config space (if you like, it can be a loader extension). But let's discuss this separately.

Isn't going to fly for coreboot. Initialization of real hardware doesn't work that way, and on coreboot qemu support shares the piix4/ich9 setup code with real mainboards.

Good to know.

...

...
...
Why enable by default? I would be conservative here. I have yet to see real hardware with a 64bit pci window. Things which are rare or don't exist on real hardware have a higher chance to trigger guest bugs.

In fact, it's up to QEMU whether to enable it by default. Yes we could be even more aggressive, and tweak 398489018183d613306ab022653552247d93919f to make the default 0 and not 2G.

If we do this, we can take the time with designing the proper fix for winXP - this just shows that pci-info is the right approach.

This only proves that you can fix the bug introduced by adding pci-info.

Current seabios which completely ignores pci-info works equally well with WinXP.

It doesn't really, and the issue is not WinXP.

Here's a bug, it's easy to reproduce: 1. boot qemu with existing seabios, linux guest 2. add device with large BAR by hotplug (e.g. ivshmem)

Actual result: device isn't allocated a BAR and does not work Expected result device is allocated a 64 bit BAR and works

So the "works just fine" is really only "works for me".

Yes, broken guests seem to be common, so we should be conservative with out defaults, and let management control what happens.

...

...
...
...
We must also supply the 64 bit window size as the limits differ between windows guests. That's exactly the etc/pci-info logic.

pci-info does more than just the 64bit window size.

cheers, Gerd

Yes. It does exactly what's needed to make sure PCI windows match the addresses that hardware listens on.

seabios handles this just fine today.

Only because qemu and seabios have identical code for PCI window. But that code has no basis in reality, if we fix QEMU to be closer to real hardware it will break.

...

The only missing bit is some way to request seabios add a 64bit window (of a certain size) unconditionally, so it is there even in case it isn't needed for the devices present at boot time. A simple etc/pci-64bit-size integer would do the job.

cheers, Gerd

And then it will have exactly the same issue if you misconfigure it. The 64bit hole size is the only thing that crashes old windows. So if you want that, then what's your problem with pci-info really?

Gerd Hoffmann

4:21 p.m.

On 08/07/13 14:35, Michael S. Tsirkin wrote:

...

On Wed, Aug 07, 2013 at 02:15:14PM +0200, Gerd Hoffmann wrote:

...
qemu doesn't error out though, so what is the point?

The point is management can set the size according to the type of guest that will be installed. You can misconfigure qemu in a variety of ways and not all of them cause it to error out.

Then seabios will still panic. I fail to see how pci-info makes things more user-friendly as you've claimed a few mails back.

No doubt about the need to configure the 64bit window size.

...

...
...
The one setting the PCI hole addresses :)

That wasn't duplicated before you've added pci-info.

It was always duplicated. With pci-info it's *not* duplicated anymore: qemu is the single source of info.

Huh? The 32bit window is sized according to the installed memory. That logic is in seabios and you'll try to move it to qemu, using pci-info. It wasn't in qemu before ...

...

...
...
If we do this, we can take the time with designing the proper fix for winXP - this just shows that pci-info is the right approach.

This only proves that you can fix the bug introduced by adding pci-info.

Current seabios which completely ignores pci-info works equally well with WinXP.

It doesn't really, and the issue is not WinXP.

Here's a bug, it's easy to reproduce:

boot qemu with existing seabios, linux guest

add device with large BAR by hotplug (e.g. ivshmem)

Yes, to enable large bar hotplug we need some way to configure the size of the 64bit pci window. I didn't question that.

...

...
...
Yes. It does exactly what's needed to make sure PCI windows match the addresses that hardware listens on.

seabios handles this just fine today.

Only because qemu and seabios have identical code for PCI window. But that code has no basis in reality, if we fix QEMU to be closer to real hardware it will break.

What exactly? q35 has fixed addresses for the 32bit window, do you mean that? If needed we can make it look at the installed low (below 4g) memory instead, like piix4 does.

...

...
The only missing bit is some way to request seabios add a 64bit window (of a certain size) unconditionally, so it is there even in case it isn't needed for the devices present at boot time. A simple etc/pci-64bit-size integer would do the job.

cheers, Gerd

And then it will have exactly the same issue if you misconfigure it. The 64bit hole size is the only thing that crashes old windows. So if you want that, then what's your problem with pci-info really?

pci-info it has stuff which shouldn't be there IMO.

It is the job of the firmware to initialize the hardware. IMO it should work that way on qemu too to mimic real hardware.

The firmware can initialize the hardware as it likes. For the most part the OS can simply read the configuration from the pci config space.

Some hardware addresses are in the acpi tables though. Current seabios generates/patches acpi tables to handle that.

With qemu providing the acpi tables we need some other way to handle that. You are trying to tackle that by sticking the addresses into pci-info, then expecting the firmware using them. I think we should handle that by extending COMMAND_ADD_POINTER, so the firmware can continue to pick the addresses as it pleases.

The addresses we are talking about here are:

(1) 32bit pci window start + end + size (for PCI0._CRS) (2) 64bit pci window start + end + size (likewise) (3) mmconf xbar start (MCFG, q35 only, at 0xb0000000 now). (4) pmbase (FADT, at 0xb000 now).

Especially 3+4 tend to be compile-time constants in the firmware as they are needed very early in the setup process. So making them runtime configurable via fw_cfg (say by adding them to pci-info) isn't easy. Thats why I think we better should do it the other way around, let firmware pick the addresses and define some way to fixup the acpi tables.

pci-info has the side effect that you can configure the pci window size that way, so we'll need a replacement for that part when dropping pci-info.

cheers, Gerd

Michael S. Tsirkin

4:53 p.m.

On Wed, Aug 07, 2013 at 04:21:44PM +0200, Gerd Hoffmann wrote:

...

On 08/07/13 14:35, Michael S. Tsirkin wrote:

...
On Wed, Aug 07, 2013 at 02:15:14PM +0200, Gerd Hoffmann wrote:

...
qemu doesn't error out though, so what is the point?

The point is management can set the size according to the type of guest that will be installed. You can misconfigure qemu in a variety of ways and not all of them cause it to error out.

Then seabios will still panic. I fail to see how pci-info makes things more user-friendly as you've claimed a few mails back. No doubt about the need to configure the 64bit window size.

We need to fix it not to panic, but we put the user in control since windows size can now come from user.

...

...
...
...
The one setting the PCI hole addresses :)

That wasn't duplicated before you've added pci-info.

It was always duplicated. With pci-info it's *not* duplicated anymore: qemu is the single source of info.

Huh? The 32bit window is sized according to the installed memory. That logic is in seabios and you'll try to move it to qemu, using pci-info. It wasn't in qemu before ...

The logic is in hw/i386/pc_piix.c and always was.

...

...
...
...
If we do this, we can take the time with designing the proper fix for winXP - this just shows that pci-info is the right approach.

This only proves that you can fix the bug introduced by adding pci-info.

Current seabios which completely ignores pci-info works equally well with WinXP.

It doesn't really, and the issue is not WinXP.

Here's a bug, it's easy to reproduce:

boot qemu with existing seabios, linux guest

add device with large BAR by hotplug (e.g. ivshmem)

Yes, to enable large bar hotplug we need some way to configure the size of the 64bit pci window. I didn't question that.

...
...
...
Yes. It does exactly what's needed to make sure PCI windows match the addresses that hardware listens on.

seabios handles this just fine today.

Only because qemu and seabios have identical code for PCI window. But that code has no basis in reality, if we fix QEMU to be closer to real hardware it will break.

What exactly? q35 has fixed addresses for the 32bit window, do you mean that? If needed we can make it look at the installed low (below 4g) memory instead, like piix4 does.

...
...
The only missing bit is some way to request seabios add a 64bit window (of a certain size) unconditionally, so it is there even in case it isn't needed for the devices present at boot time. A simple etc/pci-64bit-size integer would do the job.

cheers, Gerd

And then it will have exactly the same issue if you misconfigure it. The 64bit hole size is the only thing that crashes old windows. So if you want that, then what's your problem with pci-info really?

pci-info it has stuff which shouldn't be there IMO.

It is the job of the firmware to initialize the hardware. IMO it should work that way on qemu too to mimic real hardware.

That's exactly what we have. However QEMU is the one that knows how hardware is configured (e.g. where's the PCI hole), so we report that to bios.

...

The firmware can initialize the hardware as it likes. For the most part the OS can simply read the configuration from the pci config space.

Only for some hardware (Q35).

...

Some hardware addresses are in the acpi tables though. Current seabios generates/patches acpi tables to handle that.

With qemu providing the acpi tables we need some other way to handle that. You are trying to tackle that by sticking the addresses into pci-info, then expecting the firmware using them. I think we should handle that by extending COMMAND_ADD_POINTER, so the firmware can continue to pick the addresses as it pleases.

The addresses we are talking about here are:

(1) 32bit pci window start + end + size (for PCI0._CRS) (2) 64bit pci window start + end + size (likewise) (3) mmconf xbar start (MCFG, q35 only, at 0xb0000000 now). (4) pmbase (FADT, at 0xb000 now).

Especially 3+4 tend to be compile-time constants in the firmware as they are needed very early in the setup process.

So we don't need them in pci-config, just stick constant in ACPI.

...

So making them runtime configurable via fw_cfg (say by adding them to pci-info) isn't easy. Thats why I think we better should do it the other way around, let firmware pick the addresses and define some way to fixup the acpi tables.

pci-info has the side effect that you can configure the pci window size that way, so we'll need a replacement for that part when dropping pci-info.

cheers, Gerd

That's not the only issue. Another issue is where to start/end the 32 bit window for PIIX. That's really a PV thing for QEMU, hardcoded at the moment in bios and QEMU.

-- MST

Kevin O'Connor

8 Aug 8 Aug

1:58 a.m.

On Wed, Aug 07, 2013 at 05:53:12PM +0300, Michael S. Tsirkin wrote:

...

On Wed, Aug 07, 2013 at 04:21:44PM +0200, Gerd Hoffmann wrote:

...
On 08/07/13 14:35, Michael S. Tsirkin wrote:

...
On Wed, Aug 07, 2013 at 02:15:14PM +0200, Gerd Hoffmann wrote:

...
qemu doesn't error out though, so what is the point?

The point is management can set the size according to the type of guest that will be installed. You can misconfigure qemu in a variety of ways and not all of them cause it to error out.

Then seabios will still panic. I fail to see how pci-info makes things more user-friendly as you've claimed a few mails back. No doubt about the need to configure the 64bit window size.

We need to fix it not to panic, but we put the user in control since windows size can now come from user.

I'm confused. Why would seabios panic?

...

...
...
And then it will have exactly the same issue if you misconfigure it. The 64bit hole size is the only thing that crashes old windows. So if you want that, then what's your problem with pci-info really?

pci-info it has stuff which shouldn't be there IMO.

It is the job of the firmware to initialize the hardware. IMO it should work that way on qemu too to mimic real hardware.

That's exactly what we have. However QEMU is the one that knows how hardware is configured (e.g. where's the PCI hole), so we report that to bios.

It was my understanding that the PCI hole size is determined by hardware, and thus it makes sense for QEMU to send this to seabios. (If not the hardware itself, the memory controller configuration, which is effectively the same thing as we wont be adding full emulation of a memory controller to qemu/seabios.) Correct me if I'm wrong.

Also, I don't understand the underlying reason for this patch. If a user asks to start QEMU with a PCI device with a very large BAR, then winxp wont work. But, if a user put a real PCI device with a very large BAR in a real computer then they'd get the same failure. Why is this a qemu/seabios issue - shouldn't the user simply not do this?

-Kevin

Michael S. Tsirkin

7:42 a.m.

On Wed, Aug 07, 2013 at 07:58:18PM -0400, Kevin O'Connor wrote:

...

On Wed, Aug 07, 2013 at 05:53:12PM +0300, Michael S. Tsirkin wrote:

...
On Wed, Aug 07, 2013 at 04:21:44PM +0200, Gerd Hoffmann wrote:

...
On 08/07/13 14:35, Michael S. Tsirkin wrote:

...
On Wed, Aug 07, 2013 at 02:15:14PM +0200, Gerd Hoffmann wrote:

...
qemu doesn't error out though, so what is the point?

The point is management can set the size according to the type of guest that will be installed. You can misconfigure qemu in a variety of ways and not all of them cause it to error out.

Then seabios will still panic. I fail to see how pci-info makes things more user-friendly as you've claimed a few mails back. No doubt about the need to configure the 64bit window size.

We need to fix it not to panic, but we put the user in control since windows size can now come from user.

I'm confused. Why would seabios panic?

Question is, what to do if we can't fit all BARs into the range supplied by QEMU?

...

...
...
...
And then it will have exactly the same issue if you misconfigure it. The 64bit hole size is the only thing that crashes old windows. So if you want that, then what's your problem with pci-info really?

pci-info it has stuff which shouldn't be there IMO.

It is the job of the firmware to initialize the hardware. IMO it should work that way on qemu too to mimic real hardware.

That's exactly what we have. However QEMU is the one that knows how hardware is configured (e.g. where's the PCI hole), so we report that to bios.

It was my understanding that the PCI hole size is determined by hardware, and thus it makes sense for QEMU to send this to seabios. (If not the hardware itself, the memory controller configuration, which is effectively the same thing as we wont be adding full emulation of a memory controller to qemu/seabios.) Correct me if I'm wrong.

This is certainly true for PIIX. It's more complicated for Q35 since the hole start address is configurable by firmware. Gerd refers to mch_setup in src/pciinit.c

...

Also, I don't understand the underlying reason for this patch. If a user asks to start QEMU with a PCI device with a very large BAR, then winxp wont work. But, if a user put a real PCI device with a very large BAR in a real computer then they'd get the same failure. Why is this a qemu/seabios issue - shouldn't the user simply not do this?

-Kevin

Sure. But if we can find a simple low risk way to make winXP not allocate resources rather than crash, that's a bit nicer, isn't it?

-- MST

Gerd Hoffmann

9:33 a.m.

Hi,

...

...
...
Then seabios will still panic. I fail to see how pci-info makes things more user-friendly as you've claimed a few mails back. No doubt about the need to configure the 64bit window size.

We need to fix it not to panic, but we put the user in control since windows size can now come from user.

I'm confused. Why would seabios panic?

It does in case it can't fit the pci bars into the I/O windows. Right now we don't have some fallback mechanism like dropping the device with the largest bar from the list, then try again.

...

...
...
It is the job of the firmware to initialize the hardware. IMO it should work that way on qemu too to mimic real hardware.

That's exactly what we have. However QEMU is the one that knows how hardware is configured (e.g. where's the PCI hole), so we report that to bios.

It was my understanding that the PCI hole size is determined by hardware, and thus it makes sense for QEMU to send this to seabios. (If not the hardware itself, the memory controller configuration, which is effectively the same thing as we wont be adding full emulation of a memory controller to qemu/seabios.) Correct me if I'm wrong.

As I understand it you just need some free address space. For 32bit pci bars this actually is a "hole" in the memory map, i.e. some room below 4G which isn't backed by memory. 64bit pci bars can be pretty much anywhere where they don't conflict with memory (well, in theory, in practice it makes sense to group them in a address space window for a number of reasons).

On piix4 seabios gets the installed low memory from cmos to figure where the free 32bit address space starts so it knows where it can place the pci bars.

On q35 the 32bit pci window is simply hardcoded (same in qemu and seabios), but we can surely change that and make q35 look at the cmos too. Or memory controller config if we'll go emulate that some day.

64bit bars are placed above the installed high memory.

coreboot ressource management doesn't know "pci holes" btw. Instead you'll go register the fixed ressources (io ports, memory) and coreboot goes find non-conflicting places for your pci bars. On q35 machines with 1G of memory it figures there is room between 0x40000000 (end of ram) and 0xb0000000 (start of mmconf xbar) and goes place the pci bars there. Then the linux kernel figures this is outside the window declared via _CRS and goes remap the bars so they are above 0xc0000000.

...

Also, I don't understand the underlying reason for this patch. If a user asks to start QEMU with a PCI device with a very large BAR, then winxp wont work. But, if a user put a real PCI device with a very large BAR in a real computer then they'd get the same failure. Why is this a qemu/seabios issue - shouldn't the user simply not do this?

Well, qemu/seabios clearly needs some configuration knob for this. On physical hardware there probably would be some bios setup option to enable/disable the 64bit window and to configure the size. We'll need some aequivalent in fw_cfg.

I don't think pci-info is the solution though. For the 32bit window we already have the interfaces needed (lookup low memory in cmos), there is no need for another one. For the 64bit window we should IMO just pass the size wanted to the firmware.

cheers, Gerd

Gerd Hoffmann

9:57 a.m.

Hi,

...

...
Huh? The 32bit window is sized according to the installed memory. That logic is in seabios and you'll try to move it to qemu, using pci-info. It wasn't in qemu before ...

The logic is in hw/i386/pc_piix.c and always was.

What exactly you are refering to?

...

...
It is the job of the firmware to initialize the hardware. IMO it should work that way on qemu too to mimic real hardware.

That's exactly what we have. However QEMU is the one that knows how hardware is configured (e.g. where's the PCI hole), so we report that to bios.

Memory configuration is in the cmos, firmware can figure where it can place pci devices from that. There is no need for a new interface.

...

...
The firmware can initialize the hardware as it likes. For the most part the OS can simply read the configuration from the pci config space.

Only for some hardware (Q35).

--verbose please.

...

...
(3) mmconf xbar start (MCFG, q35 only, at 0xb0000000 now). (4) pmbase (FADT, at 0xb000 now).

Especially 3+4 tend to be compile-time constants in the firmware as they are needed very early in the setup process.

So we don't need them in pci-config, just stick constant in ACPI.

I don't want them be constant. I want allow the firmware pick them. Our mmconfig xbar is 256M and can handle 256 busses. I'd like to have the option to reduce that to 64M and place it somewhere else.

Also coreboot and seabios use different values for pmbase. coreboot on q35 maps the pmbase below 0x1000. Which surely makes sense. When we don't place chipset stuff at 0xb000 we can assign the 0xb000->0xbfff window to a pci bridge instead.

...

Another issue is where to start/end the 32 bit window for PIIX. That's really a PV thing for QEMU, hardcoded at the moment in bios and QEMU.

Hmm? --verbose please. What issue to we have with the 32bit window?

cheers, Gerd

Michael S. Tsirkin

10:22 a.m.

On Thu, Aug 08, 2013 at 09:57:39AM +0200, Gerd Hoffmann wrote:

...

Hi,

...
...
Huh? The 32bit window is sized according to the installed memory. That logic is in seabios and you'll try to move it to qemu, using pci-info. It wasn't in qemu before ...

The logic is in hw/i386/pc_piix.c and always was.

What exactly you are refering to?

pc_init1 which picks addresses and passes them on to i440fx_init.

...

...
...
It is the job of the firmware to initialize the hardware. IMO it should work that way on qemu too to mimic real hardware.

That's exactly what we have. However QEMU is the one that knows how hardware is configured (e.g. where's the PCI hole), so we report that to bios.

Memory configuration is in the cmos, firmware can figure where it can place pci devices from that. There is no need for a new interface.

The assumption being that whatever is not memory is PCI? I'm not sure that's right.

...

...
...
The firmware can initialize the hardware as it likes. For the most part the OS can simply read the configuration from the pci config space.

Only for some hardware (Q35).

--verbose please.

i440fx_init gets pci_hole_start and pci_hole_end. This is in hardware (QEMU) and not configurable by firmware.

...

...
...
(3) mmconf xbar start (MCFG, q35 only, at 0xb0000000 now). (4) pmbase (FADT, at 0xb000 now).

Especially 3+4 tend to be compile-time constants in the firmware as they are needed very early in the setup process.

So we don't need them in pci-config, just stick constant in ACPI.

I don't want them be constant. I want allow the firmware pick them. Our mmconfig xbar is 256M and can handle 256 busses. I'd like to have the option to reduce that to 64M and place it somewhere else.

Also coreboot and seabios use different values for pmbase. coreboot on q35 maps the pmbase below 0x1000. Which surely makes sense. When we don't place chipset stuff at 0xb000 we can assign the 0xb000->0xbfff window to a pci bridge instead.

...
Another issue is where to start/end the 32 bit window for PIIX. That's really a PV thing for QEMU, hardcoded at the moment in bios and QEMU.

Hmm? --verbose please. What issue to we have with the 32bit window?

cheers, Gerd

I refer to PCI hole starting at 0xe0000000.

-- MST

Gerd Hoffmann

10:51 a.m.

On 08/08/13 10:22, Michael S. Tsirkin wrote:

...

On Thu, Aug 08, 2013 at 09:57:39AM +0200, Gerd Hoffmann wrote:

...
Hi,

...
...
Huh? The 32bit window is sized according to the installed memory. That logic is in seabios and you'll try to move it to qemu, using pci-info. It wasn't in qemu before ...

The logic is in hw/i386/pc_piix.c and always was.

What exactly you are refering to?

pc_init1 which picks addresses and passes them on to i440fx_init.

Yep. qemu figured where it wants map memory. The unused 32bit address space goes into the pci hole. cmos memory size is set accordingly. seabios gets the memory size from cmos, then it knows where the pci hole starts. seabios rounds it up (i.e. may leave some of it unused) to be able to cover the complete hole with a single mtrr entry, but that isn't a issue and can be changed if needed. The mtrr thing is more or less cosmetical anyway in a virtual machine.

...

...
Memory configuration is in the cmos, firmware can figure where it can place pci devices from that. There is no need for a new interface.

The assumption being that whatever is not memory is PCI? I'm not sure that's right.

Maybe not in general, but I'm pretty sure for the x86 chipsets we are emulating it is.

...

...
...
...
The firmware can initialize the hardware as it likes. For the most part the OS can simply read the configuration from the pci config space.

Only for some hardware (Q35).

--verbose please.

i440fx_init gets pci_hole_start and pci_hole_end. This is in hardware (QEMU) and not configurable by firmware.

pci_hole_start == end of low memory (available via cmos). pci_hole_end == 4G.

...

...
...
Another issue is where to start/end the 32 bit window for PIIX. That's really a PV thing for QEMU, hardcoded at the moment in bios and QEMU.

Hmm? --verbose please. What issue to we have with the 32bit window?

cheers, Gerd

I refer to PCI hole starting at 0xe0000000.

See above. It's not hardcoded in seabios. The upper limit for low memory is hardcoded in qemu, i.e. large guests with 4G or more memory get a hole starting at 0xe0000000. If you change that in qemu to be -- say -- 0xc0000000 instead seabios will handle it just fine.

You can see that today, just start a guest with 2G, 3G & 4G of memory, then watch the pci hole size adapting in /proc/iomem.

cheers, Gerd

Michael S. Tsirkin

11:37 a.m.

On Thu, Aug 08, 2013 at 10:51:59AM +0200, Gerd Hoffmann wrote:

...

On 08/08/13 10:22, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 09:57:39AM +0200, Gerd Hoffmann wrote:

...
Hi,

...
...
Huh? The 32bit window is sized according to the installed memory. That logic is in seabios and you'll try to move it to qemu, using pci-info. It wasn't in qemu before ...

The logic is in hw/i386/pc_piix.c and always was.

What exactly you are refering to?

pc_init1 which picks addresses and passes them on to i440fx_init.

Yep. qemu figured where it wants map memory. The unused 32bit address space goes into the pci hole. cmos memory size is set accordingly. seabios gets the memory size from cmos, then it knows where the pci hole starts. seabios rounds it up (i.e. may leave some of it unused) to be able to cover the complete hole with a single mtrr entry, but that isn't a issue and can be changed if needed. The mtrr thing is more or less cosmetical anyway in a virtual machine.

...
...
Memory configuration is in the cmos, firmware can figure where it can place pci devices from that. There is no need for a new interface.

The assumption being that whatever is not memory is PCI? I'm not sure that's right.

Maybe not in general, but I'm pretty sure for the x86 chipsets we are emulating it is.

I think this is the basic question.

Speaking about PIIX: http://download.intel.com/design/chipsets/datashts/29054901.pdf it only supported 1G RAM and 32 bit PCI.

What happened with RAM below 1G is this: top of RAM to 0xfec00000 is PCI - this is emulated correctly fec10000 to ffe00000 is PCI - this is not emulated correctly

What happens with RAM >1G is all PV, it doesn't exist on real hardware.

Re-adding qemu-devel. Can you please keep it Cc'd?

-- MST

Michael S. Tsirkin

10:37 a.m.

On Thu, Aug 08, 2013 at 09:57:39AM +0200, Gerd Hoffmann wrote:

...

...
...
(3) mmconf xbar start (MCFG, q35 only, at 0xb0000000 now). (4) pmbase (FADT, at 0xb000 now).

Especially 3+4 tend to be compile-time constants in the firmware as they are needed very early in the setup process.

So we don't need them in pci-config, just stick constant in ACPI.

I don't want them be constant. I want allow the firmware pick them. Our mmconfig xbar is 256M and can handle 256 busses. I'd like to have the option to reduce that to 64M and place it somewhere else.

Also coreboot and seabios use different values for pmbase. coreboot on q35 maps the pmbase below 0x1000. Which surely makes sense. When we don't place chipset stuff at 0xb000 we can assign the 0xb000->0xbfff window to a pci bridge instead.

Yes, this might be useful. But I don't think it's required to use linker to patch ACPI tables for this - we can write ASL code to read the register back from device configuration, instead.

Also, keep qemu-devel Cc'd on these discussions please.

-- MST

Gerd Hoffmann

10:57 a.m.

On 08/08/13 10:37, Michael S. Tsirkin wrote:

...

On Thu, Aug 08, 2013 at 09:57:39AM +0200, Gerd Hoffmann wrote:

...
...
...
(3) mmconf xbar start (MCFG, q35 only, at 0xb0000000 now). (4) pmbase (FADT, at 0xb000 now).

Especially 3+4 tend to be compile-time constants in the firmware as they are needed very early in the setup process.

So we don't need them in pci-config, just stick constant in ACPI.

I don't want them be constant. I want allow the firmware pick them. Our mmconfig xbar is 256M and can handle 256 busses. I'd like to have the option to reduce that to 64M and place it somewhere else.

Also coreboot and seabios use different values for pmbase. coreboot on q35 maps the pmbase below 0x1000. Which surely makes sense. When we don't place chipset stuff at 0xb000 we can assign the 0xb000->0xbfff window to a pci bridge instead.

Yes, this might be useful. But I don't think it's required to use linker to patch ACPI tables for this - we can write ASL code to read the register back from device configuration, instead.

No, we can't, because the address is in the FADT.

cheers, Gerd

Michael S. Tsirkin

11:52 a.m.

On Thu, Aug 08, 2013 at 10:57:44AM +0200, Gerd Hoffmann wrote:

...

On 08/08/13 10:37, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 09:57:39AM +0200, Gerd Hoffmann wrote:

...
...
...
(3) mmconf xbar start (MCFG, q35 only, at 0xb0000000 now). (4) pmbase (FADT, at 0xb000 now).

Especially 3+4 tend to be compile-time constants in the firmware as they are needed very early in the setup process.

So we don't need them in pci-config, just stick constant in ACPI.

I don't want them be constant. I want allow the firmware pick them. Our mmconfig xbar is 256M and can handle 256 busses. I'd like to have the option to reduce that to 64M and place it somewhere else.

Also coreboot and seabios use different values for pmbase. coreboot on q35 maps the pmbase below 0x1000. Which surely makes sense. When we don't place chipset stuff at 0xb000 we can assign the 0xb000->0xbfff window to a pci bridge instead.

Yes, this might be useful. But I don't think it's required to use linker to patch ACPI tables for this - we can write ASL code to read the register back from device configuration, instead.

No, we can't, because the address is in the FADT.

cheers, Gerd

I see. Yes, PM base is there, this in fact makes it possible to patch it by linker in a sane way. But to make addresses usable to devices they also need to be declared in the _CRS for the PCI root, correct? Which is code in DSDT.

-- MST

Gerd Hoffmann

12:21 p.m.

On 08/08/13 11:52, Michael S. Tsirkin wrote:

...

On Thu, Aug 08, 2013 at 10:57:44AM +0200, Gerd Hoffmann wrote:

...
On 08/08/13 10:37, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 09:57:39AM +0200, Gerd Hoffmann wrote:

...
Also coreboot and seabios use different values for pmbase. coreboot on q35 maps the pmbase below 0x1000. Which surely makes sense. When we don't place chipset stuff at 0xb000 we can assign the 0xb000->0xbfff window to a pci bridge instead.

Yes, this might be useful. But I don't think it's required to use linker to patch ACPI tables for this - we can write ASL code to read the register back from device configuration, instead.

No, we can't, because the address is in the FADT.

cheers, Gerd

I see. Yes, PM base is there, this in fact makes it possible to patch it by linker in a sane way.

Exactly. Likewise the mmconf xbar config in the MCFG table.

...

But to make addresses usable to devices they also need to be declared in the _CRS for the PCI root, correct? Which is code in DSDT.

Yes, the address ranges used for pci devices (aka 32bit + 64bit pci window) need to be there. Well, placing in SSDT, then referencing from DSDT works too, and this is what seabios does today to dynamically adjust stuff. Fixing up the SSDT using the linker is probably easier as we generate it anyway.

cheers, Gerd

Michael S. Tsirkin

4:13 p.m.

On Thu, Aug 08, 2013 at 12:21:32PM +0200, Gerd Hoffmann wrote:

...

On 08/08/13 11:52, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 10:57:44AM +0200, Gerd Hoffmann wrote:

...
On 08/08/13 10:37, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 09:57:39AM +0200, Gerd Hoffmann wrote:

...
Also coreboot and seabios use different values for pmbase. coreboot on q35 maps the pmbase below 0x1000. Which surely makes sense. When we don't place chipset stuff at 0xb000 we can assign the 0xb000->0xbfff window to a pci bridge instead.

Re-reading this - if this has value, can't we generalize it and make all firmware behave the same, getting values from QEMU?

...

...
...
...
Yes, this might be useful. But I don't think it's required to use linker to patch ACPI tables for this - we can write ASL code to read the register back from device configuration, instead.

No, we can't, because the address is in the FADT.

cheers, Gerd

I see. Yes, PM base is there, this in fact makes it possible to patch it by linker in a sane way.

Exactly. Likewise the mmconf xbar config in the MCFG table.

...
But to make addresses usable to devices they also need to be declared in the _CRS for the PCI root, correct? Which is code in DSDT.

Yes, the address ranges used for pci devices (aka 32bit + 64bit pci window) need to be there. Well, placing in SSDT, then referencing from DSDT works too, and this is what seabios does today to dynamically adjust stuff. Fixing up the SSDT using the linker is probably easier as we generate it anyway.

cheers, Gerd

Yes but as I said, this makes things messy, since AML encoding for numbers isn't fixed width.

-- MST

Gerd Hoffmann

4:56 p.m.

On 08/08/13 16:13, Michael S. Tsirkin wrote:

...

On Thu, Aug 08, 2013 at 12:21:32PM +0200, Gerd Hoffmann wrote:

...
On 08/08/13 11:52, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 10:57:44AM +0200, Gerd Hoffmann wrote:

...
On 08/08/13 10:37, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 09:57:39AM +0200, Gerd Hoffmann wrote:

...
Also coreboot and seabios use different values for pmbase. coreboot on q35 maps the pmbase below 0x1000. Which surely makes sense. When we don't place chipset stuff at 0xb000 we can assign the 0xb000->0xbfff window to a pci bridge instead.

Re-reading this - if this has value, can't we generalize it and make all firmware behave the same, getting values from QEMU?

pmbase is a compile-time constant (aka #define) in both seabios and coreboot, and making this runtime-configurable is non-trivial. See src/smm.c in seabios for one reason why.

Moving it to another fixed address below 0x1000 doesn't work, breaks with old qemu+new seabios and other way around.

I think easiest is to allow firmware pick it and fixup the tables provided by qemu accordingly.

Picking pmbase works in firmware today btw as both coreboot+seabios generate a fadt with the correct values. Only nit is that this depends on qemu 1.4+ as piix4 pmbase register was read-only in older qemu versions. So everybody uses 0xb000 by default for bug compatibility with older qemu versions.

...

...
Yes, the address ranges used for pci devices (aka 32bit + 64bit pci window) need to be there. Well, placing in SSDT, then referencing from DSDT works too, and this is what seabios does today to dynamically adjust stuff. Fixing up the SSDT using the linker is probably easier as we generate it anyway.

cheers, Gerd

Yes but as I said, this makes things messy, since AML encoding for numbers isn't fixed width.

In seabios we have fixed 32bit / 64bit width today, from acpi.c:

// store pci io windows *(u32*)&ssdt_ptr[acpi_pci32_start[0]] = cpu_to_le32(pcimem_start); *(u32*)&ssdt_ptr[acpi_pci32_end[0]] = cpu_to_le32(pcimem_end - 1); if (pcimem64_start) { ssdt_ptr[acpi_pci64_valid[0]] = 1; *(u64*)&ssdt_ptr[acpi_pci64_start[0]] = cpu_to_le64(pcimem64_start); *(u64*)&ssdt_ptr[acpi_pci64_end[0]] = cpu_to_le64(pcimem64_end - 1); *(u64*)&ssdt_ptr[acpi_pci64_length[0]] = cpu_to_le64( pcimem64_end - pcimem64_start); } else { ssdt_ptr[acpi_pci64_valid[0]] = 0; }

Storing fixup instructions for these fields in the linker script shouldn't be hard I think.

cheers, Gerd

Kevin O'Connor

9 Aug 9 Aug

6:13 a.m.

On Thu, Aug 08, 2013 at 04:56:55PM +0200, Gerd Hoffmann wrote:

...

On 08/08/13 16:13, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 12:21:32PM +0200, Gerd Hoffmann wrote:

...
On 08/08/13 11:52, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 10:57:44AM +0200, Gerd Hoffmann wrote:

...
On 08/08/13 10:37, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 09:57:39AM +0200, Gerd Hoffmann wrote: > Also coreboot and seabios use different values for pmbase. coreboot on > q35 maps the pmbase below 0x1000. Which surely makes sense. When we > don't place chipset stuff at 0xb000 we can assign the 0xb000->0xbfff > window to a pci bridge instead.

Re-reading this - if this has value, can't we generalize it and make all firmware behave the same, getting values from QEMU?

pmbase is a compile-time constant (aka #define) in both seabios and coreboot, and making this runtime-configurable is non-trivial. See src/smm.c in seabios for one reason why.

I don't think SeaBIOS should modify tables provided by QEMU. That leads to confusion on the source of the data and mixed responsibilities which results in greater complexity in both QEMU and SeaBIOS.

SeaBIOS doesn't have any info that QEMU doesn't have. So, I think it's safe for QEMU to be the sole authority for the table content.

Converting src/smm.c to use a runtime value isn't hard - just change the assembler from: "mov $" __stringify(PORT_ACPI_PM_BASE) " + 0x04, %dx\n" to: "mov 4(my_acpi_base), %dx\n" and make sure to define the global variable my_acpi_base as VARFSEG.

...

...
...
Yes, the address ranges used for pci devices (aka 32bit + 64bit pci window) need to be there. Well, placing in SSDT, then referencing from DSDT works too, and this is what seabios does today to dynamically adjust stuff. Fixing up the SSDT using the linker is probably easier as we generate it anyway.

Yes but as I said, this makes things messy, since AML encoding for numbers isn't fixed width.

In seabios we have fixed 32bit / 64bit width today, from acpi.c:
// store pci io windows
*(u32*)&ssdt_ptr[acpi_pci32_start[0]] = cpu_to_le32(pcimem_start);
*(u32*)&ssdt_ptr[acpi_pci32_end[0]] = cpu_to_le32(pcimem_end - 1);
if (pcimem64_start) {
    ssdt_ptr[acpi_pci64_valid[0]] = 1;
    *(u64*)&ssdt_ptr[acpi_pci64_start[0]] = cpu_to_le64(pcimem64_start);
    *(u64*)&ssdt_ptr[acpi_pci64_end[0]] = cpu_to_le64(pcimem64_end - 1);
    *(u64*)&ssdt_ptr[acpi_pci64_length[0]] = cpu_to_le64(
        pcimem64_end - pcimem64_start);
} else {
    ssdt_ptr[acpi_pci64_valid[0]] = 0;
}
Storing fixup instructions for these fields in the linker script shouldn't be hard I think.

I don't think SeaBIOS should continue to do the above once the tables are moved to QEMU. QEMU has all the info SeaBIOS has, so it can generate the tables correctly on its own.

In all practical situations, the PCI window should be at least an order of magnatude greater than the sum of the PCI bars in the system. If the bars are actually bigger than the window, then things are going to fail - the best the firmware can do is try to fail gracefully. I don't think it's worth the complexity to design mixed ownership and advanced interfaces just so we can fail slightly better.

If this is a real worry, QEMU can sum all the PCI bars and warn the user if they don't fit.

-Kevin

Gerd Hoffmann

8:25 a.m.

Hi,

...

...
pmbase is a compile-time constant (aka #define) in both seabios and coreboot, and making this runtime-configurable is non-trivial. See src/smm.c in seabios for one reason why.

...

Converting src/smm.c to use a runtime value isn't hard - just change the assembler from: "mov $" __stringify(PORT_ACPI_PM_BASE) " + 0x04, %dx\n" to: "mov 4(my_acpi_base), %dx\n" and make sure to define the global variable my_acpi_base as VARFSEG.

Ah, good, I give that a try. Need to check how that works out for coreboot though.

That leaves the mmconf xbar location. We can continue to have everybody agree this should be mapped @ 0xb0000000 and be done with it. Making this configurable via fw_cfg is no problem for seabios. coreboot can't deal with it though, it sets up the xbar _very_ early because it does the complete pci setup via mmconf.

...

...
In seabios we have fixed 32bit / 64bit width today, from acpi.c:
// store pci io windows
*(u32*)&ssdt_ptr[acpi_pci32_start[0]] = cpu_to_le32(pcimem_start);
*(u32*)&ssdt_ptr[acpi_pci32_end[0]] = cpu_to_le32(pcimem_end - 1);
if (pcimem64_start) {
    ssdt_ptr[acpi_pci64_valid[0]] = 1;
    *(u64*)&ssdt_ptr[acpi_pci64_start[0]] = cpu_to_le64(pcimem64_start);
    *(u64*)&ssdt_ptr[acpi_pci64_end[0]] = cpu_to_le64(pcimem64_end - 1);
    *(u64*)&ssdt_ptr[acpi_pci64_length[0]] = cpu_to_le64(
        pcimem64_end - pcimem64_start);
} else {
    ssdt_ptr[acpi_pci64_valid[0]] = 0;
}
Storing fixup instructions for these fields in the linker script shouldn't be hard I think.
I don't think SeaBIOS should continue to do the above once the tables are moved to QEMU. QEMU has all the info SeaBIOS has, so it can generate the tables correctly on its own.

The loader script provided by qemu has fixup instructions, which is needed to fixup pointers to other acpi tables. The idea is to use that mechanism to also allow th firmware to fixup addresses like pmbase in the qemu-generated tables.

cheers, Gerd

Kevin O'Connor

10 Aug 10 Aug

5:10 a.m.

On Fri, Aug 09, 2013 at 08:25:00AM +0200, Gerd Hoffmann wrote:

...

...
I don't think SeaBIOS should continue to do the above once the tables are moved to QEMU. QEMU has all the info SeaBIOS has, so it can generate the tables correctly on its own.

The loader script provided by qemu has fixup instructions, which is needed to fixup pointers to other acpi tables. The idea is to use that mechanism to also allow th firmware to fixup addresses like pmbase in the qemu-generated tables.

Yes, but why should QEMU tell SeaBIOS to modify the table for pmbase when it can just modify the table itself?

-Kevin

Gerd Hoffmann

12 Aug 12 Aug

8:05 a.m.

On 08/10/13 05:10, Kevin O'Connor wrote:

...

On Fri, Aug 09, 2013 at 08:25:00AM +0200, Gerd Hoffmann wrote:

...
...
I don't think SeaBIOS should continue to do the above once the tables are moved to QEMU. QEMU has all the info SeaBIOS has, so it can generate the tables correctly on its own.

The loader script provided by qemu has fixup instructions, which is needed to fixup pointers to other acpi tables. The idea is to use that mechanism to also allow th firmware to fixup addresses like pmbase in the qemu-generated tables.

Yes, but why should QEMU tell SeaBIOS to modify the table for pmbase when it can just modify the table itself?

We'll need some way to make sure the pmbase (also mmconf xbar) set by the firmware matches the pmbase address filled into the acpi tables by qemu ...

So the options we have are:

(1) Hardcode the address everywhere. This is pretty close to the current state, 0xb000 is hard-coded pretty much everywhere, basically because older qemu versions had the pmbase register readonly with 0xb000. I'd like to move the pmbase somewhere else long-term, to free the 0xb000-0xbfff window, so I'd like to avoid that.

(2) Have qemu pick pmbase/xbar addr. Doesn't work due to initialization order issues (especially xbar for coreboot).

(3) Have firmware pick pmbase/xbar, have fixup instructions for the addresses in in the loader script, simliar to the fixup instructions for table-to-table pointers.

(4) [ new idea by mst ] Have firmware pick pmbase/xbar, then have qemu look at the hardware registers programmed by the firmware, use pmbase/xbar addresses found there there when generating the tables.

cheers, Gerd

Kevin O'Connor

13 Aug 13 Aug

12:42 a.m.

On Mon, Aug 12, 2013 at 08:05:08AM +0200, Gerd Hoffmann wrote:

...

We'll need some way to make sure the pmbase (also mmconf xbar) set by the firmware matches the pmbase address filled into the acpi tables by qemu ...

So the options we have are:

(1) Hardcode the address everywhere. This is pretty close to the current state, 0xb000 is hard-coded pretty much everywhere, basically because older qemu versions had the pmbase register readonly with 0xb000. I'd like to move the pmbase somewhere else long-term, to free the 0xb000-0xbfff window, so I'd like to avoid that.

(2) Have qemu pick pmbase/xbar addr. Doesn't work due to initialization order issues (especially xbar for coreboot).

(3) Have firmware pick pmbase/xbar, have fixup instructions for the addresses in in the loader script, simliar to the fixup instructions for table-to-table pointers.

(4) [ new idea by mst ] Have firmware pick pmbase/xbar, then have qemu look at the hardware registers programmed by the firmware, use pmbase/xbar addresses found there there when generating the tables.

I don't much like option 3 or 4.

Although hardcoding (option 1) is ugly, I think that ugliness does not justify the complexity of run-time patching (3/4).

As for option 2 - I don't see why coreboot couldn't read the values out of fw_cfg early on for the handful of cases like this.

-Kevin

Gerd Hoffmann

8:49 a.m.

On Mo, 2013-08-12 at 18:42 -0400, Kevin O'Connor wrote:

...

On Mon, Aug 12, 2013 at 08:05:08AM +0200, Gerd Hoffmann wrote:

...
We'll need some way to make sure the pmbase (also mmconf xbar) set by the firmware matches the pmbase address filled into the acpi tables by qemu ...

So the options we have are:

(1) Hardcode the address everywhere. This is pretty close to the current state, 0xb000 is hard-coded pretty much everywhere, basically because older qemu versions had the pmbase register readonly with 0xb000. I'd like to move the pmbase somewhere else long-term, to free the 0xb000-0xbfff window, so I'd like to avoid that.

(2) Have qemu pick pmbase/xbar addr. Doesn't work due to initialization order issues (especially xbar for coreboot).

(3) Have firmware pick pmbase/xbar, have fixup instructions for the addresses in in the loader script, simliar to the fixup instructions for table-to-table pointers.

(4) [ new idea by mst ] Have firmware pick pmbase/xbar, then have qemu look at the hardware registers programmed by the firmware, use pmbase/xbar addresses found there there when generating the tables.

I don't much like option 3 or 4.

Although hardcoding (option 1) is ugly, I think that ugliness does not justify the complexity of run-time patching (3/4).

Maybe this wasn't clear, but in (4) the table is generated by *qemu* with the values programmed by the firmware.

...

As for option 2 - I don't see why coreboot couldn't read the values out of fw_cfg early on for the handful of cases like this.

Because both mmconf xbar and pmbase are special:

The mmconf xbar is setup as one of the first things coreboot does, even before romstage, then coreboot does the complete pci initialization using mmconf.

pmbase handling depends on southbridge/mainboard code, but it tends to be setup early (in romstage) too. On some boards ram detection needs to fiddle with pmbase registers.

Setting pmbase at runtime looks doable though, even though we might have to go for a temporary location for the pmbase in romstage, then move it to the final place requested by qemu later. But it needs some qemu-specific tweaks in shared southbridge code though as picking pmbase at runtime isn't something which happens on real hardware. I'd like to avoid that if possible.

cheers, Gerd

Kevin O'Connor

14 Aug 14 Aug

2:38 p.m.

On Tue, Aug 13, 2013 at 08:49:21AM +0200, Gerd Hoffmann wrote:

...

On Mo, 2013-08-12 at 18:42 -0400, Kevin O'Connor wrote:

...
On Mon, Aug 12, 2013 at 08:05:08AM +0200, Gerd Hoffmann wrote:

...
We'll need some way to make sure the pmbase (also mmconf xbar) set by the firmware matches the pmbase address filled into the acpi tables by qemu ...

So the options we have are:

(1) Hardcode the address everywhere. This is pretty close to the current state, 0xb000 is hard-coded pretty much everywhere, basically because older qemu versions had the pmbase register readonly with 0xb000. I'd like to move the pmbase somewhere else long-term, to free the 0xb000-0xbfff window, so I'd like to avoid that.

(2) Have qemu pick pmbase/xbar addr. Doesn't work due to initialization order issues (especially xbar for coreboot).

(3) Have firmware pick pmbase/xbar, have fixup instructions for the addresses in in the loader script, simliar to the fixup instructions for table-to-table pointers.

(4) [ new idea by mst ] Have firmware pick pmbase/xbar, then have qemu look at the hardware registers programmed by the firmware, use pmbase/xbar addresses found there there when generating the tables.

I don't much like option 3 or 4.

Although hardcoding (option 1) is ugly, I think that ugliness does not justify the complexity of run-time patching (3/4).

Maybe this wasn't clear, but in (4) the table is generated by *qemu* with the values programmed by the firmware.

Yes. I still don't much like it. I'd think it would be much simpler for qemu to generate the tables once at startup and not have to patch them at runtime. It also introduces an obscure dependency on the ordering of the firmware.

...

...
As for option 2 - I don't see why coreboot couldn't read the values out of fw_cfg early on for the handful of cases like this.

Because both mmconf xbar and pmbase are special:

The mmconf xbar is setup as one of the first things coreboot does, even before romstage, then coreboot does the complete pci initialization using mmconf.

It's not hard to read a value from fw_cfg early on (even in assembler):

outw(FWCFG_MY_EARLY_PORT, PORT_QEMU_CFG_CTL); u8 myval = inb(PORT_QEMU_CFG_DATA);

Also, if this needs to be determined before the ram controller is initialized, then I think it's fine to hard code the value (real machines will almost assuredly hardcode as well).

-Kevin

Gerd Hoffmann

4:52 p.m.

...

...
Maybe this wasn't clear, but in (4) the table is generated by *qemu* with the values programmed by the firmware.

Yes. I still don't much like it. I'd think it would be much simpler for qemu to generate the tables once at startup and not have to patch them at runtime. It also introduces an obscure dependency on the ordering of the firmware.

The ordering isn't a big problem I think as it fits the normal order how the firmware boots up: hardware initialization goes first, generating the tables for the guest comes later on.

...

...
The mmconf xbar is setup as one of the first things coreboot does, even before romstage, then coreboot does the complete pci initialization using mmconf.

It's not hard to read a value from fw_cfg early on (even in assembler):

outw(FWCFG_MY_EARLY_PORT, PORT_QEMU_CFG_CTL); u8 myval = inb(PORT_QEMU_CFG_DATA);

Well, only if the stuff isn't in a fw_cfg file, which happens to be the case for all new stuff we are adding. Also I'm not sure I can easily make the xbar location a runtime variable in coreboot.

...

Also, if this needs to be determined before the ram controller is initialized, then I think it's fine to hard code the value (real machines will almost assuredly hardcode as well).

Yea, real machines hardcode that. As mentioned for coreboot this is a compile-time constant (aka #define), which makes sense of course. With firmware generating the acpi tables this works fine, it just uses the #defined value when writing the tables.

With qemu generating the acpi tables we can hardcode it to the same value in both firmware and qemu. Which pretty much implies we can never ever change it in the future without breaking stuff. Thats why I'd like avoid that in the first place.

Making qemu lookup the values in the hardware registers, after the firmware programmed them, has the big advantage that we don't need any new paravirtual interfaces to have firmware and qemu agree on the values ...

cheers, Gerd

Gerd Hoffmann

9 Aug 9 Aug

11:45 a.m.

Hi,

...

Converting src/smm.c to use a runtime value isn't hard - just change the assembler from: "mov $" __stringify(PORT_ACPI_PM_BASE) " + 0x04, %dx\n" to: "mov 4(my_acpi_base), %dx\n" and make sure to define the global variable my_acpi_base as VARFSEG.

The apm fix brought a ctl register variable we can use directly, so I tried the attached patch, then got this:

Linking out/rom.o out/code32flat.o: In function `smm_relocation_end': (.text.asm./home/kraxel/projects/seabios/src/smm.c.72+0x37): relocation truncated to fit: R_386_16 against symbol `acpi_pm1a_cnt' defined in .data.varfseg./home/kraxel/projects/seabios/src/acpi.c.21 section in out/code32flat.o out/code32flat.o: In function `smm_relocation_end': (.text.asm./home/kraxel/projects/seabios/src/smm.c.72+0x46): relocation truncated to fit: R_386_16 against symbol `acpi_pm1a_cnt' defined in .data.varfseg./home/kraxel/projects/seabios/src/acpi.c.21 section in out/code32flat.o make: *** [out/rom.o] Error 1

cheers, Gerd

Kevin O'Connor

10 Aug 10 Aug

5:30 a.m.

On Fri, Aug 09, 2013 at 11:45:59AM +0200, Gerd Hoffmann wrote:

...

Hi,

...
Converting src/smm.c to use a runtime value isn't hard - just change the assembler from: "mov $" __stringify(PORT_ACPI_PM_BASE) " + 0x04, %dx\n" to: "mov 4(my_acpi_base), %dx\n" and make sure to define the global variable my_acpi_base as VARFSEG.

The apm fix brought a ctl register variable we can use directly, so I tried the attached patch, then got this:

Linking out/rom.o out/code32flat.o: In function `smm_relocation_end': (.text.asm./home/kraxel/projects/seabios/src/smm.c.72+0x37): relocation truncated to fit: R_386_16 against symbol `acpi_pm1a_cnt' defined in .data.varfseg./home/kraxel/projects/seabios/src/acpi.c.21 section in out/code32flat.o out/code32flat.o: In function `smm_relocation_end': (.text.asm./home/kraxel/projects/seabios/src/smm.c.72+0x46): relocation truncated to fit: R_386_16 against symbol `acpi_pm1a_cnt' defined in .data.varfseg./home/kraxel/projects/seabios/src/acpi.c.21 section in out/code32flat.o make: *** [out/rom.o] Error 1

Use "addr32 movw (acpi_pm1a_cnt), %dx" instead of "mov (acpi_pm1a_cnt), %dx".

That said, having the smi handler read from the f-segment probably isn't the best thing to do (though it should work). Better would be to make a variable between smm_code_start/end and copy the value in as part of the smm init, though admittedly that is a bit more complicated.

-Kevin

Kevin O'Connor

5:50 p.m.

On Fri, Aug 09, 2013 at 11:30:14PM -0400, Kevin O'Connor wrote:

...

On Fri, Aug 09, 2013 at 11:45:59AM +0200, Gerd Hoffmann wrote:

...
Hi,

...
Converting src/smm.c to use a runtime value isn't hard - just change the assembler from: "mov $" __stringify(PORT_ACPI_PM_BASE) " + 0x04, %dx\n" to: "mov 4(my_acpi_base), %dx\n" and make sure to define the global variable my_acpi_base as VARFSEG.

The apm fix brought a ctl register variable we can use directly, so I tried the attached patch, then got this:

Linking out/rom.o out/code32flat.o: In function `smm_relocation_end': (.text.asm./home/kraxel/projects/seabios/src/smm.c.72+0x37): relocation truncated to fit: R_386_16 against symbol `acpi_pm1a_cnt' defined in .data.varfseg./home/kraxel/projects/seabios/src/acpi.c.21 section in out/code32flat.o out/code32flat.o: In function `smm_relocation_end': (.text.asm./home/kraxel/projects/seabios/src/smm.c.72+0x46): relocation truncated to fit: R_386_16 against symbol `acpi_pm1a_cnt' defined in .data.varfseg./home/kraxel/projects/seabios/src/acpi.c.21 section in out/code32flat.o make: *** [out/rom.o] Error 1

Use "addr32 movw (acpi_pm1a_cnt), %dx" instead of "mov (acpi_pm1a_cnt), %dx".

I ran a quick test and the two attached patches seem to work okay.

-Kevin

Michael S. Tsirkin

9 Aug 9 Aug

5:49 p.m.

On Fri, Aug 09, 2013 at 12:13:06AM -0400, Kevin O'Connor wrote:

...

On Thu, Aug 08, 2013 at 04:56:55PM +0200, Gerd Hoffmann wrote:

...
On 08/08/13 16:13, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 12:21:32PM +0200, Gerd Hoffmann wrote:

...
On 08/08/13 11:52, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 10:57:44AM +0200, Gerd Hoffmann wrote:

...
On 08/08/13 10:37, Michael S. Tsirkin wrote: > On Thu, Aug 08, 2013 at 09:57:39AM +0200, Gerd Hoffmann wrote: >> Also coreboot and seabios use different values for pmbase. coreboot on >> q35 maps the pmbase below 0x1000. Which surely makes sense. When we >> don't place chipset stuff at 0xb000 we can assign the 0xb000->0xbfff >> window to a pci bridge instead.

Re-reading this - if this has value, can't we generalize it and make all firmware behave the same, getting values from QEMU?

pmbase is a compile-time constant (aka #define) in both seabios and coreboot, and making this runtime-configurable is non-trivial. See src/smm.c in seabios for one reason why.

I don't think SeaBIOS should modify tables provided by QEMU. That leads to confusion on the source of the data and mixed responsibilities which results in greater complexity in both QEMU and SeaBIOS.

SeaBIOS doesn't have any info that QEMU doesn't have. So, I think it's safe for QEMU to be the sole authority for the table content.

Converting src/smm.c to use a runtime value isn't hard - just change the assembler from: "mov $" __stringify(PORT_ACPI_PM_BASE) " + 0x04, %dx\n" to: "mov 4(my_acpi_base), %dx\n" and make sure to define the global variable my_acpi_base as VARFSEG.

...
...
...
Yes, the address ranges used for pci devices (aka 32bit + 64bit pci window) need to be there. Well, placing in SSDT, then referencing from DSDT works too, and this is what seabios does today to dynamically adjust stuff. Fixing up the SSDT using the linker is probably easier as we generate it anyway.

Yes but as I said, this makes things messy, since AML encoding for numbers isn't fixed width.

In seabios we have fixed 32bit / 64bit width today, from acpi.c:
// store pci io windows
*(u32*)&ssdt_ptr[acpi_pci32_start[0]] = cpu_to_le32(pcimem_start);
*(u32*)&ssdt_ptr[acpi_pci32_end[0]] = cpu_to_le32(pcimem_end - 1);
if (pcimem64_start) {
    ssdt_ptr[acpi_pci64_valid[0]] = 1;
    *(u64*)&ssdt_ptr[acpi_pci64_start[0]] = cpu_to_le64(pcimem64_start);
    *(u64*)&ssdt_ptr[acpi_pci64_end[0]] = cpu_to_le64(pcimem64_end - 1);
    *(u64*)&ssdt_ptr[acpi_pci64_length[0]] = cpu_to_le64(
        pcimem64_end - pcimem64_start);
} else {
    ssdt_ptr[acpi_pci64_valid[0]] = 0;
}
Storing fixup instructions for these fields in the linker script shouldn't be hard I think.
I don't think SeaBIOS should continue to do the above once the tables are moved to QEMU. QEMU has all the info SeaBIOS has, so it can generate the tables correctly on its own.

In all practical situations, the PCI window should be at least an order of magnatude greater than the sum of the PCI bars in the system. If the bars are actually bigger than the window, then things are going to fail - the best the firmware can do is try to fail gracefully. I don't think it's worth the complexity to design mixed ownership and advanced interfaces just so we can fail slightly better.

If this is a real worry, QEMU can sum all the PCI bars and warn the user if they don't fit.

-Kevin

If we make it a rule that PCI is`setup before ACPI tables are read, then QEMU can do the patching itself when it detects BIOS reading the tables.

Gerd, Laszlo,others, does this rule work for alternative firmwares?

Kevin O'Connor

10 Aug 10 Aug

5:06 a.m.

On Fri, Aug 09, 2013 at 06:49:18PM +0300, Michael S. Tsirkin wrote:

...

On Fri, Aug 09, 2013 at 12:13:06AM -0400, Kevin O'Connor wrote:

...
I don't think SeaBIOS should continue to do the above once the tables are moved to QEMU. QEMU has all the info SeaBIOS has, so it can generate the tables correctly on its own.

In all practical situations, the PCI window should be at least an order of magnatude greater than the sum of the PCI bars in the system. If the bars are actually bigger than the window, then things are going to fail - the best the firmware can do is try to fail gracefully. I don't think it's worth the complexity to design mixed ownership and advanced interfaces just so we can fail slightly better.

If this is a real worry, QEMU can sum all the PCI bars and warn the user if they don't fit.

If we make it a rule that PCI is`setup before ACPI tables are read, then QEMU can do the patching itself when it detects BIOS reading the tables.

QEMU can do the patching regardless of the order of initialization. The PCI bar sizes are static - they have no relationship to the timing of the BIOS PCI initialization.

-Kevin

Gerd Hoffmann

12 Aug 12 Aug

8:37 a.m.

Hi,

...

If we make it a rule that PCI is`setup before ACPI tables are read, then QEMU can do the patching itself when it detects BIOS reading the tables.

Approach makes sense to me. The ordering constrain shouldn't be a big burden, hardware detection+bringup (including pci setup) is the first thing done by the firmware, loading/generating acpi tables is one of the last things. And it avoids the need to communicate the addresses (or patch locations) between qemu+firmware.

What do you want to use this for? pmbase and xbar are simple, they are just a single register read. pci io windows needs a root bus scan, but should be doable too.

...

Gerd, Laszlo,others, does this rule work for alternative firmwares?

It surely works for coreboot, and I would be very surprised if this causes trouble for ovmf.

cheers, Gerd

Michael S. Tsirkin

9:56 a.m.

On Mon, Aug 12, 2013 at 08:37:00AM +0200, Gerd Hoffmann wrote:

...

Hi,

...
If we make it a rule that PCI is`setup before ACPI tables are read, then QEMU can do the patching itself when it detects BIOS reading the tables.

Approach makes sense to me. The ordering constrain shouldn't be a big burden, hardware detection+bringup (including pci setup) is the first thing done by the firmware, loading/generating acpi tables is one of the last things. And it avoids the need to communicate the addresses (or patch locations) between qemu+firmware.

What do you want to use this for? pmbase and xbar are simple, they are just a single register read. pci io windows needs a root bus scan, but should be doable too.

Right. We'll need to migrate the offsets for patching since they are tied to specific AML and this can change.

...

...
Gerd, Laszlo,others, does this rule work for alternative firmwares?

It surely works for coreboot, and I would be very surprised if this causes trouble for ovmf.

cheers, Gerd

Laszlo Ersek

3:08 p.m.

On 08/09/13 17:49, Michael S. Tsirkin wrote:

...

On Fri, Aug 09, 2013 at 12:13:06AM -0400, Kevin O'Connor wrote:

...
On Thu, Aug 08, 2013 at 04:56:55PM +0200, Gerd Hoffmann wrote:

...
On 08/08/13 16:13, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 12:21:32PM +0200, Gerd Hoffmann wrote:

...
On 08/08/13 11:52, Michael S. Tsirkin wrote:

...
On Thu, Aug 08, 2013 at 10:57:44AM +0200, Gerd Hoffmann wrote: > On 08/08/13 10:37, Michael S. Tsirkin wrote: >> On Thu, Aug 08, 2013 at 09:57:39AM +0200, Gerd Hoffmann wrote: >>> Also coreboot and seabios use different values for pmbase. coreboot on >>> q35 maps the pmbase below 0x1000. Which surely makes sense. When we >>> don't place chipset stuff at 0xb000 we can assign the 0xb000->0xbfff >>> window to a pci bridge instead.

Re-reading this - if this has value, can't we generalize it and make all firmware behave the same, getting values from QEMU?

pmbase is a compile-time constant (aka #define) in both seabios and coreboot, and making this runtime-configurable is non-trivial. See src/smm.c in seabios for one reason why.

I don't think SeaBIOS should modify tables provided by QEMU. That leads to confusion on the source of the data and mixed responsibilities which results in greater complexity in both QEMU and SeaBIOS.

SeaBIOS doesn't have any info that QEMU doesn't have. So, I think it's safe for QEMU to be the sole authority for the table content.

Converting src/smm.c to use a runtime value isn't hard - just change the assembler from: "mov $" __stringify(PORT_ACPI_PM_BASE) " + 0x04, %dx\n" to: "mov 4(my_acpi_base), %dx\n" and make sure to define the global variable my_acpi_base as VARFSEG.

...
...
...
Yes, the address ranges used for pci devices (aka 32bit + 64bit pci window) need to be there. Well, placing in SSDT, then referencing from DSDT works too, and this is what seabios does today to dynamically adjust stuff. Fixing up the SSDT using the linker is probably easier as we generate it anyway.

Yes but as I said, this makes things messy, since AML encoding for numbers isn't fixed width.

In seabios we have fixed 32bit / 64bit width today, from acpi.c:
// store pci io windows
*(u32*)&ssdt_ptr[acpi_pci32_start[0]] = cpu_to_le32(pcimem_start);
*(u32*)&ssdt_ptr[acpi_pci32_end[0]] = cpu_to_le32(pcimem_end - 1);
if (pcimem64_start) {
    ssdt_ptr[acpi_pci64_valid[0]] = 1;
    *(u64*)&ssdt_ptr[acpi_pci64_start[0]] = cpu_to_le64(pcimem64_start);
    *(u64*)&ssdt_ptr[acpi_pci64_end[0]] = cpu_to_le64(pcimem64_end - 1);
    *(u64*)&ssdt_ptr[acpi_pci64_length[0]] = cpu_to_le64(
        pcimem64_end - pcimem64_start);
} else {
    ssdt_ptr[acpi_pci64_valid[0]] = 0;
}
Storing fixup instructions for these fields in the linker script shouldn't be hard I think.
I don't think SeaBIOS should continue to do the above once the tables are moved to QEMU. QEMU has all the info SeaBIOS has, so it can generate the tables correctly on its own.

In all practical situations, the PCI window should be at least an order of magnatude greater than the sum of the PCI bars in the system. If the bars are actually bigger than the window, then things are going to fail - the best the firmware can do is try to fail gracefully. I don't think it's worth the complexity to design mixed ownership and advanced interfaces just so we can fail slightly better.

If this is a real worry, QEMU can sum all the PCI bars and warn the user if they don't fit.

-Kevin
If we make it a rule that PCI is`setup before ACPI tables are read, then QEMU can do the patching itself when it detects BIOS reading the tables.

Gerd, Laszlo,others, does this rule work for alternative firmwares?

OVMF currently determines the boundaries of the PCI window (32-bit) that it is going to advertise in the _CRS method by scanning the "map of the memory resources in the global coherency domain of the processor".

It finds the top of the system memory (under 4GB) and the smallest interval that covers all of the memory mapped IO ranges. The subset of the latter interval that is above the top of the system memory is then advertised.

See PopulateFwData() in "OvmfPkg/AcpiPlatformDxe/Qemu.c", and GetMemorySpaceMap() in "VOLUME 2: Platform Initialization Specification / Driver Execution Environment Core Interface".

... With this I'm trying to say that for now OVMF keys the 32-bit window it advertises off regions that are "currently being decoded by a component as memory-mapped I/O that can be used to access I/O devices in the platform".

This seems to imply that PCI setup precedes ACPI table installation.

Laszlo

Michael S. Tsirkin

8 Aug 8 Aug

8:04 a.m.

On Wed, Aug 07, 2013 at 09:29:39AM +0200, Gerd Hoffmann wrote:

...

We'll need some way for seabios to fill in the pci window information into the qemu-provided tables. Easiest way to do that would be to extend the COMMAND_ADD_POINTER bios linker script command.

This idea certainly has an advantage: the two patch-sets (to control PCI hole from QEMU, and to pass ACPI tables from QEMU) would become independent.

One difficulty would be coming up with a sane interface that's not tied to specific AML code: unlike table pointers which have a specific fixed-width format, we are talking about generic AML code here. Patching that works (we do it today with the ACPI_EXTRACT code) but requires that you code AML in a specific way, for example, number encoding is variable-length so we pad values ahead of the time making sure the AML encoding is large enough to hold the maximum possible value.

Also, in the past both Anthony and Kevin indicated preference to the pci-info solution that we have in QEMU today.

-- MST

4286

days inactive

4301

days old

seabios@seabios.org

51 comments

7 participants

tags (0)

participants (7)

Fred .
Gerd Hoffmann
Igor Mammedov
Kevin O'Connor
Laszlo Ersek
Michael S. Tsirkin
Paolo Bonzini