Do you have a dump of the failing DSDT table? There was a recent change around NVSA, but it looks like that is a different issue.

Raul Rangel <rrangel@chromium.org> wrote on Thu, Jan 21, 2021 at 4:24 AM:
Over the weekend I had the realization that SMI logging was enabled
and interfering with WinDbg. Once I flashed a non-serial firmware
WinDbg became a lot more stable and I was able to reliably attach to
the boot loader debugger i.e., `/bootdebug {default}`. The OS debugger
(`/debug {default} on`) was still not functioning though. I wasn't
sure if the BSOD was happening in the boot loader or the OS kernel, so
I stepped through the boot loader until I saw it jump to the OS. From
there WinDbg failed to restore the connection. The exception happens
before the OS is capable of writing a kernel dump. I was also
suspecting it was happening before the debugger was set up since the
connection could not be re-established.

I then saw Felix's reply:
> To decode the bug check values and their parameters, see https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/bug-check-0xa5--acpi-bios-error

> The third parameter you posted decodes to _UID (that one is 4 char ASCII stored as little endian number).

The parameters I gathered last week:

> 0x0000000000000000
> 0xFFFFD38AC66EC7F0
> 0x000000004449555F
> 0x0000000000000000
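
To make Felix's decoding concrete, here is a minimal sketch of how the third parameter maps to a name. The helper name `decode_acpi_name` is my own and not part of any tool; it just packs the low 32 bits as a little-endian integer and reads the bytes back as ASCII.

```python
import struct

def decode_acpi_name(value: int) -> str:
    """Read the low 32 bits of a bug check parameter as a 4-character ACPI name.

    Hypothetical helper for illustration only.
    """
    raw = struct.pack("<I", value & 0xFFFFFFFF)  # little-endian byte order
    return raw.decode("ascii")

print(decode_acpi_name(0x000000004449555F))  # _UID
```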

The first parameter 0x0 wasn't listed in the table. So this led me to
believe that maybe there was a problem parsing the ACPI tables. Felix
suggested I use the [Microsoft ASL
compiler](https://docs.microsoft.com/en-us/windows-hardware/drivers/bringup/microsoft-asl-compiler)
to decompile the AML and verify if the tables were valid.

I used Linux to dump the ACPI tables via `/sys/firmware/acpi/tables/`,
added the `.AML` suffix, and ran `asl.exe /u DSDT.AML`. This printed
an error saying `NVSA was already defined`. Using iasl to decompile
the table I saw the following:

```
External (NVSA)
Name (NVSA, 0xCA6B2000)
OperationRegion (GNVS, SystemMemory, NVSA, 0x1000)
```
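
A clash like this can be spotted in the decompiled output before handing it to `asl.exe`. The following is only a rough sketch, assuming the `iasl` disassembly puts each `External` and `Name` declaration on its own line. The function name `find_redefinitions` is made up, and the check ignores scopes entirely, so two unrelated objects with the same name in different scopes would also be reported.

```python
import re

# Match the declared name in "External (X...)" and "Name (X, ...)" lines.
EXTERNAL_RE = re.compile(r"^\s*External\s*\(\s*([\\^.\w]+)")
NAME_RE = re.compile(r"^\s*Name\s*\(\s*([\\^.\w]+)")

def last_segment(path):
    """Reduce a name path such as \\_SB.NVSA to its final segment."""
    return path.lstrip("\\^").split(".")[-1]

def find_redefinitions(dsl_text):
    """Return names declared both as External and via Name (scope-unaware)."""
    externals, names = set(), set()
    for line in dsl_text.splitlines():
        m = EXTERNAL_RE.match(line)
        if m:
            externals.add(last_segment(m.group(1)))
        m = NAME_RE.match(line)
        if m:
            names.add(last_segment(m.group(1)))
    return sorted(externals & names)

sample = """External (NVSA)
Name (NVSA, 0xCA6B2000)
OperationRegion (GNVS, SystemMemory, NVSA, 0x1000)
"""
print(find_redefinitions(sample))  # ['NVSA']
```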

The external reference was defined in `.asl`:
https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/coreboot/src/soc/amd/picasso/acpi/globalnvs.asl;l=12
The `Name` node was created by acpigen:
https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/coreboot/src/soc/amd/picasso/acpi.c;l=429

Removing the `External` from the `.asl` results in the `iasl` compiler
complaining about a missing reference, so I moved the `Name` node to
the SSDT table. This resulted in Linux complaining that `GNVS` was
invalid because it couldn't find `NVSA`. The ACPI spec says the
following:

> OperationRegion (RegionName, RegionSpace, Offset, Length)
> Operation regions are regions in some space that contain hardware registers for exclusive use by ACPI control methods. ...
> The entire Operation Region can be allocated for exclusive use to the ACPI subsystem in the host OS.
> Operation Regions that are defined within the scope of a method are the exception to this rule. These Operation Regions are known as “Dynamic” since the OS has no idea that they exist or what registers they use until the control method is executed.

I'm guessing that we can't move the `NVSA` node to the SSDT because
the value is required when instantiating `OperationRegion`.

I'm not quite sure how to solve this. For now I just hard coded the
address in the `OperationRegion`. I'm open to suggestions.

The second problem was `OIPG`. It is also defined twice:

https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/coreboot/src/vendorcode/google/chromeos/acpi/chromeos.asl;l=8

https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/coreboot/src/vendorcode/google/chromeos/acpi.c;l=15

Changing the callback to write to the SSDT table fixed that problem
and I was finally able to decode `DSDT.AML` using the `Microsoft ASL
Compiler`.

I then disabled `/bootdebug` and `/debug` since they weren't providing
any value and were preventing me from seeing the BSOD error codes. One
thing I noticed was that rebooting after a BSOD the boot loader would
boot a "system restore" image. This image used a different registry
than the OS so the error codes were not printed on the screen.
Rebooting again the boot loader would load the OS. So each test
required a double reboot. I'm also using Tianocore to boot Windows,
which is super slow...

The BSOD this time looked identical to the previous one... But upon
closer inspection the error code was different:

> 0x000000000000000D

... I went back and looked at the photo I took of the original BSOD
and it was indeed 0x000000000000000D! The font made this easy to miss
the first time around. Google Lens didn't pick it up either.

The exception now made sense:

> ACPI could not find a required method or object in the namespace. This bug check code is used if there is no _HID or _ADR present.

and to re-quote Felix:

> The third parameter you posted decodes to _UID (that one is 4 char ASCII stored as little endian number).

So a device was missing a `_UID`. I manually audited all the Device
nodes in the DSDT and SSDT and indeed we had devices that were missing
`_UID` and some devices that even had duplicate `_UID`s. When I fixed
this I got a new BSOD:

> 0x06 - ACPI tried to find a named object, but it could not find the object.
> 0x<some pointer>
> 0x<some pointer>
> 0x<some pointer>
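
Going back to the `_UID` audit for a moment: I did it by hand, but it could be partly automated against the `iasl` disassembly. The sketch below is an assumption about how one might do that, not something I ran on the real tables. It tracks braces naively (no handling of braces or `//` inside strings), attributes `_HID`/`_UID` to the innermost open `Device`, and flags every Device lacking `_UID`, including ones where that may be perfectly acceptable, so treat the output as a list to review. `audit_uids` is a made-up name.

```python
import re
from collections import defaultdict

DEVICE_RE = re.compile(r"^\s*Device\s*\(\s*(\w+)\s*\)")
FIELD_RE = re.compile(r"^\s*Name\s*\(\s*(_HID|_UID)\s*,\s*([^)]+?)\s*\)")

def audit_uids(dsl_text):
    """Return (devices without _UID, {(_HID, _UID): [device names]} duplicates)."""
    devices = []    # every Device record, in document order
    stack = []      # (record, brace depth of its body) for open Devices
    depth = 0
    pending = None  # Device seen, waiting for its opening brace
    for line in dsl_text.splitlines():
        code = line.split("//", 1)[0]  # drop trailing comments (naive)
        m = DEVICE_RE.match(code)
        if m:
            pending = {"name": m.group(1), "_HID": None, "_UID": None}
        f = FIELD_RE.match(code)
        if f and stack:
            stack[-1][0][f.group(1)] = f.group(2)
        for ch in code:
            if ch == "{":
                depth += 1
                if pending is not None:
                    stack.append((pending, depth))
                    devices.append(pending)
                    pending = None
            elif ch == "}":
                if stack and stack[-1][1] == depth:
                    stack.pop()
                depth -= 1
    missing = [d["name"] for d in devices if d["_UID"] is None]
    seen = defaultdict(list)
    for d in devices:
        if d["_UID"] is not None:
            seen[(d["_HID"], d["_UID"])].append(d["name"])
    duplicates = {k: v for k, v in seen.items() if len(v) > 1}
    return missing, duplicates
```

Calling `audit_uids(open("dsdt.dsl").read())` would return the two collections; the file name here is just a placeholder.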

This was discouraging since I didn't have a way of dereferencing the
pointers. I decided to double check the `SPCR` and the `DBG2` tables.
The `DBG2` table was using `MMIO` while the `SPCR` was using `IO`. So
I switched the `DBG2` table over to `IO` since I knew that worked, set `/debug
{default} on` and voila OS kernel debugger!

Doing `!analyze -v` showed the error and parameters. It did lock up
trying to print the details section. Hitting the `Break` button
cancelled the operation and I was able to continue. A simple `!nsobj
<pointer>` showed that the FUR0 power resource wasn't being found:
https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/coreboot/src/soc/amd/picasso/acpi/sb_fch.asl;l=169;drc=18593759918afe7ed67c097d444be7555575f50e
I suspect it's because the `AOAC`
[node](https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/coreboot/src/soc/amd/picasso/acpi/aoac.asl;l=126)
is defined as a bridge device. I commented out all the `AOAC` and
power resources and was then greeted with:

> 0x1000D -  _PRW specified with no wake-capable interrupts and at least one GPIO interrupt
> A device used both GPE and GPIO interrupts, which is not supported.

If I understand it correctly, that means that we can't use a GPE in
the _PRW and a GPIO in the _CRS.

i.e.,
    Device (CRFP)
    {
        Name (_HID, "PRP0001")  // _HID: Hardware ID
        Name (_UID, Zero)  // _UID: Unique ID
        Name (_DDN, "Fingerprint Reader")  // _DDN: DOS Device Name

        Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
        {
            UartSerialBusV2 (0x002DC6C0, DataBitsEight, StopBitsOne,
                0x00, LittleEndian, ParityTypeNone, FlowControlNone,
                0x0040, 0x0040, "\\_SB.FUR1",
                0x00, ResourceConsumer, , Exclusive,
                )
            GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                "\\_SB.GPIO", 0x00, ResourceConsumer, ,
                )
                {   // Pin list
                    0x0006
                }
        })
        Name (_S0W, 0x04)  // _S0W: S0 Device Wake State
        Name (_PRW, Package (0x02)  // _PRW: Power Resources for Wake
        {
            0x0A,
            0x03
        })
    }

I'm guessing the _PRW needs to reference the GPIO controller, and that
controller must have an `_AEI` defining the pin. Not really sure why
Windows has a problem with mixed event types. For now I just commented
out all the I2C and UART peripherals.
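
To find the other affected peripherals without reading every node, a heuristic scan of the disassembly might help. This sketch rests on my reading of the error above, namely that a Device carrying both a `_PRW` and a `GpioInt` is suspect. It does not check whether the `_PRW` actually names a GPE, and the brace tracking is naive (no braces or `//` inside strings). `devices_with_prw_and_gpioint` is a made-up name.

```python
import re

DEVICE_RE = re.compile(r"^\s*Device\s*\(\s*(\w+)\s*\)")
PRW_RE = re.compile(r"\bName\s*\(\s*_PRW\b")
GPIOINT_RE = re.compile(r"\bGpioInt\s*\(")

def devices_with_prw_and_gpioint(dsl_text):
    """Return names of Device blocks that contain both a _PRW and a GpioInt."""
    flagged = []
    stack = []      # open Devices, innermost last
    depth = 0
    pending = None  # Device seen, waiting for its opening brace
    for line in dsl_text.splitlines():
        code = line.split("//", 1)[0]  # drop trailing comments (naive)
        m = DEVICE_RE.match(code)
        if m:
            pending = {"name": m.group(1), "depth": None,
                       "prw": False, "gpio": False}
        if stack:
            if PRW_RE.search(code):
                stack[-1]["prw"] = True
            if GPIOINT_RE.search(code):
                stack[-1]["gpio"] = True
        for ch in code:
            if ch == "{":
                depth += 1
                if pending is not None:
                    pending["depth"] = depth
                    stack.append(pending)
                    pending = None
            elif ch == "}":
                if stack and stack[-1]["depth"] == depth:
                    dev = stack.pop()
                    if dev["prw"] and dev["gpio"]:
                        flagged.append(dev["name"])
                depth -= 1
    return flagged
```

Run against the `CRFP` node above, this would report `CRFP`; a device with only a `_PRW` or only a `GpioInt` is left alone.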

With all that I was finally able to boot into Windows!

Now on to making some CLs and fixing the remaining issues.


On Fri, Jan 15, 2021 at 5:52 PM Felix Held <felix-coreboot@felixheld.de> wrote:
>
> Forgot to add that to find out what the cause is the easiest way is
> probably having the installed image configured in a way that it'll write
> full kernel memory dumps to disk and then use !analyze -v in WinDbg on
> that generated kernel dump. At least that's what I remember from more
> than 1.5 years ago, so some of the info might not be 100% accurate.
>
> Regards,
> Felix