I'm trying to boot the Windows 10 installer on a Picasso-based device using coreboot + TianoCore. I keep getting a BSOD after the Windows logo shows, with the very descriptive stop code `ACPI BIOS ERROR`.
I've enabled bootdebug on the USB stick using the following:
    bcdedit /store H:\boot\bcd /bootdebug {bootmgr} on
    bcdedit /store H:\boot\bcd /bootdebug {default} on
    bcdedit /store H:\boot\bcd /debug {default} on
Here is the BCD:
    C:\Windows\system32>bcdedit /store h:\boot\bcd

    Windows Boot Manager
    --------------------
    identifier              {bootmgr}
    description             Windows Boot Manager
    locale                  en-US
    inherit                 {globalsettings}
    bootdebug               Yes
    default                 {default}
    displayorder            {default}
    toolsdisplayorder       {memdiag}
    timeout                 30

    Windows Boot Loader
    -------------------
    identifier              {default}
    device                  ramdisk=[boot]\sources\boot.wim,{7619dcc8-fafe-11d9-b411-000476eba25f}
    path                    \windows\system32\boot\winload.exe
    description             Windows Setup
    locale                  en-US
    inherit                 {bootloadersettings}
    bootdebug               Yes
    osdevice                ramdisk=[boot]\sources\boot.wim,{7619dcc8-fafe-11d9-b411-000476eba25f}
    systemroot              \windows
    bootmenupolicy          Standard
    detecthal               Yes
    winpe                   Yes
    debug                   Yes
    ems                     No

    C:\Windows\system32>bcdedit /store h:\boot\bcd /dbgsettings
    debugtype               Serial
    debugport               1
    baudrate                115200
I have also added the SPCR table:
    [000h 0000  4]                Signature : "SPCR" [Serial Port Console Redirection table]
    [004h 0004  4]             Table Length : 00000050
    [008h 0008  1]                 Revision : 02
    [009h 0009  1]                 Checksum : F1
    [00Ah 0010  6]                   Oem ID : "COREv4"
    [010h 0016  8]             Oem Table ID : "COREBOOT"
    [018h 0024  4]             Oem Revision : 0000002A
    [01Ch 0028  4]          Asl Compiler ID : "CORE"
    [020h 0032  4]    Asl Compiler Revision : 20200925

    [024h 0036  1]           Interface Type : 00
    [025h 0037  3]                 Reserved : 000000

    [028h 0040 12]     Serial Port Register : [Generic Address Structure]
    [028h 0040  1]                 Space ID : 00 [SystemMemory]
    [029h 0041  1]                Bit Width : 20
    [02Ah 0042  1]               Bit Offset : 00
    [02Bh 0043  1]     Encoded Access Width : 03 [DWord Access:32]
    [02Ch 0044  8]                  Address : 00000000FEDC9000

    [034h 0052  1]           Interrupt Type : 03
    [035h 0053  1]      PCAT-compatible IRQ : 04
    [036h 0054  4]                Interrupt : 00000004
    [03Ah 0058  1]                Baud Rate : 00
    [03Bh 0059  1]                   Parity : 00
    [03Ch 0060  1]                Stop Bits : 00
    [03Dh 0061  1]             Flow Control : 00
    [03Eh 0062  1]            Terminal Type : 00
    [04Ch 0076  1]                 Reserved : 00
    [040h 0064  2]            PCI Device ID : FFFF
    [042h 0066  2]            PCI Vendor ID : FFFF
    [044h 0068  1]                  PCI Bus : 00
    [045h 0069  1]               PCI Device : 00
    [046h 0070  1]             PCI Function : 00
    [047h 0071  4]                PCI Flags : 00000000
    [04Bh 0075  1]              PCI Segment : 00
    [04Ch 0076  4]                 Reserved : 00000000
And the DBG2 table:
    [000h 0000  4]                Signature : "DBG2" [Debug Port table type 2]
    [004h 0004  4]             Table Length : 0000005C
    [008h 0008  1]                 Revision : 00
    [009h 0009  1]                 Checksum : 78
    [00Ah 0010  6]                   Oem ID : "COREv4"
    [010h 0016  8]             Oem Table ID : "COREBOOT"
    [018h 0024  4]             Oem Revision : 00000000
    [01Ch 0028  4]          Asl Compiler ID : "CORE"
    [020h 0032  4]    Asl Compiler Revision : 20200925

    [024h 0036  4]              Info Offset : 0000002C
    [028h 0040  4]               Info Count : 00000001

    [02Ch 0044  1]                 Revision : 00
    [02Dh 0045  2]                   Length : 0030
    [02Fh 0047  1]           Register Count : 01
    [030h 0048  2]          Namepath Length : 000A
    [032h 0050  2]          Namepath Offset : 0026
    [034h 0052  2]          OEM Data Length : 0000 [Optional field not present]
    [036h 0054  2]          OEM Data Offset : 0000 [Optional field not present]
    [038h 0056  2]                Port Type : 8000
    [03Ah 0058  2]             Port Subtype : 0000
    [03Ch 0060  2]                 Reserved : 0000
    [03Eh 0062  2]      Base Address Offset : 0016
    [040h 0064  2]      Address Size Offset : 0022

    [042h 0066 12]    Base Address Register : [Generic Address Structure]
    [042h 0066  1]                 Space ID : 00 [SystemMemory]
    [043h 0067  1]                Bit Width : 20
    [044h 0068  1]               Bit Offset : 00
    [045h 0069  1]     Encoded Access Width : 03 [DWord Access:32]
    [046h 0070  8]                  Address : 00000000FEDC9000

    [04Eh 0078  4]             Address Size : 000000F8

    [052h 0082 10]                 Namepath : "\_SB.FUR0"
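A quick way to sanity-check the DBG2 dump is to confirm that its offsets and lengths agree with one another. The sketch below hard-codes the values from the dump (it does not parse a real table) and assumes the namepath is `\_SB.FUR0` including its leading backslash, which is what the declared length of 0x0A implies once the NUL terminator is counted. All variable and function names are illustrative.

```python
# Values copied from the DBG2 dump; names are illustrative only.
TABLE_LENGTH = 0x5C
INFO_OFFSET = 0x2C          # start of the Device Information structure
INFO_LENGTH = 0x30
BASE_ADDRESS_OFFSET = 0x16  # the three offsets below are relative to INFO_OFFSET
ADDRESS_SIZE_OFFSET = 0x22
NAMEPATH_OFFSET = 0x26
NAMEPATH_LENGTH = 0x0A
NAMEPATH = "\\_SB.FUR0"     # assumed to include the leading backslash


def absolute(relative_offset: int) -> int:
    """Convert an offset inside the Device Information structure to a table offset."""
    return INFO_OFFSET + relative_offset


def check_dbg2_layout() -> dict:
    """Compute where each field should land and whether the lengths add up."""
    layout = {
        "base_address": absolute(BASE_ADDRESS_OFFSET),
        "address_size": absolute(ADDRESS_SIZE_OFFSET),
        "namepath": absolute(NAMEPATH_OFFSET),
        "end_of_info": INFO_OFFSET + INFO_LENGTH,
    }
    # The namepath is stored as a NUL-terminated ASCII string.
    namepath_bytes = NAMEPATH.encode("ascii") + b"\x00"
    layout["namepath_fits"] = len(namepath_bytes) == NAMEPATH_LENGTH
    layout["info_fits_table"] = layout["end_of_info"] == TABLE_LENGTH
    return layout


if __name__ == "__main__":
    for key, value in check_dbg2_layout().items():
        print(key, value if isinstance(value, bool) else hex(value))
```

The computed offsets (0x42, 0x4E, 0x52) match the positions iasl reports for the Base Address Register, Address Size, and Namepath fields, so the table layout itself looks self-consistent.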
Given all this, I have yet to see Windows dump anything to the serial console. The device just reboots after the BSOD message. Does anyone have any experience debugging Windows boot errors? Can you share any tips? :)
Thanks, Raul
Stoney has the same issue. I'm pretty sure it's related to a memory address range being incorrectly marked, or something similar (based on the DWORD output of the BSOD), but I never bothered to troubleshoot it.
On Wed, Jan 13, 2021 at 4:21 PM Raul Rangel rrangel@chromium.org wrote:
> [...]
Thanks, Raul _______________________________________________ coreboot mailing list -- coreboot@coreboot.org To unsubscribe send an email to coreboot-leave@coreboot.org
It's quite possible you don't need a live WinDbg session: you can simply open the generated dump file in WinDbg and analyze it.
On Thu, Jan 14, 2021 at 6:21 AM Raul Rangel rrangel@chromium.org wrote:
> [...]
> Stoney has the same issue. I'm pretty sure it's related to a memory address range being incorrectly marked, or something similar (based on the DWORD output of the BSOD), but I never bothered to troubleshoot it.
How did you come to that conclusion?
> It's quite possible you don't need a live WinDbg session: you can simply open the generated dump file in WinDbg and analyze it.
The installer media doesn't seem to create a dump :(
So I was finally able to get the debugger to attach. There are two BCD stores on the installer USB, one for BIOS and one for UEFI, and I had been modifying the BIOS one. Once I updated the UEFI store, the OS appeared to wait for the debugger. The second problem was that the serial port needs to use a 1.8 MHz base clock (1.8432 MHz on a standard PC UART) so that the OS can calculate the correct divisor. Apparently Linux also assumes a 1.8 MHz base clock. Once I got all that fixed I was able to attach with the debugger. Unfortunately the debugger was non-functional: https://docs.microsoft.com/en-us/answers/questions/232781/debugging-acpi-bio...
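The base-clock problem comes down to divisor arithmetic. A 16550-style UART divides its input clock by 16 times a programmable divisor, so software that assumes the conventional 1.8432 MHz PC clock programs a divisor of 1 for 115200 baud. The sketch below shows which baud rate the hardware really produces when its clock differs from that assumption. The 48 MHz figure is only an illustrative assumption, not a value taken from this thread, and the function names are hypothetical.

```python
STANDARD_UART_CLOCK_HZ = 1_843_200  # conventional PC UART input clock


def divisor_for(baud: int, assumed_clock_hz: int = STANDARD_UART_CLOCK_HZ) -> int:
    """Divisor software would program if it assumes `assumed_clock_hz`."""
    return round(assumed_clock_hz / (16 * baud))


def actual_baud(divisor: int, real_clock_hz: int) -> float:
    """Baud rate the hardware really produces for a given divisor."""
    return real_clock_hz / (16 * divisor)


if __name__ == "__main__":
    div = divisor_for(115200)
    print("divisor:", div)
    print("baud with standard clock:", actual_baud(div, STANDARD_UART_CLOCK_HZ))
    # Hypothetical faster input clock: the same divisor gives a very different rate.
    print("baud with 48 MHz clock:", actual_baud(div, 48_000_000))
```

With a mismatched clock the debugger host and the target end up at different line rates, which would be consistent with seeing no usable serial output at all.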
I then decided to ditch the install media and instead created a WinPE boot image that has a full OS install. This allowed me to modify the registry values offline to enable displaying the error codes: https://windowsreport.com/bsod-details-windows-8/
With that I was finally able to get the error codes:
    0x0000000000000000
    0xFFFFD38AC66EC7F0
    0x000000004449555F
    0x0000000000000000
I need to figure out how to decode them.
On Wed, Jan 13, 2021 at 7:37 PM Lance Zhao lance.zhao@gmail.com wrote:
> [...]
Hi Raul,
The installer behaves differently than the installed Windows OS, so I'd only try on an already installed OS at first. If you need to do some in-depth debugging, I'd also recommend using a checked build that has debug symbols available. Beware that the installed version is very picky regarding S3/S0ix, so the installation needs to match the target's configuration.
To decode the bug check values and their parameters, see https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/bug-check...
The third parameter you posted decodes to _UID (that one is 4 char ASCII stored as little endian number).
Regards, Felix
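Felix's decoding of the third parameter can be reproduced in a few lines. This is a minimal sketch: it takes the low 32 bits of the parameter, lays them out as little-endian bytes, and reads them as ASCII. The function name is hypothetical.

```python
def decode_acpi_name(value: int) -> str:
    """Interpret the low 32 bits of `value` as a 4-character ACPI name."""
    return (value & 0xFFFFFFFF).to_bytes(4, "little").decode("ascii")


if __name__ == "__main__":
    # Third bug check parameter from the BSOD.
    print(decode_acpi_name(0x000000004449555F))
```

The bytes come out as 5F 55 49 44, which is `_UID` in ASCII.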
Forgot to add: the easiest way to find the cause is probably to configure the installed image so that it writes full kernel memory dumps to disk, and then run `!analyze -v` in WinDbg on the generated kernel dump. At least that's what I remember from more than 1.5 years ago, so some of the info might not be 100% accurate.
Regards, Felix
Over the weekend I realized that SMI logging was enabled and interfering with WinDbg. Once I flashed a non-serial firmware, WinDbg became a lot more stable and I was able to reliably attach to the boot loader debugger, i.e., `/bootdebug {default}`. The OS debugger (`/debug {default} on`) was still not functioning, though. I wasn't sure if the BSOD was happening in the boot loader or the OS kernel, so I stepped through the boot loader until I saw it jump to the OS. From there WinDbg failed to restore the connection. The exception happens before the OS is capable of writing a kernel dump. I also suspected it was happening before the kernel debugger was set up, since the connection could not be re-established.
I then saw Felix's reply:
> To decode the bug check values and their parameters, see https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/bug-check...
>
> The third parameter you posted decodes to _UID (that one is 4 char ASCII stored as little endian number).
The parameters I gathered last week:
    0x0000000000000000
    0xFFFFD38AC66EC7F0
    0x000000004449555F
    0x0000000000000000
The first parameter 0x0 wasn't listed in the table. So this led me to believe that maybe there was a problem parsing the ACPI tables. Felix suggested I use the [Microsoft ASL compiler](https://docs.microsoft.com/en-us/windows-hardware/drivers/bringup/microsoft-...) to decompile the AML and verify if the tables were valid.
I used Linux to dump the ACPI tables via `/sys/firmware/acpi/tables/`, added the `.AML` suffix, and ran `asl.exe /u DSDT.AML`. This printed an error saying `NVSA was already defined`. Using `iasl` to decompile the table I saw the following:
```
External (NVSA)
Name (NVSA, 0xCA6B2000)
OperationRegion (GNVS, SystemMemory, NVSA, 0x1000)
```
The external reference was defined in `.asl`: https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/th... The `Name` node was created by acpigen: https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/th...
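This kind of clash can be spotted mechanically in the decompiled output. The sketch below is a rough text scan, not an ASL parser: it assumes iasl-style formatting with one statement per line, ignores scoping entirely, and simply reports names that appear in both an `External (...)` declaration and a `Name (...)` definition in the same file. The function name and regular expressions are illustrative.

```python
import re
from typing import List

# Matches the name inside "External (NAME" and "Name (NAME," at the start of a line.
EXTERNAL_RE = re.compile(r"^\s*External\s*\(\s*([\\^.\w]+)", re.MULTILINE)
NAME_RE = re.compile(r"^\s*Name\s*\(\s*([\\^.\w]+)\s*,", re.MULTILINE)


def conflicting_names(dsl_text: str) -> List[str]:
    """Return names that are both declared External and defined with Name."""
    externals = set(EXTERNAL_RE.findall(dsl_text))
    defined = set(NAME_RE.findall(dsl_text))
    return sorted(externals & defined)


if __name__ == "__main__":
    sample = (
        "External (NVSA)\n"
        "Name (NVSA, 0xCA6B2000)\n"
        "OperationRegion (GNVS, SystemMemory, NVSA, 0x1000)\n"
    )
    print(conflicting_names(sample))
```

Because scope is ignored, a hit is only a candidate to inspect by hand; two objects with the same name in different scopes would be flagged even though they are legal.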
Removing the `External` from the `.asl` results in the `iasl` compiler complaining about a missing reference. So I moved the `Name` node to the SSDT table. This resulted in Linux complaining that `GNVS` was invalid because it couldn't find `NVSA`. The ACPI spec says the following:
> OperationRegion (RegionName, RegionSpace, Offset, Length)
>
> Operation regions are regions in some space that contain hardware registers for exclusive use by ACPI control methods. ...
>
> The entire Operation Region can be allocated for exclusive use to the ACPI subsystem in the host OS. Operation Regions that are defined within the scope of a method are the exception to this rule. These Operation Regions are known as “Dynamic” since the OS has no idea that they exist or what registers they use until the control method is executed.
I'm guessing that we can't move the `NVSA` node to the SSDT because the value is required when instantiating `OperationRegion`.
I'm not quite sure how to solve this. For now I just hard coded the address in the `OperationRegion`. I'm open to suggestions.
The second problem was `OIPG`. It is also defined twice:
https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/th...
https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/th...
Changing the callback to write to the SSDT table fixed that problem and I was finally able to decode `DSDT.AML` using the `Microsoft ASL Compiler`.
I then disabled `/bootdebug` and `/debug` since they weren't providing any value and were preventing me from seeing the BSOD error codes. One thing I noticed was that rebooting after a BSOD the boot loader would boot a "system restore" image. This image used a different registry than the OS so the error codes were not printed on the screen. Rebooting again the boot loader would load the OS. So each test required a double reboot. I'm also using Tianocore to boot Windows, which is super slow...
The BSOD this time looked identical to the previous one... But upon closer inspection the error code was different:
0x000000000000000D
... I went back and looked at the photo I took of the original BSOD and it was indeed 0x000000000000000D! The font made this easy to miss the first time around. Google Lens didn't pick it up either.
The exception now made sense:
> ACPI could not find a required method or object in the namespace.
> This bug check code is used if there is no _HID or _ADR present.

And to re-quote Felix:

> The third parameter you posted decodes to _UID (that one is 4 char ASCII stored as little endian number).
So a device was missing a `_UID`. I manually audited all the Device nodes in the DSDT and SSDT, and indeed we had devices that were missing `_UID` and some devices that even had duplicate `_UID`s. When I fixed this I got a new BSOD:
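A manual audit like this can be partly automated on the decompiled tables. The following is a rough sketch, not a real ASL parser: it assumes iasl-style output where `Device (NAME)` and each brace sit on their own lines, tracks nesting by counting braces, and reports devices that have a `_HID` but no `_UID`, plus groups of devices sharing the same `_HID`/`_UID` pair. Which of these cases Windows actually rejects is not established in this thread, and all names in the code are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

DEVICE_RE = re.compile(r"\bDevice\s*\(\s*(\w+)\s*\)")
VALUE_RE = re.compile(r"\bName\s*\(\s*(_HID|_UID)\s*,\s*(.*)")


def audit_uids(dsl_text: str) -> Tuple[List[str], Dict[Tuple[str, str], List[str]]]:
    """Return (devices with _HID but no _UID, devices sharing a _HID/_UID pair)."""
    stack: list = []   # one entry per open brace: a device dict or None
    devices: list = []
    pending = None     # device seen on a previous line, waiting for its "{"
    for line in dsl_text.splitlines():
        code = line.split("//")[0]  # drop trailing disassembler comments
        # Attribute _HID/_UID to the innermost enclosing device.
        current = next((d for d in reversed(stack) if d is not None), None)
        match = VALUE_RE.search(code)
        if match and current is not None:
            key = "hid" if match.group(1) == "_HID" else "uid"
            current[key] = match.group(2).strip().rstrip(")").strip()
        match = DEVICE_RE.search(code)
        if match:
            pending = {"name": match.group(1), "hid": None, "uid": None}
            devices.append(pending)
        for ch in code:
            if ch == "{":
                stack.append(pending)
                pending = None
            elif ch == "}" and stack:
                stack.pop()
    missing = [d["name"] for d in devices if d["hid"] is not None and d["uid"] is None]
    seen: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for d in devices:
        if d["hid"] is not None and d["uid"] is not None:
            seen[(d["hid"], d["uid"])].append(d["name"])
    duplicates = {pair: names for pair, names in seen.items() if len(names) > 1}
    return missing, duplicates
```

Usage would be something like `missing, duplicates = audit_uids(open("dsdt.dsl").read())`, where `dsdt.dsl` is a placeholder for the decompiled table; the same call can then be repeated for each SSDT.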
    0x06 - ACPI tried to find a named object, but it could not find the object.
    0x<some pointer>
    0x<some pointer>
    0x<some pointer>
This was discouraging since I didn't have a way of dereferencing the pointers. I decided to double check the `SPCR` and the `DBG2` tables. The `DBG2` table was using `MMIO` while the `SPCR` was using `IO`. So I switched it over to `IO` since I knew that worked, set `/debug {default} on` and voila OS kernel debugger!
Doing `!analyze -v` showed the error and parameters. It did lock up trying to print the details section. Hitting the `Break` button cancelled the operation and I was able to continue. A simple `!nsobj <pointer>` showed that the FUR0 power resource wasn't being found: https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/th... I suspect it's because the `AOAC` [node](https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/th...) is defined as a bridge device. I commented out all the `AOAC` and power resources and was then greeted with:
    0x1000D - _PRW specified with no wake-capable interrupts and at least one GPIO interrupt
    A device used both GPE and GPIO interrupts, which is not supported.
If I understand it correctly, that means that we can't use a GPE in the _PRW and a GPIO in the _CRS.
i.e.,

    Device (CRFP)
    {
        Name (_HID, "PRP0001")  // _HID: Hardware ID
        Name (_UID, Zero)  // _UID: Unique ID
        Name (_DDN, "Fingerprint Reader")  // _DDN: DOS Device Name
        Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
        {
            UartSerialBusV2 (0x002DC6C0, DataBitsEight, StopBitsOne,
                0x00, LittleEndian, ParityTypeNone, FlowControlNone,
                0x0040, 0x0040, "\\_SB.FUR1",
                0x00, ResourceConsumer, , Exclusive,
                )
            GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                "\\_SB.GPIO", 0x00, ResourceConsumer, ,
                )
                {   // Pin list
                    0x0006
                }
        })
        Name (_S0W, 0x04)  // _S0W: S0 Device Wake State
        Name (_PRW, Package (0x02)  // _PRW: Power Resources for Wake
        {
            0x0A,
            0x03
        })
    }
I'm guessing the _PRW needs to reference the GPIO controller, and that controller must have an `_AEI` defining the pin. Not really sure why Windows has a problem with mixed event types. For now I just commented out all the I2C and UART peripherals.
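To find other devices with the same pattern, a simple text check over each decompiled `Device` block may be enough. The sketch below follows my reading of the 0x1000D description, which is a guess rather than a confirmed rule: it flags a device whose text contains a `GpioInt` resource and a `_PRW` package whose first element is a plain integer (a GPE number). The regular expressions assume iasl-style formatting and the function name is hypothetical.

```python
import re

# _PRW package whose first element is an integer, i.e. a GPE index.
PRW_GPE_RE = re.compile(
    r"Name\s*\(\s*_PRW\s*,\s*Package\s*\([^)]*\)[^{]*\{\s*"
    r"(0x[0-9A-Fa-f]+|\d+|Zero|One)\s*,"
)
GPIO_INT_RE = re.compile(r"\bGpioInt\s*\(")


def mixes_gpe_and_gpio(device_text: str) -> bool:
    """True if the device block has both a GpioInt and an integer-GPE _PRW."""
    return bool(GPIO_INT_RE.search(device_text)) and bool(PRW_GPE_RE.search(device_text))


if __name__ == "__main__":
    crfp = """
    Device (CRFP)
    {
        Name (_CRS, ResourceTemplate ()
        {
            GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                "\\\\_SB.GPIO", 0x00, ResourceConsumer, ,
                )
                {   // Pin list
                    0x0006
                }
        })
        Name (_PRW, Package (0x02)  // _PRW: Power Resources for Wake
        {
            0x0A,
            0x03
        })
    }
    """
    print(mixes_gpe_and_gpio(crfp))
```

A `_PRW` whose first element is itself a package (a reference to a GPE block device) is deliberately not matched, so such devices would need a separate look.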
With all that I was finally able to boot into Windows!
Now on to making some CLs and fixing the remaining issues.
On Fri, Jan 15, 2021 at 5:52 PM Felix Held felix-coreboot@felixheld.de wrote:
> [...]
Do you have a dump of the failing DSDT table? There was a recent change around NVSA, but it looks like that's something different.
Raul Rangel rrangel@chromium.org 于2021年1月21日周四 上午4:24写道:
Over the weekend I had the realization that SMI logging was enabled and interfering with WinDbg. Once I flashed a non-serial firmware WinDbg became a lot more stable and I was able to reliably attach to the boot loader debugger i.e., `/bootdebug {default}`. The OS debugger (`/debug {default} on`) was still not functioning though. I wasn't sure If the BSOD was happening in the boot loader or the OS kernel, so I stepped through the boot loader until I saw it jump to the OS. From there WinDbg failed to restore the connection. The exception happens before the OS is capable of writing a kernel dump. I was also suspecting it was happening before the debugger was set up since the connection could not be re-established.
I then saw Felix's reply:
To decode the bug check values and their parameters, see
https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/bug-check...
The third parameter you posted decodes to _UID (that one is 4 char ASCII
stored as little endian number).
The parameters I gathered last week:
0x0000000000000000 OxFFFFD38AC66EC7FO Ox000000004449555F 0x0000000000000000
The first parameter 0x0 wasn't listed in the table. So this led me to believe that maybe there was a problem parsing the ACPI tables. Felix suggested I use the [Microsoft ASL compiler]( https://docs.microsoft.com/en-us/windows-hardware/drivers/bringup/microsoft-... ) to decompile the AML and verify if the tables were valid.
I used linux to dump the ACPI tables via `/sys/firmware/acpi/tables/`, added the `.AML` suffix, and ran `asl.exe /u DSDT.AML`. This printed an error saying `NVSA was already defined`. Using iasl to decompile the table I saw the following:
External (NVSA) Name (NVSA, 0xCA6B2000) OperationRegion (GNVS, SystemMemory, NVSA, 0x1000)
The external reference was defined in `.asl`:
https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/th... The `Name` node was created by acpigen:
https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/th...
Removing the External from the `.asl`. results in the `iasl` compiler complaining about a missing reference. So I move the `Name` node to the SSDT table. This resulted in linux complaining that `GVNS` was invalid because it couldn't find `NVSA`. The ACPI spec says the following:
OperationRegion (RegionName, RegionSpace, Offset, Length) Operation regions are regions in some space that contain hardware
registers for exclusive use by ACPI control methods. ...
The entire Operation Region can be allocated for exclusive use to the
ACPI subsystem in the host OS.
Operation Regions that are defined within the scope of a method are the
exception to this rule. These Operation Regions are known as “Dynamic” since the OS has no idea that they exist or what registers they use until the control method is executed.
I'm guessing that we can't move the `NVSA` node to the SSDT because the value is required when instantiating `OperationRegion`.
I'm not quite sure how to solve this. For now I just hard coded the address in the `OperationRegion`. I'm open to suggestions.
The second problem was `OIPG`. It is also defined twice:
https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/th...
https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/th...
Changing the callback to write to the SSDT table fixed that problem and I was finally able to decode `DSDT.AML` using the `Microsoft ASL Compiler`.
I then disabled `/bootdebug` and `/debug` since they weren't providing any value and were preventing me from seeing the BSOD error codes. One thing I noticed was that rebooting after a BSOD the boot loader would boot a "system restore" image. This image used a different registry than the OS so the error codes were not printed on the screen. Rebooting again the boot loader would load the OS. So each test required a double reboot. I'm also using Tianocore to boot Windows, which is super slow...
The BSOD this time looked identical to the previous one... But upon closer inspection the error code was different:
0x000000000000000D
... I went back and looked at the photo I took of the original BSOD and it was indeed 0x000000000000000D! The font made this easy to miss the first time around. Google Lens didn't pick it up either.
The exception now made sense:
ACPI could not find a required method or object in the namespace This
bug check code is used if there is no _HID or _ADR present.
and to re quote Felix:
The third parameter you posted decodes to _UID (that one is 4 char ASCII
stored as little endian number).
So a device was missing a _UID. I manually audited all the Device nodes in the DSDT and SSDT and indeed we had devices that were missing `_UID` and some devices that even had duplicate `_UID`s. When I fixes this I got a new BSOD:
0x06 - ACPI tried to find a named object, but it could not find the
object.
0x<some pointer> 0x<some pointer> 0x<some pointer>
This was discouraging since I didn't have a way of dereferencing the pointers. I decided to double check the `SPCR` and the `DBG2` tables. The `DBG2` table was using `MMIO` while the `SPCR` was using `IO`. So I switched `DBG2` over to `IO` since I knew that worked, set `/debug {default} on` and voilà, OS kernel debugger!
Doing `!analyze -v` showed the error and parameters. It did lock up trying to print the details section. Hitting the `Break` button cancelled the operation and I was able to continue. A simple `!nsobj <pointer>` showed that the FUR0 power resource wasn't being found:
https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/th... I suspect it's because the `AOAC` [node]( https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/th... ) is defined as a bridge device. I commented out all the `AOAC` and power resources and was then greeted with:
0x1000D - _PRW specified with no wake-capable interrupts and at least
one GPIO interrupt
A device used both GPE and GPIO interrupts, which is not supported.
If I understand it correctly, that means that we can't use a GPE in the _PRW and a GPIO in the _CRS.
i.e.,

    Device (CRFP)
    {
        Name (_HID, "PRP0001")  // _HID: Hardware ID
        Name (_UID, Zero)  // _UID: Unique ID
        Name (_DDN, "Fingerprint Reader")  // _DDN: DOS Device Name
        Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
        {
            UartSerialBusV2 (0x002DC6C0, DataBitsEight, StopBitsOne,
                0x00, LittleEndian, ParityTypeNone, FlowControlNone,
                0x0040, 0x0040, "\\_SB.FUR1",
                0x00, ResourceConsumer, , Exclusive,
                )
            GpioInt (Level, ActiveLow, Exclusive, PullDefault, 0x0000,
                "\\_SB.GPIO", 0x00, ResourceConsumer, ,
                )
                {   // Pin list
                    0x0006
                }
        })
        Name (_S0W, 0x04)  // _S0W: S0 Device Wake State
        Name (_PRW, Package (0x02)  // _PRW: Power Resources for Wake
        {
            0x0A,
            0x03
        })
    }
I'm guessing the _PRW needs to reference the GPIO controller, and that controller must have an `_AEI` defining the pin. Not really sure why Windows has a problem with mixed event types. For now I just commented out all the I2C and UART peripherals.
With all that I was finally able to boot into Windows!
Now on to making some CLs and fixing the remaining issues.
On Fri, Jan 15, 2021 at 5:52 PM Felix Held felix-coreboot@felixheld.de wrote:
Forgot to add that, to find out what the cause is, the easiest way is probably to have the installed image configured so that it writes full kernel memory dumps to disk, and then to use `!analyze -v` in WinDbg on the generated kernel dump. At least that's what I remember from more than 1.5 years ago, so some of the info might not be 100% accurate.
Regards, Felix
Hi Raul,
Raul Rangel wrote:
With all that I was finally able to boot into Windows!
Thank you very much for documenting your findings and much of your method!
Do you see any ways to preempt this entire class of errors (Windows unhappy with ACPI tables) within coreboot in a systematic manner?
You described being able to extract what I understand as PASS/FAIL information from a custom boot image (WinPE) - is this something that we could try to test automatically?
And/or do you perhaps see a way to do something like static PASS/FAIL analysis with the MSFT ASL compiler? This wouldn't guarantee boot, but would at least catch some of the errors?
Kind regards
//Peter
Do you have the failed DSDT table dumped? There was a recent change around NVSA, but it looks like that's different.
Here is the DSDT before any of my changes: https://0paste.com/158902
Do you see any ways to preempt this entire class of errors (Windows unhappy with ACPI tables) within coreboot in a systematic manner?
I wrote a pretty hacky ASL parser so I could verify the `_UID`s. We could add some kind of DSDT sanity check build step, though it won't catch any SSDT errors since those are generated at run time. I think the best thing is to add more checks to FWTS. I submitted a bug for the `_UID` issue here: https://bugs.launchpad.net/fwts/+bug/1912532 We could additionally add another test to FWTS to catch error code 0x1000D.
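A minimal sketch of what such a `_UID` check could look like, run over decompiled ASL text. This is not the parser mentioned above. It assumes that every device with a `_HID` should also carry a `_UID`, and that a `(_HID, _UID)` pair should appear only once; the function name `audit_uids` is made up for illustration. It compares values as text, so `Zero` and `0x00` are treated as different, and it does not strip comments.

```python
import re
from collections import defaultdict

# Tokens of interest in decompiled ASL: Device headers, _HID/_UID names, braces.
TOKEN = re.compile(
    r'\bDevice\s*\(\s*(?P<dev>[^)\s]+)\s*\)'
    r'|\bName\s*\(\s*(?P<key>_HID|_UID)\s*,\s*'
    r'(?P<val>EisaId\s*\(\s*"[^"]*"\s*\)|"[^"]*"|\w+)'
    r'|(?P<open>\{)|(?P<close>\})'
)


def audit_uids(asl_text):
    """Return (devices with _HID but no _UID, duplicated (_HID, _UID) pairs)."""
    devices = []    # one record per Device (...) seen
    stack = []      # one entry per open brace: a device record or None
    pending = None  # Device header seen, waiting for its opening brace
    for m in TOKEN.finditer(asl_text):
        if m.group("dev"):
            pending = {"name": m.group("dev"), "_HID": None, "_UID": None}
            devices.append(pending)
        elif m.group("open"):
            stack.append(pending)  # None for non-device blocks
            pending = None
        elif m.group("close"):
            if stack:
                stack.pop()
        elif m.group("key"):
            # Attribute the name to the innermost enclosing device.
            owner = next((d for d in reversed(stack) if d is not None), None)
            if owner is not None:
                owner[m.group("key")] = m.group("val")

    missing = [d["name"] for d in devices if d["_HID"] and d["_UID"] is None]
    seen = defaultdict(list)
    for d in devices:
        if d["_HID"] and d["_UID"] is not None:
            seen[(d["_HID"], d["_UID"])].append(d["name"])
    duplicates = {pair: names for pair, names in seen.items() if len(names) > 1}
    return missing, duplicates
```

Feeding it the output of `iasl -d` on a dumped DSDT or SSDT would give a list of device names to look at. Because it works on dumped tables, the same check could in principle cover the run-time generated SSDT as well.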
It looks like FWTS also supports processing pre-dumped ACPI tables: https://wiki.ubuntu.com/FirmwareTestSuite/Reference#Processing_pre-dumped_da... So maybe adding FWTS to the coreboot toolchain and adding a Kconfig option to sanity check the DSDT isn't a bad idea?