Hi,
The return value 0x80000007 is EFI_DEVICE_ERROR, which, unfortunately for us, is not very specific.
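For readers who want to decode such return values themselves, here is a minimal sketch. It assumes the 32-bit EFI_STATUS encoding (high bit set for errors) and covers only a subset of the UEFI error codes; the function name `decode_efi_status` is made up for illustration.

```python
# Sketch only: assumes a 32-bit EFI_STATUS, where bit 31 marks an error.
EFI_ERROR_BIT_32 = 0x80000000

# Subset of the UEFI error codes (value without the error bit).
EFI_ERRORS = {
    1: "EFI_LOAD_ERROR",
    2: "EFI_INVALID_PARAMETER",
    3: "EFI_UNSUPPORTED",
    6: "EFI_NOT_READY",
    7: "EFI_DEVICE_ERROR",
    9: "EFI_OUT_OF_RESOURCES",
    14: "EFI_NOT_FOUND",
}


def decode_efi_status(status: int) -> str:
    """Translate a 32-bit EFI_STATUS value into a readable name."""
    if status == 0:
        return "EFI_SUCCESS"
    if status & EFI_ERROR_BIT_32:
        code = status & 0x7FFFFFFF  # strip the error bit
        return EFI_ERRORS.get(code, "unknown error %d" % code)
    return "warning %d" % status


print(decode_efi_status(0x80000007))
```

The name alone says little about the cause, which is why the rest of this mail looks at the memory configuration instead.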
The schematics will be a big help to you, if you can get them. However, you can probably manage without them.
There are 3 BIOS images in the update binary from Acer. I took the liberty of assuming the relevant one was the 8 MiB file, as your SPI flash is 64 Mb. I've checked this image. Its silicon configuration modules now align more closely with Intel's code (*OpenBoardPkg and *BoardPkg, MinPlatformPkg and the reference code), so the work I've done for an Acer Skylake laptop doesn't help you directly, but the theory is the same.
The relevant modules for setting FSP-M configuration are BoardConfigInitPreMem and possibly PlatformInitPreMem. Since the RCOMP resistors and targets have a known structure (3 and 5 UINT16s respectively, with a somewhat predictable pattern), a good way to skip a lot of reverse engineering is to search a hexdump for known/common values, or eyeball the .data section. You may want to do this in IDA (costly, proprietary) or Ghidra (free, open-source), so that you can follow the references the code makes to the variables. I found some possible RCOMP resistors in BoardConfigInitPreMem, which brings us to 1, maybe 2, relevant functions.
Depending on the board ID, the function loaded at 0xFFF15F48 will assign SpdAddressTable, DqByteMap, DqsMapCpu2Dram, RcompResistors, RcompTargets and others. coreboot boards generally ignore Dqs settings these days (except for LPDDR3/4, as I understand), so I'll just give you the possible RCOMP values, as I see them:
- RCOMP targets: { 100, 40, 40, 23, 40 }, or { 100, 40, 20, 20, 26 }. Apparently, there is also { 60, 26, 20, 20, 26 }, but this looks different to what I've seen.
- RCOMP resistors: { 200, 81, 162 }, or { 121, 75, 100 }, or { 121, 81, 100 }
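The search for known RCOMP values can be scripted. The following is a minimal sketch, assuming the values are stored as consecutive little-endian UINT16s; the helper names and the synthetic demo blob are made up for illustration, and on a real module you would read the extracted PE file instead.

```python
import struct
from typing import Dict, List, Sequence, Tuple

# Candidate values quoted in this thread.
RCOMP_RESISTORS = [(200, 81, 162), (121, 75, 100), (121, 81, 100)]
RCOMP_TARGETS = [(100, 40, 40, 23, 40), (100, 40, 20, 20, 26), (60, 26, 20, 20, 26)]


def find_u16_sequence(blob: bytes, values: Sequence[int]) -> List[int]:
    """Return every offset where `values` occurs as little-endian UINT16s."""
    needle = struct.pack("<%dH" % len(values), *values)
    offsets = []
    pos = blob.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(needle, pos + 1)
    return offsets


def scan(blob: bytes) -> Dict[Tuple[int, ...], List[int]]:
    """Map each candidate tuple that was found to its offsets in the blob."""
    hits = {}
    for values in RCOMP_RESISTORS + RCOMP_TARGETS:
        offsets = find_u16_sequence(blob, values)
        if offsets:
            hits[values] = offsets
    return hits


# Synthetic stand-in for a module image (ASSUMPTION: not real firmware data).
demo = (b"\x00" * 16 + struct.pack("<3H", 121, 81, 100)
        + b"\xff" * 4 + struct.pack("<5H", 100, 40, 20, 20, 26))

for values, offsets in scan(demo).items():
    print(values, [hex(o) for o in offsets])
```

A hit only tells you where a candidate table lives; following the code references to that offset in Ghidra or IDA is still needed to see which board ID selects it.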
Board ID appears to be determined by PCD (essentially, UEFI global variables), so you should try dumping the PCD database with PcdGet from https://github.com/jyao1/EdkiiShellTool in a UEFI shell.
By the way, according to the code you also may have an (unpopulated) SPD at 0x50 (0xA0 in the code, this is just because the SMBUS address was shifted 1 bit left into the 8-bit address form. They are equivalent).
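The relationship between the two address forms is a one-bit shift. A small sketch, with helper names chosen for illustration:

```python
def smbus_8bit_to_7bit(addr8: int) -> int:
    """Drop the R/W bit from an 8-bit SMBus address byte."""
    return (addr8 & 0xFF) >> 1


def smbus_7bit_to_8bit(addr7: int) -> int:
    """Shift a 7-bit SMBus address into its 8-bit (write) form."""
    return (addr7 & 0x7F) << 1


print(hex(smbus_8bit_to_7bit(0xA0)))  # 0x50
print(hex(smbus_7bit_to_8bit(0x50)))  # 0xa0
```

So a table in the vendor code holding 0xA0 and 0xA2 would correspond to SPD devices at 0x50 and 0x51 in the 7-bit form.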
You also may need to set CaVrefConfig. "2" should be the correct value for DDR4. You should probably check that DqPinsInterleaved is correct, it's easiest if you can get this from the schematics.
Best regards, Benjamin
On Tue, Sep 07, 2021 at 03:29:55PM -0400, Benjamin Doron wrote:
Hi, The return value 0x80000007 is EFI_DEVICE_ERROR, which is unfortunately for us, not very specific.
Thank you very much, your detailed commentary enabled me to work through the issue.
The schematics will be a big help to you, if you can get them. However, you can probably manage without them.
It seems some schematics are available commercially, but I tried to avoid that.
The relevant modules for setting FSP-M configuration are BoardConfigInitPreMem and possibly PlatformInitPreMem. Since the RCOMP
[...]
It seems the Acer board is indeed following the Intel reference implementation very closely. I wrote a small table of possible values, the ones you extracted and some others I found by hexdumping the PE file, and iterated over them in a bootloop, but always got the same DEVICE_ERROR.
Only after applying the Dq(s)Mapping (DqByteMapChX, DqsMapCpu2DramChX) from the Intel reference KabyLake board was the result FSP_SUCCESS!
By the way, according to the code you also may have an (unpopulated) SPD at 0x50 (0xA0 in the code, this is just because the SMBUS address was shifted 1 bit left into the 8-bit address form. They are equivalent).
Corrected.
You also may need to set CaVrefConfig. "2" should be the correct value for DDR4. You should probably check that DqPinsInterleaved is correct, it's easiest if you can get this from the schematics.
Corrected (it was 0).
So that is already a good result. The laptop booted right up through SeaBIOS into Linux. Now the task is to fix the ACPI / PCI configuration. Obviously a lot of devices on PCI / SMBUS are still missing.
The devicetree I can figure my way around, and hack together a basic one that enables all necessary ports and devices, I guess.
But I am a bit lost regarding the ACPI tables. What is necessary, and is there a way to read the vendor tables and use them?
Thanks again,
Andreas
On Wed, Sep 8, 2021 at 8:13 AM Andreas Bauer andreas.bauer.nexus@gmail.com wrote:
On Tue, Sep 07, 2021 at 03:29:55PM -0400, Benjamin Doron wrote:
Hi, The return value 0x80000007 is EFI_DEVICE_ERROR, which is unfortunately for us, not very specific.
Thank you very much, your detailed commentary enabled me to work through the issue.
Sounds great! Glad I could help, just as others helped me once.
The schematics will be a big help to you, if you can get them. However, you can probably manage without them.
It seems some schematics are available commercially, but I tried to avoid that.
The relevant modules for setting FSP-M configuration are BoardConfigInitPreMem and possibly PlatformInitPreMem. Since the RCOMP
[...]
It seems the Acer board is indeed following the Intel reference implementation very closely. I wrote a small table of possible values, the ones you extracted and some others I found by hexdumping the PE file, and iterated over them in a bootloop, but always got the same DEVICE_ERROR.
Only after applying the Dq(s)Mapping (DqByteMapChX, DqsMapCpu2DramChX) from the Intel reference KabyLake board was the result FSP_SUCCESS!
What do you mean that you iterated over the table of possible sets of values? I guess it's possible that you tracked the boot number... Additionally, I find it interesting that the Dq/Dqs settings were required. Someday, I'd like to get to the bottom of that, but perhaps someone has, and that's why some boards now accept the defaults in the FSP binary.
By the way, according to the code you also may have an (unpopulated) SPD at 0x50 (0xA0 in the code, this is just because the SMBUS address was shifted 1 bit left into the 8-bit address form. They are equivalent).
Corrected.
You also may need to set CaVrefConfig. "2" should be the correct value for DDR4. You should probably check that DqPinsInterleaved is correct, it's easiest if you can get this from the schematics.
Corrected (it was 0).
These settings aren't guaranteed to be correct, but according to the first page of the schematics (which is available on the sites selling them), there are two DIMM slots.
Speculating on CaVrefConfig is easier. As I understand, "0" matches very early HW designs, and Intel changed the design to "2" when they found that it was easier for the HW design in some way. I don't know the story behind DqPinsInterleaved, so I don't know how to determine it without schematics or knowing the internal struct definitions of the vendor code.
Anyways, this is all good enough if you can boot, and you could always revise these if you get more information.
So that is already a good result. The laptop booted right up through SeaBIOS into Linux. Now the task is to fix the ACPI / PCI configuration. Obviously a lot of devices on PCI / SMBUS are still missing.
The devicetree I can figure my way around, and hack together a basic one that enables all necessary ports and devices, I guess.
I recommend getting into the vendor FW's "advanced" settings. It should give you most of the settings provided there to the FSP, but you can probably boot without getting everything right.
In the past, the IFR (Internal Forms Representation) could be hex-edited so that the entry "i-Page" would be displayed, which would allow the "Advanced" and "Power" pages to be enabled at runtime. However, it appears that Insyde is now using a more convoluted set of "suppressif" statements and I haven't unraveled these.
You can use IFRExtractor (on GitHub, you need the newer fork) to obtain the defaults, but it's a very large text file to look through (1.5 MiB). You could also use Insyde H2OUVE (proprietary, sometimes appears on the internet) on either a live system, a dumped or stock image to view the Setup. There's a drop-down in the tool to ignore the "suppressif" statements.
There are also other ways to get the same information: an inteltool and full lspci dump, which you then corroborate with the CPU and PCH datasheets available from Intel. This would require you to interpret the registers in the datasheet to get values for FSP UPDs. Also, since there is an enormous number of registers, it's only something you can feasibly do to find the values for specific UPDs you're uncertain about.
However, I don't think that this would get you the HSIO (High Speed I/O) settings, which tune the devices for the board-specific PCB routing. It's possible to follow how https://github.com/tianocore/edk2-platforms/blob/master/Platform/Intel/Kabyl... determines how the UPDs are set in https://github.com/tianocore/edk2-platforms/blob/master/Platform/Intel/Kabyl... and corroborate the table with a dump of relevant PCRs by inteltool.
If you're considering trying it, I recommend dumping the "MODPHYx" (read: HSIO) PCRs before and after. (PCRs are Private Configuration Registers; their number determines the base offset in the P2SB, the Primary to Sideband bridge.) To understand the differences, you would need to follow how that table (which contains register offsets, values and the means to determine the relevant PCR) sets the UPDs, and corroborate that with the differences in the registers for each PCR.
So, regarding the table: the first field of each entry is the HSIO lane. The lane number gives the relevant "MODPHY" PCR number by implication. The second field is the lane "owner" (PCIe, SATA, possibly USB3), but this is purely informational because you find yours in the FIA PCR. The third field is the register offset in this PCR, and the fourth and fifth are the value and a bitwise AND bitmask.
There's a fixed mapping between lanes and devices, determined by the *Get*LaneNum functions at https://github.com/tianocore/edk2-platforms/blob/master/Silicon/Intel/Kabyla..., based on registers in the FIA PCR, and this is how you determine which device's UPD to set.
Finally, a slightly shorter way is to use the code in PeiPchPolicyUpdatePreMem.c to see which registers it's interested in (definitions are in the KabylakeSiliconPkg directory of the same repository) and only consider those before-and-after differences.
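To make the table layout concrete, here is a rough sketch of comparing two register dumps against table entries. It is built on assumptions: the `HsioEntry` type, the dump format (a dict of offset to 32-bit value per PCR) and all numbers are invented for illustration, and the "and-then-or" update rule, new = (old AND mask) OR value, is my reading of "value and a bitwise AND bitmask" rather than something confirmed here.

```python
from typing import Dict, List, NamedTuple, Tuple


class HsioEntry(NamedTuple):
    """One table entry, fields in the order described above (hypothetical type)."""
    lane: int      # HSIO lane number, implies the MODPHY PCR
    owner: str     # informational only: "PCIe", "SATA", ...
    offset: int    # register offset inside the PCR
    value: int     # bits to set
    and_mask: int  # bits of the old value to keep


def apply_entry(old: int, entry: HsioEntry) -> int:
    # ASSUMPTION: and-then-or semantics; verify against the actual code.
    return ((old & entry.and_mask) | entry.value) & 0xFFFFFFFF


def changed_registers(before: Dict[int, int],
                      after: Dict[int, int]) -> Dict[int, Tuple[int, int]]:
    """Offsets present in both dumps whose values differ."""
    return {off: (before[off], after[off])
            for off in sorted(before)
            if off in after and before[off] != after[off]}


def entries_explaining_diff(before: Dict[int, int], after: Dict[int, int],
                            table: List[HsioEntry]) -> List[HsioEntry]:
    """Table entries whose application to `before` yields the `after` value."""
    diff = changed_registers(before, after)
    return [e for e in table
            if e.offset in diff and apply_entry(before[e.offset], e) == after[e.offset]]


# Made-up numbers, purely to exercise the functions.
before = {0x150: 0x00000000, 0x164: 0xFFFFFFFF}
after = {0x150: 0x00000000, 0x164: 0xFFFFF3FF}
table = [HsioEntry(11, "PCIe", 0x164, 0x00000300, 0xFFFFF0FF),
         HsioEntry(11, "PCIe", 0x150, 0x00000001, 0xFFFFFFFE)]

print(changed_registers(before, after))
print(entries_explaining_diff(before, after, table))
```

Only the first made-up entry explains the difference in this toy data. With real dumps, each MODPHY PCR would need its own pair of dictionaries, and the lane-to-PCR mapping still has to come from the FIA PCR as described.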
Yikes. I realise that this gets rather complicated, but I do want people to write better ports. If you send me the inteltool dumps for the MODPHY PCRs (0xEA, 0xE9, 0xA9 and 0xA8) and the FIA PCR (0xCF), I'll try to have a look when I get the chance. I think these PCRs are safe to dump, but to be safe, don't try it when an unexpected shutdown/hang would be problematic.
Hopefully you can get the USB2 config from the advanced settings. These can't be found in the HSIO PCRs, they are an "AFE" (analog front-end?) instead. There is a USB2 PCR, but I don't know the definitions. If you can't get the config, as always, the usual coreboot definitions in devicetree.cb are fairly good defaults.
On SMBUS: this laptop uses an Embedded Controller by ITE, so you may need one of the coreboot drivers to configure some of its settings. However, if the keyboard/touchpad work, this may be working already.
Some early warning about the HDA verbs: if the headphone jack doesn't work, it's likely that there are additional verbs that you couldn't retrieve at runtime. You can find them by expanding one of the AZALIA_* macros and searching a hexdump of the PeiBoardConfigInit module, which should be providing its table to FSP-S.
But I am a bit lost regarding the ACPI tables. What is necessary, and is there a way to read the vendor tables and use them?
Use `acpidump -b` to retrieve all tables from a running system and `iasl -d *` to disassemble them. It's the EC device that's most interesting. Copy it to a new .asl file (see other boards' acpi directories for examples), with some caveats:
- If you need to include a method that issues a write to a register in a "SystemIO" OperationRegion at 0xB2, that method is calling SMM code, which you don't have. You could reverse engineer the vendor's SMM code, but otherwise, remove the method issuing the write. You could also replace it with something like `Debug = "SMM called from method x"`, which is like printf for ACPI. These prints will appear in dmesg output if the kernel is built appropriately.
- Remove notifications to a "WMI" device. These are also SMM calls.
- If the primary OperationRegion with most EC registers is "SystemMemory," then the firmware is using the LGMR (LPC Generic Memory Range) feature. coreboot now supports this properly, so you would call lpc_open_mmio_window() in soc/intel/common/block/lpc/lpc_lib with the memory range.
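Finding the OperationRegions mentioned in these caveats can be partly automated. This is a minimal sketch that scans disassembled ASL text with a regular expression; it assumes the region address appears as a hex literal, and the region names in the sample are invented.

```python
import re
from typing import List, Optional, Tuple

# Matches e.g.: OperationRegion (NAME, SystemIO, 0xB2, 0x02)
# ASSUMPTION: the address is a hex literal, not a named object or expression.
OPREGION_RE = re.compile(
    r"OperationRegion\s*\(\s*(\w+)\s*,\s*(\w+)\s*,\s*(0x[0-9A-Fa-f]+)\s*,")


def find_opregions(dsl_text: str, space: str,
                   address: Optional[int] = None) -> List[Tuple[str, int]]:
    """Return (name, address) for regions in `space`, optionally at `address`."""
    hits = []
    for match in OPREGION_RE.finditer(dsl_text):
        name, region_space = match.group(1), match.group(2)
        region_addr = int(match.group(3), 16)
        if region_space == space and (address is None or region_addr == address):
            hits.append((name, region_addr))
    return hits


# Invented sample in the style of iasl output.
SAMPLE = """
OperationRegion (SMIO, SystemIO, 0xB2, 0x02)
OperationRegion (ECMM, SystemMemory, 0xFE800000, 0x1000)
OperationRegion (PMIO, SystemIO, 0x1800, 0x80)
"""

print(find_opregions(SAMPLE, "SystemIO", 0xB2))
print(find_opregions(SAMPLE, "SystemMemory"))
```

Once you know the region name, searching the same file for the Field declared on it gives the register names, and searching for stores to those names gives the methods to remove or stub out.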
I think that would be enough to create an initial ACPI table. Note that some ACPI devices, such as I2C HID (Human Interface Device, essentially, drivers to create more responsive devices) touchpads, can be created dynamically by coreboot through devicetree.cb.
Since this laptop apparently has a dGPU, you can enable power to it by examining its ACPI table for the GPIOs toggled in "SGON"/"HGON" by the "SGPO" (set GPIO output) and related methods. Since you don't have the schematics, dump the memory of the OperationRegion containing these registers, which gives you the GPIO encoded as a hex number, ORed with the high bit if the GPIO is active low, if I recall correctly.
I've definitely missed some parts, but after you consider adding some additional coreboot features, your initial port is done. For instance, the OS probably needs an NHLT ACPI table to make the microphone work (see examples in other mainboard.c files. It's a couple of CONFIG_ settings and function calls, wrapped around a blob).
Feel free to ask me to clarify any of this. Porting a board can involve many technologies, but I think it can teach a lot. Anyways, since you can boot already, hopefully this helps you finish your initial port, and you can fine-tune different aspects yourself, as you learn about them later.
Best regards, Benjamin
On Wed, Sep 08, 2021 at 06:37:17PM -0400, Benjamin Doron wrote:
It seems the Acer board is indeed following the Intel reference implementation very closely. I wrote a small table of possible values, the ones you extracted and some others I found by hexdumping the PE file, and iterated over them in a bootloop, but always got the same DEVICE_ERROR.
Only after applying the Dq(s)Mapping (DqByteMapChX, DqsMapCpu2DramChX) from the Intel reference KabyLake board was the result FSP_SUCCESS!
What do you mean that you iterated over the table of possible sets of values? I guess it's possible that you tracked the boot number... Additionally, I find it interesting that the Dq/Dqs settings were required. Someday, I'd like to get to the bottom of that, but perhaps someone has, and that's why some boards now accept the defaults in the FSP binary.
Yes, I enabled the bootnum counter and flash log, encoded a variety of possible combinations of those values in a table, and then patched drivers/intel/fsp2.0/memory_init.c to iterate over bootcounter MOD num_configs. If I got anything other than FSP_SUCCESS, I called full_reset().
That way I could try several different configurations without reflashing between each try.
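The selection logic described here can be simulated off-target. This is a sketch in Python rather than the actual patch to memory_init.c; `fake_memory_init` is a made-up stand-in for the FSP call, and its success condition is arbitrary and chosen only for the demo.

```python
from itertools import product

FSP_SUCCESS = 0x00000000
EFI_DEVICE_ERROR = 0x80000007

# Candidate values from earlier in the thread.
RESISTORS = [(200, 81, 162), (121, 75, 100), (121, 81, 100)]
TARGETS = [(100, 40, 40, 23, 40), (100, 40, 20, 20, 26), (60, 26, 20, 20, 26)]

# Every resistor/target combination becomes one table entry.
CONFIGS = list(product(RESISTORS, TARGETS))


def pick_config(boot_count, configs):
    """Select the configuration for this boot: bootcounter MOD num_configs."""
    return configs[boot_count % len(configs)]


def fake_memory_init(config):
    # ASSUMPTION: stand-in for the real FSP-M call; the "working" combination
    # here is arbitrary and only exists to make the simulation terminate.
    resistors, targets = config
    ok = resistors == (121, 81, 100) and targets == (100, 40, 20, 20, 26)
    return FSP_SUCCESS if ok else EFI_DEVICE_ERROR


def simulate_boot_loop(configs, memory_init, max_boots=32):
    """Try one configuration per boot until memory init succeeds."""
    for boot_count in range(max_boots):
        config = pick_config(boot_count, configs)
        if memory_init(config) == FSP_SUCCESS:
            return boot_count, config
        # On real hardware this is where full_reset() would be called.
    return None


print(simulate_boot_loop(CONFIGS, fake_memory_init))
```

Because the index wraps around, the loop keeps cycling through the table if nothing works, so a boot limit or a look at the flash log is needed to notice that every entry has been tried.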
By the way, during this process a thought occurred to me: this process of manually poking different configurations, basically trial and error, does not scale very well.
If we have a vendor bios that does all the magic sauce, and even know something about the structure of that vendor bios, and usually have enough space in the flash to fit both coreboot and vendor bios into the same chip, why not develop a sort of debugging hypervisor that documents all the mm/io registers modified, with exact timestamps and code flow?
Such a detailed boot log of all settings could be compared between different implementations of the same platform and a new port would be much easier.
// topic change
Thanks again for the detailed comments on further development. The status right now is missing I2C bus, missing a few PCI devices, some stability issues with the Wifi card which I was able to activate behind a PCIe root hub, and in general higher power consumption. S3 sleep is working though!
So a few things to work through, but I have to also give attention to other things outside computers. I will get back to this once time permits.
By the way, the vendor bios is extremely basic. There is literally not a single setting besides the glaringly necessary ones. It is maybe 20 settings over 5 pages, that's it. This laptop is supposed to target the computer illiterate, I guess.
regards,
Andreas
On 09.09.21 08:04, Andreas Bauer wrote:
By the way, during this process a thought occurred to me: this process of manually poking different configurations, basically trial and error, does not scale very well.
If we have a vendor bios that does all the magic sauce, and even know something about the structure of that vendor bios, and usually have enough space in the flash to fit both coreboot and vendor bios into the same chip, why not develop a sort of debugging hypervisor that documents all the mm/io registers modified, with exact timestamps and code flow?
Such a detailed boot log of all settings could be compared between different implementations of the same platform and a new port would be much easier.
You mean something like serialice.com? ;) This is not using a hypervisor but running the firmware in an emulator. But otherwise pretty much what you describe. It's always a bit fiddly to get it running, but I guess with a hypervisor it would be about the same.
Nico
Hi,
On 09.09.21 00:37, Benjamin Doron wrote:
On Wed, Sep 8, 2021 at 8:13 AM Andreas Bauer andreas.bauer.nexus@gmail.com wrote:
On Tue, Sep 07, 2021 at 03:29:55PM -0400, Benjamin Doron wrote:
The relevant modules for setting FSP-M configuration are BoardConfigInitPreMem and possibly PlatformInitPreMem. Since the RCOMP
[...]
It seems the Acer board is indeed following the Intel reference implementation very closely. I wrote a small table of possible values, the ones you extracted and some others I found by hexdumping the PE file, and iterated over them in a bootloop, but always got the same DEVICE_ERROR.
Only after applying the Dq(s)Mapping (DqByteMapChX, DqsMapCpu2DramChX) from the Intel reference KabyLake board was the result FSP_SUCCESS!
What do you mean that you iterated over the table of possible sets of values? I guess it's possible that you tracked the boot number... Additionally, I find it interesting that the Dq/Dqs settings were required. Someday, I'd like to get to the bottom of that, but perhaps someone has, and that's why some boards now accept the defaults in the FSP binary.
these settings are about soldered-down memory chips and how things are routed on the mainboard. I'm not sure if there are reasonable FSP defaults.
AFAIK, boards with DIMM slots don't need these settings. However, due to extensive copy-pasting such settings in the past, coreboot code made it look like all boards would require them.
I'm very curious how your overall settings look now. It seems possible that FSP is using these values when it shouldn't.
On 07.09.21 21:29, Benjamin Doron wrote:
- RCOMP targets: { 100, 40, 40, 23, 40 }, or { 100, 40, 20, 20, 26 }.
Apparently, there is also { 60, 26, 20, 20, 26 }, but this looks different to what I've seen.
- RCOMP resistors: { 200, 81, 162 }, or { 121, 75, 100 }, or { 121, 81, 100 }
Intel provides these numbers (and what resistors are supposed to be soldered on the board) per platform and memory combination (e.g. 1 DIMM per channel, 2, or various soldered-down configurations).
For KBL-U and SODIMMs it's 121, 81, 100 (resistors); 100, 40, 20, 20, 26 (targets).
Actually, with the table at hand one can derive these numbers from other settings and runtime detection easily. Why Intel makes FSP configuration so complicated is a mystery. They have them in an NDA spreadsheet. They could either put that data into coreboot or FSP but instead they prefer to make things a PITA. Probably to keep the lie up that everything is so complicated.
Nico