I have finally got a KGPE-D16 coreboot build that boots consistently, although many issues and questions remain. I'm using the 4.11 branch, as this board has been dropped from the master branch.
The main issue is memory: memory initialization still seems to have problems. The Samsung 16GB sticks I currently have (normal or VLP) do not work when all orange slots are populated; the result is a boot loop caused by a DIMM training failure (the same error I got when trying coreboot on the KCMA-D8). However, on the KGPE-D16, with the 8 sticks installed in slots D1/D2, B1/B2, E1/E2 and G1/G2, I can always get past memory initialization, albeit with this error:
"TrainDQSRdWrPos: negative DQS recovery delay detected! Attempting to continue but your system may be unstable..."
This is currently the only combination that always boots (with all 128GB recognized). The other combinations I tested either produced "HARDWARE FAULT: HT_EVENT_HW_SYNCHFLOOD" errors, which caused a soft reset before DIMM training even started, or entered DIMM training but never completed it.
However, the system cannot yet boot properly into an operating system (or even a payload). As of yesterday, the only thing that boots and runs stably is a Menuet64 floppy image.
1. I'm using SeaBIOS master and have put several floppy images into CBFS, but it only lists the last floppy image I added as a boot option (Ramdisk). It appears multiple floppy images are not officially supported in master yet, and I've just located the patch for this. I'm still new to SeaBIOS, so I'm not sure how to modify the configuration or bootorder later on; I currently don't supply a configuration or bootorder file and let it use the default values.
2. None of the secondary payloads work. nvramcui hangs with something like "unable to init curs" (I only tried it once and can't remember the exact message). coreinfo and tint hang with the error "Not enough memory creating EHCI periodic frame list." memtest loads, but when it starts testing it hangs with a glitched screen of blinking characters, among which something like "d h l p t x" can be seen. The same happens with the memtest that comes on the Manjaro Linux boot USB.
3. Initially I had issues with graphical bootloaders like grub, but after disabling bootsplash image as well as not building the seavgabios.bin (by unchecking the "Include generated option rom that implements legacy VGA BIOS compatibility"), I could get Manjaro's graphical grub to load, but still cannot boot to the kernel (XZ-Compressed Data is corrupt). Maybe this is due to memory, as I think the combination I previously mentioned only guaranteed it to go past the training part, but not anything after that.
4. Does this board support any other graphical initialization methods? It seems I can only choose "native graphics init" as well as "legacy VGA text mode" framebuffer. I use SeaBIOS with a dedicated nVidia video card (onboard not used), and it seems without bootsplash image or seavgabios.bin I'm able to load something graphical (such as grub and Menuet64).
5. It seems the board would ramp up the fans to max once it passes the DIMM training and memory clear part (the fans seem to spin at a relatively quiet speed before that, similar to Stock BIOS' Generic profile). As the CPU fans I'm currently using can be really noisy at more than 4300 rpm max, I ended up using a 5.25'' external fan controller (non-PWM) and manually toned it down to an acceptable 2900 rpm (it seems I can't get it to any lower without using a PWM-capable controller). Is there a way to tweak the fan control profiles of the onboard CPU fan ports like the stock BIOS?
This is the first time I have ever managed to get a bootable coreboot system, so I'm not sure where to look for the causes of these issues, especially the RAM part. Is there an up-to-date RAM HCL and set of guidelines for this board?
At this point, I can always power it on to conduct more tests on this board for any additional logs if needed.
Just got back and tested a bit further.
1. The actual message nvramcui printed when hung was: initscr(): Unable to create SP exited with status 8
2. I may switch SeaBIOS to stable (1.12.1) and try again, as some SeaBIOS patches (like the one that enables multiple floppy disks) currently do not apply on master. I'm interested in SeaBIOS' support status for NVMe in recent versions, but I've read somewhere that the support was initially only enabled for QEMU, and I'm not sure about the current versions.
A summary of the status so far.
1. If I boot without my USB keyboard/mouse plugged in and use only PS/2 ones, coreinfo shows the following errors instead of "Not enough memory creating EHCI periodic frame list.":
   Not enough DMA memory for OHCI HCCA. (This happened only once.)
   drivers/usb/usb.c:42 new_controller(): Failed to malloc 588 bytes.
I'm not sure where to configure the memory-related settings for USB, but the USB stack is clearly complaining about running out of memory. Is there a way to find out how much memory was used and available at that point?
2. Without USB keyboard/mouse I can get memtest into SMP mode. However, it can only go as far as about 50-60% of the first test (address only, walking ones, no cache) before entering the same glitched screen. Non-SMP mode enters the glitched screen immediately, same as with USB keyboard/mouse.
3. With working 128GB RAM configuration, memtest correctly shows DRAM SPD (1600MHz CL11), but throws errors every 2GB past 32GB with the good and bad values showing inverted contents (bad is ffffffff when good is 00000000, and vice versa).
Yesterday I did a test with only 32GB by removing 6 of the 8 sticks installed. However, the DRAM SPD is not shown correctly, giving me a 4-5-4 timing of 0Mhz. With 32GB, there were no errors during the test process, but still entered the glitched screen at the same location (about 50-60% of the first test). This is tested using memtest master (the glitched screen also existed in stable and the result was the same).
4. For the 32GB configuration (2 x 16GB sticks), installing the sticks in the orange slot closest to each CPU would not boot; it booted when I installed them in the second-closest orange slot of each CPU.
5. Mike Banon's patches for boot menu and multiple floppy images applied without issues on SeaBIOS master (not sure why patchew.org's catalog showed some patches failed to apply). Most of the floppy images I included could boot (including graphical ones like Menuet64), but for images that come with their own USB stack, USB does not function at all (which means I have to use my PS/2 keyboard/mouse to work with them).
6. USB disks can be picked up and may load the bootloader, but this is the farthest I could go as for most USB sticks (Windows, Linux or others), the startup process would hang without any apparent messages. Some (not all) USB disks attached to ports provided by a PCIe x4 USB 3.0 (8-port) adapter can also be picked up by SeaBIOS with varying results (some could be entered before it hangs during boot, while others would fail and SeaBIOS boots the main ramdisk instead).
7. Disabling serial log console doesn't fix any issues I'm currently having, although interestingly, memtest can also print some screen contents to the serial console if it's enabled. I'm considering re-enabling it for the time being so I'll be able to see if a particular memory combination is working.
Memory initialization still seems to have problems, although some combinations can be made to boot. The major issues with the USB stack, however, are preventing me from doing any further testing, as the board currently has no SATA optical drive for installing operating systems and the like.
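The memtest result in item 3 above (good and bad values fully inverted, repeating every 2GB past 32GB) is a different signature from scattered single-bit flips, and it can be checked mechanically when going through a long error list. Below is a minimal sketch of such a check. The `mem_error` record and both function names are hypothetical and not part of memtest; the addresses would have to be copied by hand from the memtest screen or serial log. The thread does not establish what causes the pattern, so this only helps confirm that the pattern is really there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One memtest-style error record; the field names are hypothetical. */
struct mem_error {
    uint64_t addr;  /* failing physical address */
    uint32_t good;  /* value that was written */
    uint32_t bad;   /* value that was read back */
};

/* True when every bit read back is the complement of what was written. */
bool is_inverted(const struct mem_error *e)
{
    return e->bad == (uint32_t)~e->good;
}

/*
 * True when there is at least one error, every error is fully inverted,
 * and consecutive failing addresses are exactly `stride` bytes apart.
 */
bool inverted_at_stride(const struct mem_error *errs, size_t n,
                        uint64_t stride)
{
    assert(stride != 0);
    for (size_t i = 0; i < n; i++) {
        if (!is_inverted(&errs[i]))
            return false;
        if (i > 0 && errs[i].addr - errs[i - 1].addr != stride)
            return false;
    }
    return n > 0;
}
```

Calling `inverted_at_stride(errs, n, 0x80000000ULL)` on the collected records tests for the 2GB spacing described above; a `false` result means at least one error is not a full inversion or the spacing is irregular.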
ragnaros@tenebr.is wrote:
> - For 32GB configuration (2 x 16GB sticks), installing to the closest orange slot of each CPU would not boot, it booted when I installed the sticks to the second closest orange slot of each CPU.
If you install memory modules as far *away* from CPU/chipset as possible then you create more margin in DRAM signal integrity, which can make the system work more reliably even if memory initialization is not perfect.
The reason is that signals reflect everywhere on the memory bus. When the chipset drives signals and modules are installed nearby, the signal will reflect back at the last slot and possibly interfere with either the controller's request or the DRAM's response.
The same happens when the DRAM drives signals in response to requests. They go out from the DRAM both left towards the chipset and right towards those unpopulated memory slots, and then reflect there, possibly interfering with what the DRAM sent or with the next request from the controller.
Mainboard memory busses, especially with many slots, go right up to the limit of physics, and yes there is supposed to be a little bit of margin and there are some workarounds available, but all of that is the responsibility of the memory initialization, and it's very easy to not get everything 100% right. Then stuff doesn't always work.
With this in mind, the safest bet should be to populate a single DRAM module per channel, as far away from the chipset as possible.
//Peter
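The rule of thumb above (one module per channel, in the slot farthest from the controller) can be written down as a small selection routine. This is only a sketch under stated assumptions: the `dimm_slot` table, its `distance` ranking and the function name are hypothetical, and the real slot-to-channel mapping and physical ordering for a given board have to come from its manual or layout, not from this code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot descriptor; fill it in from the board documentation. */
struct dimm_slot {
    const char *name;  /* silkscreen label (illustrative only) */
    int channel;       /* memory channel index, 0-based */
    int distance;      /* rank along the channel; larger = farther away */
};

/*
 * For each channel in [0, nchannels), store in out[ch] the index of the
 * slot with the largest distance, or -1 if the channel has no slot.
 */
void pick_farthest_slots(const struct dimm_slot *slots, size_t nslots,
                         int *out, int nchannels)
{
    assert(nchannels >= 0);
    for (int ch = 0; ch < nchannels; ch++)
        out[ch] = -1;

    for (size_t i = 0; i < nslots; i++) {
        int ch = slots[i].channel;
        if (ch < 0 || ch >= nchannels)
            continue;  /* ignore slots outside the requested range */
        if (out[ch] < 0 || slots[i].distance > slots[out[ch]].distance)
            out[ch] = (int)i;
    }
}
```

With a table of two slots per channel, the routine returns one index per channel, which is the population to try first when memory training is marginal.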
Hi there, KGPE-D16 friends.
1) Please look at my not-merged-yet "AMD_XMP" changes on review.coreboot.org: they could help you to either use an XMP 1 or XMP 2 memory profile (should exist on ALL your sticks), or - preferably - to set up your custom memory profile that will override the SPD values. This way, maybe you could figure out the values that make your RAM sticks run stable. You'll need to port this code to the AMD fam10h architecture, to which KGPE-D16 belongs (if I'm not mistaken); hopefully that wouldn't be hard.
2) USB is broken in floppy-based OSes - maybe fam10h has the same "IRQ routing" issue as fam15h. Sophisticated OSes like Linux somehow get around this issue, while simpler ones cannot.
3) Glad you liked my floppy patches and collection: there are certainly a lot of goodies that are fun and useful. To save you time, the latest revisions of these and other unofficial patches can usually be obtained automatically with the csb_patcher.sh script from change 33509. I'm trying my best to keep this stuff up to date with coreboot master, as well as to get at least some of these changes merged to reduce the maintenance.
Best regards, Mike Banon
On Thu, May 14, 2020 at 9:28 PM Peter Stuge peter@stuge.se wrote:
[...]
It seems some standard combinations as recommended by the manual do not work with coreboot while some non-standard combinations do, so one cannot really refer to the ASUS manual for memory installation if using coreboot.
However, it seems memory is not the main issue here as I just tried using 4 x 8GB Micron DDR3-1600 ECC Unbuffered sticks for 32GB, and it did not make any difference compared to 2 x 16GB Samsung DDR3-1600 ECC REG VLP sticks that I previously used.
The issue I'm having is more likely USB-related, and it seems to be in libpayload. The function "dma_memalign" might be the culprit, as the error message "Not enough memory creating EHCI periodic frame list" is printed when dma_memalign returns NULL.
Reference: payloads/libpayload/drivers/usb/ehci.c (from line 809):

    /* Initialize periodic frame list */
    /* 1024 32-bit pointers, 4kb aligned */
    u32 *const periodic_list = (u32 *)dma_memalign(4096, 1024 * sizeof(u32));
    if (!periodic_list)
        fatal("Not enough memory creating EHCI periodic frame list.\n");
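To illustrate how a 4096-byte-aligned request like the one above can fail even when the allocator still has more than 4096 bytes free, here is a toy aligned allocator over a fixed pool. It is a sketch only and NOT libpayload's dma_memalign implementation: the `pool` structure and `pool_memalign` are hypothetical names, and the bump-pointer strategy was chosen for brevity. Whether heap exhaustion is what actually happens on this board is not confirmed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy bump allocator over a fixed pool; not libpayload's allocator. */
struct pool {
    uintptr_t base;  /* start address of the pool */
    size_t size;     /* total bytes in the pool */
    size_t used;     /* bytes consumed so far, including padding */
};

/*
 * Return an `align`-aligned block of `len` bytes, or NULL when the
 * alignment padding plus the length no longer fit in the pool.
 * `align` must be a power of two.
 */
void *pool_memalign(struct pool *p, size_t align, size_t len)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    uintptr_t cur = p->base + p->used;
    uintptr_t aligned = (cur + (align - 1)) & ~(uintptr_t)(align - 1);
    size_t pad = (size_t)(aligned - cur);
    size_t avail = p->size - p->used;

    if (pad > avail || len > avail - pad)
        return NULL;  /* the outcome the EHCI code treats as fatal */

    p->used += pad + len;
    return (void *)aligned;
}
```

For example, a 6144-byte pool starting at address 0x1100 cannot satisfy `pool_memalign(&p, 4096, 1024 * sizeof(uint32_t))`: 3840 bytes are lost reaching the next 4096-byte boundary, leaving 2304 bytes. The same request succeeds once the pool is large enough to absorb the padding.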
By the way, I'm using two Opteron 6386 SE CPUs. Not sure if 6300 family requires something else other than microcode (which has been generated from tree).
I have two KGPE-D16's running coreboot 4.11. One runs as my server and uses FreeBSD, the other is my desktop and dual boots FreeBSD and Linux. Both work great. Both have 64 GB RAM (4x Kingston KVR16R11D4/16). Keep in mind that the section of the ASUS manual on positioning RAM sticks is wrong: I needed to put the sticks into other slots (this was also true with the ASUS BIOS).
One other issue is that redirecting SeaBIOS output to the serial console makes video output unavailable if you use the AST2050 (you get only the serial console). But once the OS is booted, it takes control of that, so you can have both serial and video after booting. Using both consoles does work in SeaBIOS when using an external GPU, however.
Other than those two issues, boards work great and I'm kind of curious why people have so many problems.
One more issue I just remembered is that FreeBSD was unstable with cpu_cc6_state enabled. It just crashed when putting more load on the system (no panic whatsoever, just a reboot). Disabling it made things stable.
Maybe try disabling it?
Hi,
3mdeb office has two of these KGPE-D16 boards. We also experience some issues with booting. Built an image from 4.11 branch using stable SeaBIOS and enabled console over serial port. Using 1x Kingston KVR16R11D4/16 16GB ECC RAM and booting without issues.
2. Secondary payloads may require some tuning of the libpayload config, for example to increase the stack size and heap size (I noticed issues with USB when the stack or heap was too small).
3. The graphics is exposed by the BMC, an ASPEED AST2050, which should allow display output. Problems may occur when GRUB switches display mode or similar. The kernel booting problems may be related to the memory training problem.
4. If you have a discrete GPU, you may include its VGA option ROM in CBFS using the correct naming; SeaBIOS should execute it and provide graphics output. There is also a Kconfig option "Onboard VGA is primary"; try deselecting it when using a discrete GPU.
5. If you do not have the BMC running, the fans will keep running at full speed IIRC. The BMC is responsible for fan control. In order to use the BMC one needs the BMC flash chip module. The fans spin up to max speed during hardware enumeration in ramstage, probably during Super I/O initialization and BMC initialization. I am not yet familiar with this board. However, if you look into the Winbond chip drivers used on the KGPE you may find some settings to tweak the speed; I am not sure.
The only RAM HCL I found are:
https://www.coreboot.org/Board:asus/kgpe-d16#RAM_HCL
https://libreboot.org/docs/hardware/kgpe-d16.html#memory-compatibility-with-...
https://www.raptorengineering.com/coreboot/kgpe-d16-status.php
I would gladly help to resolve those issues, but we lack the resources to do it "pro publico bono". 3mdeb is trying to change that together with Insurgo Technologies Libres; we are gathering funds for the development and revival of this platform: https://github.com/osresearch/heads/issues/719
Regards,
On 09.05.20 12:51, Michal Zygowski wrote:
> - If you have a discrete GPU, you may include its VGA option ROM into CBFS using correct naming, SeaBIOS should execute it and provide graphics output. There is also an option in Kconfig "Onboard VGA is primary", try to deselect it when using discrete GPU.
There should be no need to add the option ROM of a plug-in card. SeaBIOS should automatically pick it up. Besides turning "Onboard VGA is primary" _off_, you should also disable all graphics init in coreboot ("Graphics initialization (None)"). For plug-in cards, it's best to leave it to SeaBIOS.
Hope that helps, Nico