Hello.
I've been working on porting coreboot to the ASRock X370 Killer SLI/ac. I've made quite a lot of progress and can even boot Arch Linux off USB from either SeaBIOS or TianoCore.
However, there are a number of odds and ends that need to be fixed, which will require a lot of build/flash/boot cycles to figure out. The problem is that booting Linux in my board's current state is an absolute chore.
After GRUB loads the kernel and initramfs, boot goes quite smoothly for a while, but at some point it slows to an absolute crawl or almost halts entirely. I've found that plugging and unplugging a USB keyboard makes the boot progress a bit more each time, until it gets to the point where getty@ttyS0 comes up and sshd is running. Once ssh'd in, I can basically use it just as with the vendor firmware, a few outstanding issues notwithstanding. The same slowing/halting also occurs if I issue a shutdown command, and it requires the same plug/unplug manipulation to actually finish.
It has been suggested to me in a few places that the cause is a timer IRQ not getting through. I'm not certain that's actually it, but it's the only lead I have right now. The current state of my code is at https://review.coreboot.org/c/coreboot/+/77879 and is largely based on the code for the zork Chromebooks and the mandolin development board. As I can see no particular handling of timer IRQs in either, I'm at a loss as to what the missing piece of the puzzle could be.
Once I can build/flash/boot a bit faster, I'm almost certain I can get the USB ports coming off the X370 chipset working (PCIe and SATA both appear to be fully functional at the moment, more testing required), but as mentioned, it is *such a painful process* to actually boot Linux on it in its current state.
Any suggestions or solutions greatly appreciated.
Regards, Marty
Dear Marty,
On 26.10.23 at 12:29, Marty E. Plummer via coreboot wrote:
Thank you for your message and your detailed description. It’d be great if you could also attach the coreboot and Linux log messages.
Kind regards,
Paul
On Thu, Oct 26, 2023 at 01:50:10PM +0200, Paul Menzel wrote:
Find attached cbmem.log and dmesg.log. The latter is not as clean or complete as those from previous boots, because for some reason the Arch Linux install on the USB drive is entering the emergency shell (it probably needs some fixing after the occasional hard power-off of the machine in question).
Hi Marty,
On 26. 10. 23 at 12:29, Marty E. Plummer via coreboot wrote:
> It has been suggested to me in a few places that the cause is a timer IRQ not getting through. I'm not certain that's actually it, but it's the only lead I have right now. The current state of my code is at https://review.coreboot.org/c/coreboot/+/77879 and is largely based on the code for the zork Chromebooks and the mandolin development board. As I can see no particular handling of timer IRQs in either, I'm at a loss as to what the missing piece of the puzzle could be.
I think it might actually be an IRQ problem. Suppose your SATA controller is on the wrong IRQ and USB is on that same IRQ. Plugging and unplugging will then generate the "needed" IRQs to get things unstuck. The same could be true of any other driver that stalls the system while waiting for a missing IRQ.
The initrd and kernel are loaded via GRUB, so they don't need IRQs. After that, the block device is needed to load all the drivers and run programs, which is why it gets slower and slower: the other drivers that kept it running are not generating interrupts often enough...
Can you provide the output of cat /proc/interrupts (as root) from your system as well?
You can try to boot with the "irqpoll" option and see what happens. If it then works fine, it really is an IRQ issue.
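One way to follow up on the /proc/interrupts request is to take two snapshots a few seconds apart and list the rows whose counters did not move. The Python sketch below does that. The function names (parse_interrupts, stalled_irqs, snapshot) are made up for illustration, and the parsing assumes the usual layout of /proc/interrupts (a header row of CPU columns, then "label: counts... description"), so treat it as a starting point and not a tested tool.

```python
def parse_interrupts(text):
    """Return {label: (total_count, description)} from /proc/interrupts text."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        # Per-CPU counters come first; rows such as ERR carry fewer of them.
        while fields and len(counts) < ncpu and fields[0].isdigit():
            counts.append(int(fields.pop(0)))
        table[label.strip()] = (sum(counts), " ".join(fields))
    return table


def stalled_irqs(before, after):
    """Labels of rows with a description whose total did not change."""
    return sorted(
        label
        for label, (total, desc) in after.items()
        if desc and label in before and before[label][0] == total
    )


def snapshot(path="/proc/interrupts"):
    """Read and parse the live table (the path is the standard procfs file)."""
    with open(path) as handle:
        return parse_interrupts(handle.read())
```

To use it, call snapshot(), wait a few seconds while the boot is stuck, call snapshot() again, and pass both results to stalled_irqs(). A device that should be busy (for example the controller holding the rootfs) showing up in that list would point towards interrupts not being delivered.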
You can also boot with init=/bin/bash, which will drop you into the minimal initrd system (busybox). There you can rmmod/modprobe drivers and see if SATA actually works without them. You can try reloading them to see which one is causing your issue.
It might be that you are using MSI interrupts for AHCI, in which case this theory is wrong and the problem is elsewhere. Maybe some other driver is waiting for an IRQ and causes the same issue... Try to blacklist as many drivers as possible (for a full system boot) or try the above.
Can you confirm that your rootfs is on AHCI/SATA and that no NVMe drives are in play?
Thanks, Rudolf
On Fri, Oct 27, 2023 at 11:44:41AM +0200, Rudolf Marek wrote:
> You can also boot with init=/bin/bash, which will drop you into the minimal initrd system (busybox). There you can rmmod/modprobe drivers and see if SATA actually works without them. You can try reloading them to see which one is causing your issue.
I forget what happened, but I did try this at some point.
> Can you confirm that your rootfs is on AHCI/SATA and that no NVMe drives are in play?
Actually, neither. The rootfs is on a USB drive.
coreboot mailing list -- coreboot@coreboot.org To unsubscribe send an email to coreboot-leave@coreboot.org
Find attached interrupts and dmesg; this time the log buffer didn't overflow. However, you should still refer to the original one I attached to get an idea of what I see on serial and how many times I have to unplug/replug a USB device to 'get in'.
Oh, and here is dmesg and interrupts from stock firmware.
Fail. Here's the actual dmesg and interrupts from stock firmware.
Hi Marty,
On 27. 10. 23 at 13:26, Marty E. Plummer via coreboot wrote:
> Fail. Here's the actual dmesg and interrupts from stock firmware.
Sorry, I still haven't had time to look into this further. But the listings confirm that you mostly use MSI-X, so IRQ sharing/routing cannot be your problem. Can you please try to run Linux without C-states and without s2idle? Also, please check the errata sheet; maybe some erratum could cause an IRQ not to wake up the processor, which could fit your symptoms.
Try idle=poll mem_sleep_default=deep
Thanks, Rudolf
On Sun, Oct 29, 2023 at 04:26:01PM +0100, Rudolf Marek wrote:
> Sorry, I still haven't had time to look into this further. But the listings confirm that you mostly use MSI-X, so IRQ sharing/routing cannot be your problem. Can you please try to run Linux without C-states and without s2idle?
Is this request the same as the kernel command line options given below, or something different? Forgive me, I've never had to do anything *too* funky with the cmdline, so I'm not familiar with all the options.
> Also, please check the errata sheet; maybe some erratum could cause an IRQ not to wake up the processor, which could fit your symptoms.
Will do as soon as I find out where I can obtain said sheet; unfortunately, most of the public-facing PPRs AMD provides seem to be geared toward mobile/FP5/similar parts instead of desktop AM4 processors.
> Try idle=poll mem_sleep_default=deep
As above, does this turn off C-states and s2idle?
Hi Marty,
On 29. 10. 23 at 23:38, Marty E. Plummer via coreboot wrote:
> Is this request the same as the kernel command line options given below, or something different? Forgive me, I've never had to do anything *too* funky with the cmdline, so I'm not familiar with all the options.
Well, I don't know if coreboot can disable that. But basically you can try changing the command line as noted below.
> As above, does this turn off C-states and s2idle?
idle=poll avoids the Cx states, and mem_sleep_default=deep hopefully avoids s2idle.
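To confirm that these options actually reached the running kernel, it can help to check /proc/cmdline and /sys/power/mem_sleep after boot. The Python sketch below is only an illustration: the helper names (parse_cmdline, active_mem_sleep, report) are made up, it does not handle quoted parameter values containing spaces, and it relies on the kernel marking the active suspend mode in /sys/power/mem_sleep with square brackets.

```python
def parse_cmdline(text):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in text.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def active_mem_sleep(text):
    """Return the bracketed (active) mode from /sys/power/mem_sleep text."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def report(cmdline_text, mem_sleep_text):
    """Summarise the options discussed in this thread."""
    params = parse_cmdline(cmdline_text)
    return {
        "idle": params.get("idle"),
        "irqpoll": "irqpoll" in params,
        "mem_sleep_default": params.get("mem_sleep_default"),
        "active_mem_sleep": active_mem_sleep(mem_sleep_text),
    }
```

On the target, report(open("/proc/cmdline").read(), open("/sys/power/mem_sleep").read()) would show whether idle=poll and mem_sleep_default=deep were picked up and which suspend mode is currently selected.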
Thanks, Rudolf
On Mon, Oct 30, 2023 at 07:24:14AM +0100, Rudolf Marek wrote:
> Well, I don't know if coreboot can disable that. But basically you can try changing the command line as noted below.
I wasn't expecting coreboot to do it; I was just saying that I don't know many of the more esoteric kernel cmdline options.
> idle=poll avoids the Cx states, and mem_sleep_default=deep hopefully avoids s2idle.
No apparent effect, either with or without irqpoll. Waiting a significant amount of time does allow the boot to complete with irqpoll; I'm not sure if I said this before.
Also, I gave SATA booting a try. (Note: the SATA ports on this board come off the X370 chipset, unless you have one of those M.2 SATA drives, which on stock firmware you can use in either of the two M.2 ports, one off the CPU and the other off the chipset.) A mistake I made while prepping the disk (installing and enabling sshd) reminded me of another symptom I may not have mentioned on the list.
After doing the USB unplug/replug dance long enough for getty@ttyS0 to spin up, I *can* use it, but the problem is that I have to type what I want (no output is shown), hit enter, and unplug/replug the USB device, and only then does the command (or username/password) 'go through'.
Linked are the results of running the following script collection: https://github.com/martinlroth/public_scripts
https://files.catbox.moe/irqqcq.gz (too big to mail)
In addition, I've used systemd-analyze blame and plot to show the 'slow parts'. dev-sda3.device takes over 51 minutes (a reminder: this is a partition on a 0781:5581 SanDisk Corp. Ultra, plugged into one of the from-CPU USB 3.x ports). So using irqpoll *does work*, but the boot is *incredibly slow* all the same.
I've also attached a hopefully clean cbmem log and dmesg collected from userspace during this irqpoll boot.
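For comparing boots across firmware builds, the systemd-analyze blame output mentioned above can be reduced to a list of units that exceed some threshold. The Python sketch below is a rough illustration under stated assumptions: it expects each line to be a time span such as "51min 4.200s" or "250ms" followed by a unit name, the helper names (parse_blame, slower_than) are hypothetical, and time units other than h, min, s, ms and us are not handled.

```python
import re

# Seconds per time-span suffix handled by this sketch.
_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6}
_PART = re.compile(r"^(\d+(?:\.\d+)?)(h|min|ms|us|s)$")


def parse_blame(text):
    """Return [(seconds, unit_name)] sorted slowest first."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 2:
            continue
        seconds = 0.0
        for part in tokens[:-1]:
            match = _PART.match(part)
            if match is None:
                break  # not a time span; skip this line entirely
            seconds += float(match.group(1)) * _SECONDS[match.group(2)]
        else:
            rows.append((seconds, tokens[-1]))
    return sorted(rows, reverse=True)


def slower_than(rows, threshold_seconds):
    """Names of units that took longer than the threshold."""
    return [name for seconds, name in rows if seconds > threshold_seconds]
```

Feeding it the saved output of systemd-analyze blame from an irqpoll boot and from a stock-firmware boot, then comparing slower_than(rows, 60) for each, would give a quick view of which units are affected by the stall.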