Hi all,
I am debugging an AMD Picasso board. When the Linux kernel boots, there is an error message in dmesg:
do_IRQ: 1.55 No irq handler for vector
The kernel can still boot. What does this message mean? Can I just ignore this message?
The complete dmesg is attached.
Zheng
[ 0.032000] ... value mask: 0000ffffffffffff
[ 0.032000] ... max period: 00007fffffffffff
[ 0.032000] ... fixed-purpose events: 0
[ 0.032000] ... event mask: 000000000000003f
[ 0.032000] Hierarchical SRCU implementation.
[ 0.032000] smp: Bringing up secondary CPUs ...
[ 0.032000] x86: Booting SMP configuration:
[ 0.032000] .... node #0, CPUs: #1
[ 0.004000] do_IRQ: 1.55 No irq handler for vector
[ 0.032000] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
[ 0.032000] #2
[ 0.004000] do_IRQ: 2.55 No irq handler for vector
[ 0.032070] #3
[ 0.004000] do_IRQ: 3.55 No irq handler for vector
[ 0.036014] smp: Brought up 1 node, 4 CPUs
[ 0.036014] smpboot: Max logical packages: 1
[ 0.036014] smpboot: Total of 4 processors activated (20761.52 BogoMIPS)
[ 0.036664] devtmpfs: initialized
Hi Zheng,
On 03.05.20 17:27, Zheng Bao wrote:
> I am debugging the AMD Picasso board. When Linux kernel boots, there is some error message in dmesg. do_IRQ: 1.55 No irq handler for vector
> The kernel can still boot. What does this message mean? Can I just ignore this message?
Well, I'm worried. Even if it probably breaks nothing, it seems the APs are not in the state that Linux expects when it starts them up.
1.55 means interrupt vector 55 on CPU#1. This is in Linux's legacy interrupt range and should be IRQ 7 (vector 55 minus the offset of 48).
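The decoding above can be sketched in a few lines of C. This is only an illustration, not kernel code: the helper names `parse_do_irq_line()` and `legacy_irq_from_vector()` are made up for this example, and the mapping of legacy IRQs 0..15 to vectors 48..63 is taken from the "offset by 48" statement above.

```c
#include <stdio.h>

/* Assumption: legacy (ISA) IRQs 0..15 map to vectors 48..63,
 * i.e. the "offset by 48" mentioned in the message above. */
#define ISA_IRQ_VECTOR_BASE 48
#define ISA_IRQ_COUNT       16

/* Hypothetical helper: returns the legacy IRQ number for a vector,
 * or -1 if the vector is outside the legacy range. */
static int legacy_irq_from_vector(int vector)
{
	int irq = vector - ISA_IRQ_VECTOR_BASE;

	return (irq >= 0 && irq < ISA_IRQ_COUNT) ? irq : -1;
}

/* Hypothetical helper: parses "do_IRQ: <cpu>.<vector> ..." as seen in
 * the dmesg excerpt. Both numbers are printed in decimal.
 * Returns 0 on success, -1 if the line does not match. */
static int parse_do_irq_line(const char *line, int *cpu, int *vector)
{
	return sscanf(line, "do_IRQ: %d.%d", cpu, vector) == 2 ? 0 : -1;
}
```

Feeding it the message from the log, `parse_do_irq_line()` yields CPU 1 and vector 55, and `legacy_irq_from_vector(55)` yields 7.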
That's where my Linux knowledge ended but I just had a quick look: In `arch/x86/kernel/smpboot.c` we have start_secondary() which runs on the AP. It calls:
load_current_idt();
...
lapic_online();
...
local_irq_enable();
A comment above the declaration of load_current_idt() indicates that we don't expect IRQs yet (before local_irq_enable() via `sti` I guess). That the kernel complains "No irq handler for vector" likely means that an interrupt is triggered before lapic_online() registered the handler.
So, for me there are two mysteries:
1. Why is IRQ 7 triggered?
2. Why does the AP process interrupts before `sti`? (if my assessment above is correct).
Did you run any payload between coreboot and the kernel?
Nico
PS. I didn't think you'd ask without googling first; there are mailing list threads about the issue already. Let's see what they say...
Which board are you testing with? Picasso support is still undergoing heavy development and we are working on getting parity with the chromiumos fork: https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/th...
The payload I run between coreboot and Linux is SeaBIOS. Do you mean SeaBIOS should disable interrupts on all APs? I checked the mailing list and found that this issue was raised before, but that was actually by my colleague. We worked on it together and have not found a final solution.
Zheng
________________________________
From: Nico Huber nico.h@gmx.de
Sent: Sunday, May 3, 2020 4:28 PM
To: Zheng Bao fishbaoz@hotmail.com; coreboot coreboot@coreboot.org
Subject: Re: [coreboot] Linux kernel says "do_IRQ: 1.55 No irq handler for vector"
Hi all,
On 03.05.2020 18:28, Nico Huber wrote:
> Hi Zheng,
> On 03.05.20 17:27, Zheng Bao wrote:
>> I am debugging the AMD Picasso board. When Linux kernel boots, there is some error message in dmesg. do_IRQ: 1.55 No irq handler for vector
We've noticed the same on PC Engines apu2, an older AMD board.
> So, for me there are two mysteries:
> - Why is IRQ 7 triggered?
IRQ 7 is the legacy spurious interrupt vector. It may be caused by the timer tick interrupt; see section 2.4.8.1.10 of [1], or the corresponding document for other platforms.
However, from what we have observed, this happens more than 50% of the time, so there may be another interrupt involved, one that is deasserted e.g. by the INIT signal.
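One common way to tell a spurious IRQ 7 from a real one on a legacy 8259 PIC is to look at the master PIC's in-service register (ISR): a real IRQ 7 sets ISR bit 7, a spurious one does not. The sketch below is only an illustration of that check. The function name `irq7_is_spurious()` is made up for this example, and the ISR value is passed in as a parameter so the logic can be shown without port I/O.

```c
#include <stdbool.h>
#include <stdint.h>

#define PIC1_CMD     0x20 /* master 8259 command port */
#define PIC_READ_ISR 0x0b /* OCW3: next read of the command port returns the ISR */

/* Hypothetical helper: given the master PIC's in-service register,
 * report whether a delivered IRQ 7 was spurious. A real IRQ 7 has
 * ISR bit 7 set; a spurious one leaves it clear. */
static bool irq7_is_spurious(uint8_t master_isr)
{
	return (master_isr & (1u << 7)) == 0;
}
```

On real hardware the ISR value would be obtained by writing `PIC_READ_ISR` to `PIC1_CMD` and then reading `PIC1_CMD` back, using whatever port I/O helpers the environment provides (assumed here, not shown).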
> - Why does the AP process interrupts before `sti`? (if my assessment above is correct).
> Did you run any payload between coreboot and the kernel?
APs are enabled by AGESA; the interrupt might have been latched since then, rather than being caused by whatever the payload was doing.
[1] https://www.amd.com/system/files/TechDocs/52740_16h_Models_30h-3Fh_BKDG.pdf
Not setting ExtINT in LVT0 on the APs (see the patch below) fixes this IRQ issue.
Is this a common problem, or is it AMD-specific?
Zheng
diff --git a/src/cpu/x86/lapic/lapic.c b/src/cpu/x86/lapic/lapic.c
index 91b0fcd5ba..0dff625eb7 100644
--- a/src/cpu/x86/lapic/lapic.c
+++ b/src/cpu/x86/lapic/lapic.c
@@ -29,6 +29,8 @@ void do_lapic_init(void)
 	lapic_write_around(LAPIC_SPIV,
 		(lapic_read_around(LAPIC_SPIV) & ~(LAPIC_VECTOR_MASK))
 			| LAPIC_SPIV_ENABLE);
+
+	if (lapicid() == 0)
 	lapic_write_around(LAPIC_LVT0,
 		(lapic_read_around(LAPIC_LVT0) &
 			~(LAPIC_LVT_MASKED | LAPIC_LVT_LEVEL_TRIGGER |
@@ -38,6 +40,16 @@ void do_lapic_init(void)
 			| (LAPIC_LVT_REMOTE_IRR | LAPIC_SEND_PENDING |
 				LAPIC_DELIVERY_MODE_EXTINT)
 		);
+	else
+		lapic_write_around(LAPIC_LVT0,
+			(lapic_read_around(LAPIC_LVT0) &
+				~(LAPIC_LVT_MASKED | LAPIC_LVT_LEVEL_TRIGGER |
+				  LAPIC_LVT_REMOTE_IRR | LAPIC_INPUT_POLARITY |
+				  LAPIC_SEND_PENDING | LAPIC_LVT_RESERVED_1 |
+				  LAPIC_DELIVERY_MODE_MASK))
+			| (LAPIC_LVT_REMOTE_IRR | LAPIC_SEND_PENDING)
+		);
+
 	lapic_write_around(LAPIC_LVT1,
 		(lapic_read_around(LAPIC_LVT1) &
 			~(LAPIC_LVT_MASKED | LAPIC_LVT_LEVEL_TRIGGER |
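To make the effect of the patch easier to see, here is a standalone sketch of the LVT0 value it computes for the BSP versus the APs. It is not the coreboot code itself. The function `lvt0_value()` is made up for this example, the bit definitions follow the standard local APIC LVT layout and are redefined locally, and the BSP clear mask (which the diff elides between hunks) is assumed to be the same as the one in the `else` branch. The patch itself identifies the BSP by `lapicid() == 0`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local APIC LVT bits (standard layout); names mirror those in the patch. */
#define LAPIC_LVT_MASKED           (1u << 16)
#define LAPIC_LVT_LEVEL_TRIGGER    (1u << 15)
#define LAPIC_LVT_REMOTE_IRR       (1u << 14)
#define LAPIC_INPUT_POLARITY       (1u << 13)
#define LAPIC_SEND_PENDING         (1u << 12)
#define LAPIC_LVT_RESERVED_1       (1u << 11)
#define LAPIC_DELIVERY_MODE_MASK   (7u << 8)
#define LAPIC_DELIVERY_MODE_EXTINT (7u << 8)

/* Hypothetical helper: computes the new LVT0 value from the old one.
 * Assumption: the BSP branch clears the same bits as the AP branch. */
static uint32_t lvt0_value(uint32_t old, bool is_bsp)
{
	uint32_t v = old & ~(LAPIC_LVT_MASKED | LAPIC_LVT_LEVEL_TRIGGER |
			     LAPIC_LVT_REMOTE_IRR | LAPIC_INPUT_POLARITY |
			     LAPIC_SEND_PENDING | LAPIC_LVT_RESERVED_1 |
			     LAPIC_DELIVERY_MODE_MASK);

	v |= LAPIC_LVT_REMOTE_IRR | LAPIC_SEND_PENDING;

	/* Only the BSP gets ExtINT delivery; on APs the delivery mode
	 * field stays cleared. */
	if (is_bsp)
		v |= LAPIC_DELIVERY_MODE_EXTINT;

	return v;
}
```

Starting from a masked LVT0 of 0x00010000, this gives 0x5700 for the BSP and 0x5000 for an AP; the only difference is the ExtINT delivery mode field.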
It seems to be a common problem on AMD: I've seen it on fam15h Richland boards. It might be possible to fix; I just never had the time to investigate and solve it. Aside from these angry messages in dmesg, I don't know whether there are any other side effects.
On Wed, May 13, 2020 at 4:54 AM Zheng Bao fishbaoz@hotmail.com wrote:
> not setting Extint can fix this IRQ issue.
> Is it a common problem or it is a AMD specific one?
_______________________________________________
coreboot mailing list -- coreboot@coreboot.org
To unsubscribe send an email to coreboot-leave@coreboot.org