I apologize that my questions are not about coreboot and I am sort of hijacking the forum. But I think that this is a place where people with good knowledge of the hardware can be found. Maybe my issue will be of interest to you as well. And I hope to get some help. Maybe even some secrets shared :-)
First, the hardware that I am talking about: it's a typical consumer system with a Family 10h AMD processor, SB700 southbridge and 780G northbridge: http://www.gigabyte.com/products/product-page.aspx?pid=3004#sp
What I want to achieve is to get an interrupt generated by a SuperIO chip (external to the chipset) delivered to a CPU as an NMI. The interrupt is IRQ3, connected to pin 3 of the IO-APIC, and everything just works if I forget about NMI:
device -> IO-APIC pin 3 -> (fixed mode interrupt message) -> Local APIC -> interrupt handler at the configured vector.
NMI is what makes it interesting.
So, the first thing I tried is simply to set the NMI delivery mode for the pin. Unfortunately, that does not work, the system gets reset as soon as the interrupt is generated. So, my first question is: can that be made to work at all? Perhaps, there are some registers that need to be correctly programmed in the chipset or in the processor for that to work. Or maybe it can not work at all. For example, for Intel ICH9R southbridge it is documented that SMI, NMI and INIT must not be used. I couldn't find any such restriction explicitly stated for the AMD chipsets.
So, I decided to not give up and to try to use the legacy interrupt mode to get what I want. I think that that's how Linux NMI watchdog driver used to work. So, I programmed LINT0 and LINT1 for NMI delivery mode (on all cores, all two of them), enabled legacy PIC (I guess that it's built into the chipset) and made sure that the interrupt is unmasked. But absolutely nothing happened when the interrupt is generated.
From this I concluded that the PIC is not connected to the CPUs.
Just to be sure that I didn't make any mistake with the PIC programming I decided to check that the only other possibility worked, that is, that the PIC is connected to pin 0 of the IO-APIC. By default the pin was masked (by the OS, I guess), so I programmed it to ExtINT delivery mode with the BSP as the physical destination (also edge-triggered, active high). What I observed next was a bit surprising. Every time I generated the interrupt the target CPU would set bit 6 (received illegal vector) in its Local APIC's error status register. I concluded that the interrupt got routed from the PIC to the pin 0 of the IO-APIC, but then there was a problem delivering the ExtINT message.
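For readers following along, here is a minimal sketch of how an error status register value like the one described above could be decoded. The bit names follow the generic x86 local APIC layout (only bit 6 is confirmed by this message); the helper name `decode_esr` is made up for illustration, and bits not listed are simply left out of the sketch.

```python
# Local APIC Error Status Register bits, per the generic x86 APIC layout.
# Assumption: only bit 6 is confirmed by the thread; the other names are
# the commonly documented ones. Unlisted bits are ignored by this sketch.
ESR_BITS = {
    2: "send accept error",
    3: "receive accept error",
    5: "sent illegal vector",
    6: "received illegal vector",
    7: "illegal register address",
}


def decode_esr(value):
    """Return the names of the known error bits set in an ESR value."""
    return [name for bit, name in sorted(ESR_BITS.items()) if value & (1 << bit)]


# Bit 6 alone, as observed after each ExtINT delivery attempt.
print(decode_esr(0x40))
```

An ExtINT or fixed message that carries vector 0 would be expected to raise this bit, since vectors 0 to 15 are not valid for fixed delivery.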
I looked for mentions of ExtINT in the Family 10h BKDG and stumbled upon the LintEn bit in the F0x68 register. The bit is described as follows:
LintEn: local interrupt conversion enable. Read-write. 1=Enables the conversion of broadcast ExtInt and NMI interrupt requests to LINT0 and LINT1 local interrupts, respectively, before delivering to the local APIC. This conversion only takes place if the local APIC is hardware enabled. LINT0 and LINT1 are controlled by APIC350 and APIC360. 0=ExtInt/NMI interrupts delivered unchanged.
The bit was unset. I decided to set it and see what happens. Much to my surprise I got NMIs delivered to both cores. Then I remembered that I still had NMI delivery mode set for LINT0 on both of them. And this happened despite the destination being programmed to zero (the BSP's APIC ID), not the broadcast address. Another weird thing was that the 'received illegal vector' bit was still getting set.
So, I got what I wanted, but in a complex configuration:
device -> PIC -> IO-APIC pin 0 -> (ExtINT) -> (converted to LINT0) -> NMI
I decided to simplify that scheme a little bit to this:
device -> IO-APIC pin 3 -> (ExtINT) -> (converted to LINT0) -> NMI
so that I didn't have to enable the PIC. That is, I reprogrammed pin 3 to ExtINT delivery mode, masked pin 0 and disabled the PIC. But that didn't work. Regardless of how LintEn was set, I was getting just the bad vector errors without an NMI.
Then I decided to incrementally go back to the working configuration. I enabled the PIC - still no NMI. I unmasked IRQ3 - got an NMI! Pin 0 was still masked. So, this working configuration was slightly different from the original working configuration:
device -> PIC [IRQ3 unmasked] -> IO-APIC pin 0 -> (Masked)
       -> IO-APIC pin 3 -> (ExtINT) -> (converted to LINT0) -> NMI
So, the interrupt was going through the APIC route, but the legacy stuff had to be enabled for ExtINT to be converted to LINT0.
I think that this is an interesting discovery: PIC's configuration affects how the IO-APIC communicates to the Local APICs.
I went over the BKDG and SB7x0 documentation (RRG, RPG) and only PCI_Reg 62h of device 20 function 0 in SB7x0 caught my eye. Namely, K8_INTR, MT3_Set and MT3_Auto bits. They all are about K8 INTR [NMI] message. On my system the register is set to 0x24, that is K8_INTR is set, but MT3_Set and MT3_Auto are not... In the coreboot source code I see that the register is set up with exactly the same value in src/southbridge/amd/sb700/early_setup.c. And, for what it's worth, bit 5 (0x20 mask) is documented as "reserved".
I would greatly appreciate it if anyone could tell me how to configure the system to get NMIs working in the simplest possible fashion. That is, without enabling the PIC, and preferably without requiring the message-to-LINTn conversion with LintEn.
I will be happy with just a magic setting. But if there is an explanation of what the settings do and how interrupt routing works within the chipset and between the chipset and the CPU, then that would be terrific!
I hope that this was not a boring read. Thank you!
Hi Andriy,
First, the hardware that I am talking about: it's a typical consumer system with a Family 10h AMD processor, SB700 southbridge and 780G northbridge: http://www.gigabyte.com/products/product-page.aspx?pid=3004#sp
What superIO is it?
What I want to achieve is to get an interrupt generated by a SuperIO chip (external to the chipset) delivered to a CPU as a NMI.
You could set a bit which enables NMI when IOCHKCK# is asserted. But I guess the SuperIO won't generate it for you, unless you have your own gadget.
So, the first thing I tried is simply to set the NMI delivery mode for the pin. Unfortunately, that does not work, the system gets reset as soon as the interrupt is generated. So, my first question is: can that be made to work at all?
Did you set the vector to 0x0? Maybe it did not like the rest of the fields?
So, I decided to not give up and to try to use the legacy interrupt mode to get what I want. I think that that's how Linux NMI watchdog driver used to work. So, I programmed LINT0 and LINT1 for NMI delivery mode (on all cores, all two of them), enabled legacy PIC (I guess that it's built into the chipset) and made sure that the interrupt is unmasked. But absolutely nothing happened when the interrupt is generated. From this I concluded that the PIC is not connected to the CPUs.
The PIC is "connected" to the CPU via LINT0 if it is set to ExtINT, or through the entry in the IO-APIC.
I think that this is an interesting discovery: PIC's configuration affects how the IO-APIC communicates to the Local APICs.
This has to do with the HyperTransport specification. Please have a look at
Table 103. x86 Interrupt Request Packet Format (HT spec version 1.1). MT3 is explained there.
I went over the BKDG and SB7x0 documentation (RRG, RPG) and only PCI_Reg 62h of device 20 function 0 in SB7x0 caught my eye. Namely, K8_INTR, MT3_Set and MT3_Auto bits. They all are about K8 INTR [NMI] message. On my system the register is set to 0x24, that is K8_INTR is set, but MT3_Set and MT3_Auto are not...
So, this just helps to generate the right format on the HT bus... maybe the CPU is confused... and you will need to change it. Now it should be clearer what MT3 means.
In the coreboot source code I see that the register is set up with exactly the same value in src/southbridge/amd/sb700/early_setup.c. And, for what it's worth, bit 5 (0x20 mask) is documented as "reserved".
It could be something unrelated, maybe a leftover from SB600, where the bit is documented. It has something to do with SMI on USB.
I hope that this was not a boring read.
Nope, it was quite interesting. Why do you need to trigger an NMI from the SuperIO? Maybe there is another solution. I think at least some silicon revisions of SB700 had some magic bit which triggered an NMI when written.
You can perhaps generate an NMI using MSI/MSI-X or HPET (I tried this).
Thanks,
Rudolf
On 08/10/2016 20:35, Rudolf Marek wrote:
Hi Andriy,
First, the hardware that I am talking about: it's a typical consumer system with a Family 10h AMD processor, SB700 southbridge and 780G northbridge: http://www.gigabyte.com/products/product-page.aspx?pid=3004#sp
What superIO is it?
On that system it's ITE IT8718F. On the other system that I tested (see my other post) it's IT8721F.
What I want to achieve is to get an interrupt generated by a SuperIO chip (external to the chipset) delivered to a CPU as a NMI.
You could set a bit which enables NMI when IOCHKCK# is asserted. But, I guess the SuperIO wont generate it for you - unless you have own gadget.
Yes. Also, to answer another of your questions, I want to have an interrupt generated by the SuperIO's watchdog timer be delivered as an NMI.
So, the first thing I tried is simply to set the NMI delivery mode for the pin. Unfortunately, that does not work, the system gets reset as soon as the interrupt is generated. So, my first question is: can that be made to work at all?
Did you set vector to 0x0 ? Maybe it did not like the rest of the fields?
I double-checked, and all the bits seem to be correct: only the NMI delivery mode bit has to be set, all the others have to be or can be zeroes: 0x0000000000000400 (using the 64-bit format for the two 32-bit registers of the redirection table entry).
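To make the value above easier to check, here is a small sketch that unpacks a 64-bit redirection table entry. It assumes the standard IO-APIC entry layout (vector in bits 7:0, delivery mode in bits 10:8, mask in bit 16, destination in bits 63:56); the function name `decode_rte` is hypothetical and is not taken from the ioapic.c tools mentioned later in the thread.

```python
def decode_rte(entry):
    """Split a 64-bit IO-APIC redirection table entry into its main fields.

    Assumes the standard IO-APIC layout; status bits (delivery status,
    remote IRR) are left out of this sketch.
    """
    return {
        "vector": entry & 0xFF,                       # bits 7:0
        "delivery_mode": (entry >> 8) & 0x7,          # bits 10:8
        "logical_dest": bool((entry >> 11) & 1),      # bit 11
        "active_low": bool((entry >> 13) & 1),        # bit 13
        "level_triggered": bool((entry >> 15) & 1),   # bit 15
        "masked": bool((entry >> 16) & 1),            # bit 16
        "destination": (entry >> 56) & 0xFF,          # bits 63:56
    }


fields = decode_rte(0x0000000000000400)
# Delivery mode 0b100 is NMI in the APIC encoding; everything else is zero.
print(fields)
```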
So, I decided to not give up and to try to use the legacy interrupt mode to get what I want. I think that that's how Linux NMI watchdog driver used to work. So, I programmed LINT0 and LINT1 for NMI delivery mode (on all cores, all two of them), enabled legacy PIC (I guess that it's built into the chipset) and made sure that the interrupt is unmasked. But absolutely nothing happened when the interrupt is generated. From this I concluded that the PIC is not connected to the CPUs.
The PIC is "connected" to CPU via LINT0 if set to extINT. Or through the entry in IOAPIC.
Well, I was talking about the physical level (or something close to it). Anyway, as it turned out I was wrong and the PIC is actually connected both to the LINT0 pin of the CPU and pin 0 of the I/O APIC.
I think that this is an interesting discovery: PIC's configuration affects how the IO-APIC communicates to the Local APICs.
This has to do about Hypertransportspecification. Please have a look to
Table 103. x86 Interrupt Request Packet Format (HT spec version 1.1) There is this MT3 explained.
Thank you for the pointer! I will have a look.
I went over the BKDG and SB7x0 documentation (RRG, RPG) and only PCI_Reg 62h of device 20 function 0 in SB7x0 caught my eye. Namely, K8_INTR, MT3_Set and MT3_Auto bits. They all are about K8 INTR [NMI] message. On my system the register is set to 0x24, that is K8_INTR is set, but MT3_Set and MT3_Auto are not...
So, this just helps to generate right format to HT bus... maybe the CPU is confused... and you will need to change it. Now it should be more clear what the MT3 means.
I will try to map those to the HT protocol details.
In the coreboot source code I see that the register is set up with exactly the same value in src/southbridge/amd/sb700/early_setup.c. And, for what it's worth, bit 5 (0x20 mask) is documented as "reserved".
Could be that it is something unrelated - Maybe leftover from SB600, where the bit is documented. It has to do something with SMI on USB.
Okay.
I hope that this was not a boring read.
Nope, it is was quite interesting. Why you need to trigger NMI by superIO? Maybe there is other solution. I think at least some silicon revisions of SB700 had some magic bit which was triggering NMI if written.
As I wrote above, I want to create another variant of an "NMI watchdog". Watchdog timers in ITE SuperIOs are able to either reset a system or to generate an interrupt. I want to use the latter option to get the NMI.
You can perhaps generate NMI using MSI/MSI-X or HPET (i tried with this)
Actually, I tried that with SB700 and SB850 HPETs. I configured a timer for FSB (=MSI, I guess) interrupt mode and set the delivery mode in the same fashion as I would for an MSI interrupt. The results were exactly the same as what I am getting when setting the IO-APIC delivery mode to NMI. If you were able to get this to work on similar hardware, then I would appreciate your advice.
Thank you very much!
Hi again
You can perhaps generate NMI using MSI/MSI-X or HPET (i tried with this)
I did the HPET NMI generator on an Intel PCH, and it works fine. I just filled in the MSI address/data so that it delivered an NMI to a certain CPU: physical delivery to a CPU with a certain ID.
If this does not work on AMD, perhaps there is some problem with chipset configuration. Please check the HT specs and try to sort out the MT3 setup.
Thanks,
Rudolf
On 08/10/2016 23:57, Rudolf Marek wrote:
I did the HPET NMI generator on Intel PCH, it works fine. I just fill the MSI addr/data in a way that it was delivering NMI to certain CPU - physical delivery to a CPU with a certain ID.
If this does not work on AMD, perhaps there is some problem with chipset configuration. Please check the HT specs and try to sort out the MT3 setup.
I've just got this working on my AMD hardware too. I had to use the same HT encoding for the Delivery Mode / Message Type bits as with IO-APIC.
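As a rough illustration of what "the same HT encoding" means for an MSI-style (FSB) interrupt, the sketch below builds an address/data pair. It assumes the usual x86 MSI format (address 0xFEE00000 with the destination APIC ID in bits 19:12, data with the vector in bits 7:0 and the delivery mode in bits 10:8); `msi_message` and the two constants are names invented for this example, and the actual HPET register programming is not shown.

```python
MSI_BASE = 0xFEE00000   # x86 interrupt message address window

APIC_NMI = 0b100        # NMI in the APIC delivery mode encoding
HT_NMI = 0b011          # NMI in the HyperTransport message type encoding


def msi_message(dest_apic_id, delivery_mode, vector=0):
    """Return (address, data) for a physical-destination, edge-triggered MSI."""
    if not 0 <= dest_apic_id <= 0xFF:
        raise ValueError("APIC ID must fit in 8 bits")
    if not 0 <= delivery_mode <= 0x7:
        raise ValueError("delivery mode is a 3-bit field")
    address = MSI_BASE | (dest_apic_id << 12)
    data = (delivery_mode << 8) | (vector & 0xFF)
    return address, data


# What the APIC documentation suggests for an NMI to APIC ID 0 ...
print([hex(x) for x in msi_message(0, APIC_NMI)])
# ... versus the encoding reported to work on the AMD systems in this thread.
print([hex(x) for x in msi_message(0, HT_NMI)])
```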
On 08/10/2016 23:04, Andriy Gapon wrote:
On 08/10/2016 20:35, Rudolf Marek wrote:
Andriy Gapon wrote:
[...]
This has to do about Hypertransportspecification. Please have a look to
Table 103. x86 Interrupt Request Packet Format (HT spec version 1.1) There is this MT3 explained.
Thank you for the pointer! I will have a look.
I went over the BKDG and SB7x0 documentation (RRG, RPG) and only PCI_Reg 62h of device 20 function 0 in SB7x0 caught my eye. Namely, K8_INTR, MT3_Set and MT3_Auto bits. They all are about K8 INTR [NMI] message. On my system the register is set to 0x24, that is K8_INTR is set, but MT3_Set and MT3_Auto are not...
So, this just helps to generate right format to HT bus... maybe the CPU is confused... and you will need to change it. Now it should be more clear what the MT3 means.
I will try to map those to the HT protocol details.
The HyperTransport Specification is very enlightening. Thank you, Rudolf, for setting me on that path! I used version 3.0 found here http://www.hypertransport.org/ht-3-0-link-spec. Especially useful is Appendix F.1 Interrupts, particularly tables 106 and 108.
The document and some experiments allowed me to understand one important thing. Both PIC and APIC interrupts are delivered over HyperTransport. There are no special physical wires for the PIC interrupts.
Figuring out the MT3_Set and MT3_Auto bits was very easy. The bits affect only the PIC interrupts, they have no effect on the I/O APIC interrupts. On the SB700+Athlon system those bits are unset and that means that MT[3] of the HyperTransport messages (Message Type, bit 3) is zero. In turn, that means that the PIC interrupts are delivered as normal ExtINT interrupts as if they were coming from the I/O APIC. That explains why LintEn has the effect on the PIC interrupts. The bits are named slightly differently in the SB800 documentation. They are Mts_set and Mts_auto and they belong to PMIO PM_Reg 08h. But they have exactly the same function. On the SB850+Phenom system Mts_set is 1 and Mts_auto is 0. So, MT[3] is one for the PIC interrupt messages. That means that their type is 'Legacy PIC ExtInt (LINT0)', not the regular ExtInt. That's why LintEn does not affect them.
So, the above explains everything I see with the PIC interrupts, but does not explain the problems with the I/O APIC interrupts. And here I made a wild guess... and it seems to be correct.
I set Delivery Mode in the redirection table to the HyperTransport defined NMI message type, 0_011, and it worked! This is the redirection entry I used: 0xff00000000000300 (note that the destination is set to the broadcast address as required by Table 106).
There is an apparent bug in the AMD southbridges. Despite what the APIC specification says and what Table 108 of the HyperTransport 3.0 specification reiterates, the delivery mode bits are copied as-is into the HyperTransport messages. That's not a problem for fixed, lowest priority and SMI interrupts where the bits are exactly the same. But the APIC NMI mode gets interpreted as INIT, which explains the problems I was having with it. Also, it seems that APIC ExtINT mode (111) is an undefined value for the HyperTransport message type, but messages with that type get treated as normal interrupts.
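The mismatch can be summarized with a small lookup sketch. The APIC table uses the standard IO-APIC delivery mode encoding; the HyperTransport table lists only the message types named in this thread (MT[3] = 0), so it is deliberately incomplete. The function names are made up for illustration.

```python
# IO-APIC delivery mode field (redirection entry bits 10:8), APIC encoding.
APIC_DELIVERY_MODE = {
    0b000: "Fixed",
    0b001: "Lowest Priority",
    0b010: "SMI",
    0b100: "NMI",
    0b101: "INIT",
    0b111: "ExtINT",
}

# HyperTransport MT[2:0] with MT[3] = 0, limited to values named in the thread.
HT_MESSAGE_TYPE = {
    0b000: "Fixed",
    0b001: "Lowest Priority",
    0b010: "SMI",
    0b011: "NMI",
    0b100: "INIT",
}


def seen_by_cpu(field):
    """Message type the CPU sees if the 3 bits are copied without translation."""
    return HT_MESSAGE_TYPE.get(field, "no message type named in this thread")


def make_rte(delivery_field, destination=0xFF, vector=0):
    """Build a 64-bit redirection entry (unmasked, edge, physical destination)."""
    return (destination << 56) | (delivery_field << 8) | vector


for field, apic_name in sorted(APIC_DELIVERY_MODE.items()):
    print(f"APIC {field:03b} ({apic_name}) -> HT {seen_by_cpu(field)}")

# The workaround entry: HT NMI encoding, broadcast destination.
print(hex(make_rte(0b011)))
```

Under that reading, an entry programmed with the APIC NMI value ends up looking like INIT to the processor, which would be consistent with the reset seen on the SB700 system.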
Rudolf, thank you very much again! I hope that the findings will help other people too.
On 09/10/2016 13:02, Andriy Gapon wrote:
On 08/10/2016 23:04, Andriy Gapon wrote:
On 08/10/2016 20:35, Rudolf Marek wrote:
Andriy Gapon wrote:
[...]
This has to do about Hypertransportspecification. Please have a look to
Table 103. x86 Interrupt Request Packet Format (HT spec version 1.1) There is this MT3 explained.
Thank you for the pointer! I will have a look.
I went over the BKDG and SB7x0 documentation (RRG, RPG) and only PCI_Reg 62h of device 20 function 0 in SB7x0 caught my eye. Namely, K8_INTR, MT3_Set and MT3_Auto bits. They all are about K8 INTR [NMI] message. On my system the register is set to 0x24, that is K8_INTR is set, but MT3_Set and MT3_Auto are not...
So, this just helps to generate right format to HT bus... maybe the CPU is confused... and you will need to change it. Now it should be more clear what the MT3 means.
I will try to map those to the HT protocol details.
The HyperTransport Specification is very enlightening. Thank you, Rudolf, for setting me on that path! I used version 3.0 found here http://www.hypertransport.org/ht-3-0-link-spec. Especially useful is Appendix F.1 Interrupts, particularly tables 106 and 108.
Apologies, the above references are actually for the HT 1.0 document. In the HT 3.0 document the tables are 150 and 152.
The document and some experiments allowed me to understand one important thing. Both PIC and APIC interrupts are delivered over HyperTransport. There are no special physical wires for the PIC interrupts.
Figuring out the MT3_Set and MT3_Auto bits was very easy. The bits affect only the PIC interrupts, they have no effect on the I/O APIC interrupts. On the SB700+Athlon system those bits are unset and that means that MT[3] of the HyperTransport messages (Message Type, bit 3) is zero. In turn, that means that the PIC interrupts are delivered as normal ExtINT interrupts as if they were coming from the I/O APIC. That explains why LintEn has the effect on the PIC interrupts. The bits are named slightly differently in the SB800 documentation. They are Mts_set and Mts_auto and they belong to PMIO PM_Reg 08h. But they have exactly the same function. On the SB850+Phenom system Mts_set is 1 and Mts_auto is 0. So, MT[3] is one for the PIC interrupt messages. That means that their type is 'Legacy PIC ExtInt (LINT0)', not the regular ExtInt. That's why LintEn does not affect them.
So, the above explains everything I see with the PIC interrupts, but does not explain the problems with the I/O APIC interrupts. And here I made a wild guess... and it seems to be correct.
I set Delivery Mode in the redirection table to the HyperTransport defined NMI message type, 0_011, and it worked! This is the redirection entry I used: 0xff00000000000300 (note that the destination is set to the broadcast address as required by Table 106).
More tests show that individual APIC IDs can also be used in addition to the broadcast ID.
There is an apparent bug in the AMD southbridges. Despite what the APIC specification says and what Table 108 of the HyperTransport 3.0 specification reiterates, the delivery mode bits are copied as-is into the HyperTransport messages. That's not a problem for fixed, lowest priority and SMI interrupts where the bits are exactly the same. But the APIC NMI mode gets interpreted as INIT, which explains the problems I was having with it. Also, it seems that APIC ExtINT mode (111) is an undefined value for the HyperTransport message type, but messages with that type get treated as normal interrupts.
Rudolf, thank you very much again! I hope that the findings will help other people too.
Hi Andriy,
Thanks for the very detailed emails. It was great I could help.
I set Delivery Mode in the redirection table to the HyperTransport defined NMI message type, 0_011, and it worked!
Great, so there is a bug! Do you have a USB image I can try on a more recent AMD system (with a Hudson chipset)? Or we can try to report this to AMD, but I have read somewhere that the new chipsets are made by ASMedia.
Thanks,
Rudolf
On 09/10/2016 23:41, Rudolf Marek wrote:
Great, so there is a bug! Do you have some USB image I can try on more recent AMD system (with Hudson chipset). Or we can try to report this to AMD, but I have read somewhere that the new chipsets are done by Asmedia.
I did most of the testing using two small programs, one to examine the IO-APIC redirection table and the other to write its entries:
https://people.freebsd.org/~avg/ioapic.c
https://people.freebsd.org/~avg/ioapic_wr.c
Another thing required is an interrupt that can be reliably triggered in a controllable fashion (like a mouse interrupt or a timer interrupt). In my case I used an interrupt from the watchdog timer.
So, I would just directly re-program a pin corresponding to the interrupt and then trigger the interrupt and see what happens.
An example:
$ ioapic_wr 3 0xff00000000000300
Hi Andriy,
An example: $ ioapic_wr 3 0xff00000000000300
I tried with minicom and IRQ 4, and I can confirm that the NMI is delivered only when using the redirection entry above. The 0x0000000000000400 entry does nothing in my case. I tested on an AMD Hudson chipset; I had to add #include <stdint.h> to get the proper type definitions under Linux.
The NMI arrived at all CPUs. Linux said:
Oct 10 20:51:08 ruik kernel: [ 891.596625] Uhhuh. NMI received for unknown reason 31 on CPU 2.
Oct 10 20:51:08 ruik kernel: [ 891.596631] Uhhuh. NMI received for unknown reason 31 on CPU 1.
Oct 10 20:51:08 ruik kernel: [ 891.596633] Uhhuh. NMI received for unknown reason 31 on CPU 0.
Oct 10 20:51:08 ruik kernel: [ 891.596636] Uhhuh. NMI received for unknown reason 31 on CPU 3.
Oct 10 20:51:08 ruik kernel: [ 891.596638] Do you have a strange power saving mode enabled?
Oct 10 20:51:08 ruik kernel: [ 891.596639] Do you have a strange power saving mode enabled?
Oct 10 20:51:08 ruik kernel: [ 891.596644] Do you have a strange power saving mode enabled?
Oct 10 20:51:08 ruik kernel: [ 891.596646] Dazed and confused, but trying to continue
Oct 10 20:51:08 ruik kernel: [ 891.596648] Dazed and confused, but trying to continue
Oct 10 20:51:08 ruik kernel: [ 891.596652] Dazed and confused, but trying to continue
Oct 10 20:51:08 ruik kernel: [ 891.596666] Do you have a strange power saving mode enabled?
Oct 10 20:51:08 ruik kernel: [ 891.596670] Dazed and confused, but trying to continue
I can try to contact Mr. AMD and ask them to at least publish new errata versions.
Thanks,
Rudolf
On 10/10/2016 21:59, Rudolf Marek wrote:
Hi Andriy,
An example: $ ioapic_wr 3 0xff00000000000300
I tried with minicom and IRQ 4, and I can confirm that NMI is delivered only when using the redirection entry above. The one 0x0000000000000400 does nothing in my case. I tested on AMD Hudson chipset,
It would be interesting to test 0x00...300 and 0xff...400 just for completeness.
I had to add #include <stdint.h> to have the proper type defines under Linux.
I'll also add this one just in case.
The NMI arrived to all CPUs, Linux said:
Oct 10 20:51:08 ruik kernel: [ 891.596625] Uhhuh. NMI received for unknown reason 31 on CPU 2.
Oct 10 20:51:08 ruik kernel: [ 891.596631] Uhhuh. NMI received for unknown reason 31 on CPU 1.
Oct 10 20:51:08 ruik kernel: [ 891.596633] Uhhuh. NMI received for unknown reason 31 on CPU 0.
Oct 10 20:51:08 ruik kernel: [ 891.596636] Uhhuh. NMI received for unknown reason 31 on CPU 3.
Oct 10 20:51:08 ruik kernel: [ 891.596638] Do you have a strange power saving mode enabled?
Oct 10 20:51:08 ruik kernel: [ 891.596639] Do you have a strange power saving mode enabled?
Oct 10 20:51:08 ruik kernel: [ 891.596644] Do you have a strange power saving mode enabled?
Oct 10 20:51:08 ruik kernel: [ 891.596646] Dazed and confused, but trying to continue
Oct 10 20:51:08 ruik kernel: [ 891.596648] Dazed and confused, but trying to continue
Oct 10 20:51:08 ruik kernel: [ 891.596652] Dazed and confused, but trying to continue
Oct 10 20:51:08 ruik kernel: [ 891.596666] Do you have a strange power saving mode enabled?
Oct 10 20:51:08 ruik kernel: [ 891.596670] Dazed and confused, but trying to continue
I can try to contact Mr. AMD ask them to at least publish new errata versions.
That would be great. I am really curious about the official clarification on the issue. Maybe there is a configuration bit that they forgot to set or something like that.
I found a really old document about the AMD-8131 chipset, which seems like something that later morphed into the APIC component of the southbridges, and in that document they discuss the APIC -> HT mapping quite extensively. I wonder what went wrong later on. In this copy of the document it's on page 67: http://www.tautec-electronics.de/Datenblaetter/Schaltkreise/AMD8131.pdf
Hi again,
It would be interesting to test 0x00...300 and 0xff...400 just for completeness.
The 0x00...300 does the same (NMI delivered on all CPUs) and the other one does nothing.
That would be great. I am really curious about the official clarification on the issue. Maybe there is a configuration bit that they forgot to set or something like that.
I will try to reach someone responsible for this.
I found a really old document about the AMD-8131 chipset, which seems like something that later morphed into the APIC component of the southbridges, and in that document they discuss the APIC -> HT mapping quite extensively. I wonder what went wrong later on. In this copy of the document it's on page 67: http://www.tautec-electronics.de/Datenblaetter/Schaltkreise/AMD8131.pdf
I think the SB600 series was an ATI design, so perhaps it is just a bug.
Thanks,
Rudolf
There was a bit of incorrect information in my previous email.
I performed tests on a similar system but with SB850 southbridge: https://www.asus.com/ae-en/Motherboards/M4A89GTD_PRO/specifications/ This system has a Phenom II X4 955 processor installed. BTW, the system that I tested earlier has an Athlon II X2 250 processor. And then I retested the original system.
So, I was wrong that there was no PIC -> LINT0 connection. On both systems that connection exists and works perfectly well. To be clear, my tests show that the PIC is connected to both LINT0 and I/O APIC pin 0. As I've just written, programming the LINT0 LVT works as expected.
But routing interrupts via I/O APIC pin 0 has exactly the same problems as the other I/O interrupts.
When the delivery mode is set to ExtINT, the interrupts seem to be delivered just as if they were normal (fixed) vectored interrupts. So, e.g., setting the vector bits to zero produces RcvdIllegalVector on the target core, but setting a valid vector results in a call to its handler.
When the delivery mode is set to NMI, on the SB700+Athlon system I get an immediate reset. On the SB850+Phenom system it seems that only the target core hangs. Other cores keep working and they report SendAcceptError when trying to IPI the affected core.
There is one quirk that caused my confusion during the original tests. On the SB700+Athlon system clearing LintEn bit results in LINT0 ignoring PIC interrupts. On the SB850+Phenom that bit does not affect PIC -> LINT0 route. The latter seems correct, the former looks like a bug.
So, as I wrote in the previous message, I can get things to work. But I am still curious whether it's possible to get NMI delivery mode to work for I/O APIC interrupts. It's clear that the southbridges do send something. But either those messages are wrong, or the processors are not properly accepting them. I wonder if anyone has got those working.