I don't think our IO-APIC is enabled. We tell it to set its ID to 2, but it continues to think its ID is F. Here is the kernel panic:
ENABLING IO-APIC IRQs
...changing IO-APIC physical APIC ID to 2 ... Kernel panic: could not set ID!

In idle task - not syncing
APIC error interrupt on CPU#0, should never happen.
... APIC ESR0: 00000000
... APIC ESR1: 00000040
... bit 6: Received Illegal Vector
Does anyone know how to set up the IO-APIC correctly and/or know where I can find this information (Maciej, you wrote io_apic.c, right?)? Any help would be greatly appreciated. Thanks.
- James
Here is the full boot so far:
Linux version 2.4.0-test1 (jimi@snaresland) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #7 SMP Mon Jun 26 16:46:02 MDT 2000. BIOS-provided physical RAM map:. e820: 000000000009f000 @ 0000000000000000 (usable). e820: 0000000003e00000 @ 0000000000100000 (usable). Scan SMP from c0000000 for 1024 bytes.. found SMP MP-table at 00000000. hm, page 00000000 reserved twice.. On node 0 totalpages: 16128. zone(0): 4096 pages.. zone(1): 12032 pages.. zone(2): 0 pages.. Intel MultiProcessor Specification v1.4. Virtual Wire compatibility mode.. Default MP configuration #5. Processor #0 Pentium(tm) Pro APIC version 16. Floating point unit present.. Machine Exception supported.. 64 bit compare & exchange supported.. Internal APIC present.. Processor #1 Pentium(tm) Pro APIC version 16. Floating point unit present.. Machine Exception supported.. 64 bit compare & exchange supported.. Internal APIC present.. Bus #0 is ISA . Bus #1 is PCI . I/O APIC #2 Version 16 at 0xFEC00000.. Int: type 0, pol 0, trig 0, bus 0, IRQ 00, APIC ID 2, APIC INT 02. Int: type 0, pol 0, trig 0, bus 0, IRQ 01, APIC ID 2, APIC INT 01. Int: type 0, pol 0, trig 0, bus 0, IRQ 03, APIC ID 2, APIC INT 03. Int: type 0, pol 0, trig 0, bus 0, IRQ 04, APIC ID 2, APIC INT 04. Int: type 0, pol 0, trig 0, bus 0, IRQ 05, APIC ID 2, APIC INT 05. Int: type 0, pol 0, trig 0, bus 0, IRQ 06, APIC ID 2, APIC INT 06. Int: type 0, pol 0, trig 0, bus 0, IRQ 07, APIC ID 2, APIC INT 07. Int: type 0, pol 0, trig 0, bus 0, IRQ 08, APIC ID 2, APIC INT 08. Int: type 0, pol 0, trig 0, bus 0, IRQ 09, APIC ID 2, APIC INT 09. Int: type 0, pol 0, trig 0, bus 0, IRQ 0a, APIC ID 2, APIC INT 0a. Int: type 0, pol 0, trig 0, bus 0, IRQ 0b, APIC ID 2, APIC INT 0b. Int: type 0, pol 0, trig 0, bus 0, IRQ 0c, APIC ID 2, APIC INT 0c. Int: type 0, pol 0, trig 0, bus 0, IRQ 0d, APIC ID 2, APIC INT 0d. Int: type 0, pol 0, trig 0, bus 0, IRQ 0e, APIC ID 2, APIC INT 0e. Int: type 0, pol 0, trig 0, bus 0, IRQ 0f, APIC ID 2, APIC INT 0f. 
Int: type 3, pol 0, trig 0, bus 0, IRQ 00, APIC ID 2, APIC INT 00. Lint: type 3, pol 0, trig 0, bus 0, IRQ 00, APIC ID ff, APIC LINT 00. Lint: type 1, pol 0, trig 0, bus 0, IRQ 00, APIC ID ff, APIC LINT 01. Processors: 2. mapped APIC to ffffe000 (fee00000). James: boot_cpu_id being set to 2016 from APIC_ID. mapped IOAPIC to ffffd000 (fec00000). Enabling extended fast FPU save and restore ... done.. Enabling KNI unmasked exception support ... done.. Kernel command line: root=/dev/hda1 ide0=reset. Initializing CPU#0. Detected 497445265 Hz processor.. Calibrating delay loop... 992.87 BogoMIPS. Memory: 61980k/64512k available (585k kernel code, 2144k reserved, 69k data, 156k init, 0k highmem). Dentry-cache hash table entries: 8192 (order: 4, 65536 bytes). Buffer-cache hash table entries: 1024 (order: 0, 4096 bytes). Page-cache hash table entries: 16384 (order: 4, 65536 bytes). kmem_create: Poisoning requested, but con given - bdev_cache. Inode-cache hash table entries: 4096 (order: 3, 32768 bytes). kmem_create: Poisoning requested, but con given - inode_cache. CPU serial number disabled.. Checking 386/387 coupling... OK, FPU using exception 16 error reporting.. Checking 'hlt' instruction... OK.. POSIX conformance testing by UNIFIX. mtrr: v1.36 (20000221) Richard Gooch (rgooch@atnf.csiro.au). CPU serial number disabled.. CPU0: Intel Pentium III (Katmai) stepping 03. per-CPU timeslice cutoff: 0.00 usecs.. Getting VERSION: 40011. Getting VERSION: 40011. Getting LVT0: 700. Getting LVT1: 400. enabled ExtINT on CPU#0. ESR value before enabling vector: 00000000. ESR value after enabling vector: 00000000. CPU present map: 3. Booting processor 1/0 eip 2000. Setting warm reset code and vector.. 1.. 2.. 3.. Asserting INIT.. Waiting for send to finish.... +Deasserting INIT.. Waiting for send to finish.... +#startup loops: 2.. Sending STARTUP #1.. After apic_write.. Startup point 1.. Waiting for send to finish.... +Initializing CPU#1. Sending STARTUP #2.. After apic_write.. 
Startup point 1.. CPU#1 (phys ID: 0) waiting for CALLOUT. Waiting for send to finish.... +After Startup.. Before Callout 1.. After Callout 1.. CALLIN, before setup_local_APIC().. masked ExtINT on CPU#1. ESR value before enabling vector: 00000000. ESR value after enabling vector: 00000000. Calibrating delay loop... 992.87 BogoMIPS. Stack at about c116dfbc. CPU serial number disabled.. OK.. CPU1: Intel Pentium III (Katmai) stepping 03. CPU has booted.. Before bogomips.. Total of 2 processors activated (1985.74 BogoMIPS).. Before bogocount - setting activated=1.. Boot done.. ENABLING IO-APIC IRQs. ...changing IO-APIC physical APIC ID to 2 ... Kernel panic: could not set ID!. . In idle task - not syncing. APIC error interrupt on CPU#0, should never happen.. ... APIC ESR0: 00000000. ... APIC ESR1: 00000040. ... bit 6: Received Illegal Vector
- To unsubscribe: send mail to majordomo@freiburg.linux.de with 'unsubscribe openbios' in the body of the message
On Tue, 27 Jun 2000, James Hendricks wrote:
> I don't think our IO-APIC is enabled. We tell it to set its ID to 2, but it continues to think its ID is F. Here is the kernel panic:
It looks like chip select for the APIC is not enabled or is configured for another address.
> Does anyone know how to set up the IO-APIC correctly and/or know where I can find this information (Maciej, you wrote io_apic.c, right?)?
Hmm, I guess that's Ingo Molnar who deserves credit for writing io_apic.c.
> Any help would be greatly appreciated. Thanks.
Make sure the I/O APIC is mapped at the address reported in the MP table. You should get the necessary bits from the chipset's datasheets. There is a print_IO_APIC() function you may use for diagnostics, too. You can use it to check that you are looking at a real APIC.
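As a rough illustration of why an ID that reads back as F points at a decode problem: a minimal sketch, assuming the usual I/O APIC layout in which register 0 carries the ID in bits 27:24. The helper names below are made up for this example and are not kernel functions. If nothing responds at the mapped address, a read typically floats to all ones, and the ID field of an all-ones word is exactly 0xF.

```c
#include <stdint.h>

/* Hypothetical helper: extract the 4-bit ID field (bits 27:24)
 * from a value read out of I/O APIC register 0. */
static unsigned ioapic_id_from_reg0(uint32_t reg0)
{
    return (reg0 >> 24) & 0xFu;
}

/* Hypothetical heuristic: an all-ones read often means that no device
 * decoded the access (chip select disabled or wrong address). */
static int ioapic_read_looks_floating(uint32_t reg)
{
    return reg == 0xFFFFFFFFu;
}
```

Fed with 0xFFFFFFFF, ioapic_id_from_reg0() returns 0xF, which matches the symptom reported above; a register that really holds ID 2 would read 0x02000000 in that field.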
> It looks like chip select for the APIC is not enabled or is configured for another address.
Exactly. Now we have another interesting problem. Once we set the APIC chip select enable, the second processor starts throwing a GPF at set_mtrr_state in arch/i386/kernel/mtrr.c. What can you do that will throw a GPF? I noticed the comment at the top of the function:
/*  [SUMMARY] Set the MTRR state for this CPU.
    <state> The MTRR state information to read.
    <ctxt> Some relevant CPU context.
    [NOTE] The CPU must already be in a safe state for MTRR changes.
    [RETURNS] 0 if no changes made, else a mask indication what was changed.
*/
How do we get into a "safe state" for mtrr changes?
We read the MTR just fine, but when we write back to it the whole thing just plain dies.
Thanks. - James
Here is the trace:
Linux version 2.4.0-test1 (jimi@snaresland) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #17 SMP Tue Jun 27 19:49:22 MDT 2000. BIOS-provided physical RAM map:. e820: 000000000009f000 @ 0000000000000000 (usable). e820: 0000000003e00000 @ 0000000000100000 (usable). Scan SMP from c0000000 for 1024 bytes.. found SMP MP-table at 00000000. hm, page 00000000 reserved twice.. On node 0 totalpages: 16128. zone(0): 4096 pages.. zone(1): 12032 pages.. zone(2): 0 pages.. Intel MultiProcessor Specification v1.4. Virtual Wire compatibility mode.. Default MP configuration #5. Processor #0 Pentium(tm) Pro APIC version 16. Floating point unit present.. Machine Exception supported.. 64 bit compare & exchange supported.. Internal APIC present.. Processor #1 Pentium(tm) Pro APIC version 16. Floating point unit present.. Machine Exception supported.. 64 bit compare & exchange supported.. Internal APIC present.. Bus #0 is ISA . Bus #1 is PCI . I/O APIC #2 Version 16 at 0xFEC00000.. Int: type 0, pol 0, trig 0, bus 0, IRQ 00, APIC ID 2, APIC INT 02. Int: type 0, pol 0, trig 0, bus 0, IRQ 01, APIC ID 2, APIC INT 01. Int: type 0, pol 0, trig 0, bus 0, IRQ 03, APIC ID 2, APIC INT 03. Int: type 0, pol 0, trig 0, bus 0, IRQ 04, APIC ID 2, APIC INT 04. Int: type 0, pol 0, trig 0, bus 0, IRQ 05, APIC ID 2, APIC INT 05. Int: type 0, pol 0, trig 0, bus 0, IRQ 06, APIC ID 2, APIC INT 06. Int: type 0, pol 0, trig 0, bus 0, IRQ 07, APIC ID 2, APIC INT 07. Int: type 0, pol 0, trig 0, bus 0, IRQ 08, APIC ID 2, APIC INT 08. Int: type 0, pol 0, trig 0, bus 0, IRQ 09, APIC ID 2, APIC INT 09. Int: type 0, pol 0, trig 0, bus 0, IRQ 0a, APIC ID 2, APIC INT 0a. Int: type 0, pol 0, trig 0, bus 0, IRQ 0b, APIC ID 2, APIC INT 0b. Int: type 0, pol 0, trig 0, bus 0, IRQ 0c, APIC ID 2, APIC INT 0c. Int: type 0, pol 0, trig 0, bus 0, IRQ 0d, APIC ID 2, APIC INT 0d. Int: type 0, pol 0, trig 0, bus 0, IRQ 0e, APIC ID 2, APIC INT 0e. Int: type 0, pol 0, trig 0, bus 0, IRQ 0f, APIC ID 2, APIC INT 0f. 
Int: type 3, pol 0, trig 0, bus 0, IRQ 00, APIC ID 2, APIC INT 00. Lint: type 3, pol 0, trig 0, bus 0, IRQ 00, APIC ID ff, APIC LINT 00. Lint: type 1, pol 0, trig 0, bus 0, IRQ 00, APIC ID ff, APIC LINT 01. Processors: 2. mapped APIC to ffffe000 (fee00000). James: boot_cpu_id being set to 2016 from APIC_ID. mapped IOAPIC to ffffd000 (fec00000). Enabling extended fast FPU save and restore ... done.. Enabling KNI unmasked exception support ... done.. Kernel command line: root=/dev/hda1 ide0=reset. Initializing CPU#0. Detected 497438524 Hz processor.. unexpected IRQ trap at vector 31. Calibrating delay loop... unexpected IRQ trap at vector 31. unexpected IRQ trap at vector 31. 992.87 BogoMIPS. Memory: 61980k/64512k available (586k kernel code, 2144k reserved, 69k data, 156k init, 0k highmem). Dentry-cache hash table entries: 8192 (order: 4, 65536 bytes). Buffer-cache hash table entries: 1024 (order: 0, 4096 bytes). Page-cache hash table entries: 16384 (order: 4, 65536 bytes). kmem_create: Poisoning requested, but con given - bdev_cache. Inode-cache hash table entries: 4096 (order: 3, 32768 bytes). kmem_create: Poisoning requested, but con given - inode_cache. CPU serial number disabled.. James: x86_vendor = 0. James: edx = 64. Checking 386/387 coupling... OK, FPU using exception 16 error reporting.. Checking 'hlt' instruction... unexpected IRQ trap at vector 31. OK.. POSIX conformance testing by UNIFIX. mtrr: v1.36 (20000221) Richard Gooch (rgooch@atnf.csiro.au). CPU serial number disabled.. James: x86_vendor = 0. James: edx = 64. CPU0: Intel Pentium III (Katmai) stepping 03. James: cpu_hz 497438524, cachesize 32, bandwidth 350, cacheflush_time 44377. per-CPU timeslice cutoff: 89.28 usecs.. Getting VERSION: 40011. Getting VERSION: 40011. Getting LVT0: 700. Getting LVT1: 400. enabled ExtINT on CPU#0. ESR value before enabling vector: 00000000. ESR value after enabling vector: 00000000. CPU present map: 3. James: apicid 0, boot_cpu_id 1. Booting processor 1/0 eip 2000. 
Setting warm reset code and vector.. 1.. 2.. 3.. Asserting INIT.. Waiting for send to finish.... +Deasserting INIT.. Waiting for send to finish.... +#startup loops: 2.. Sending STARTUP #1.. After apic_write.. Startup point 1.. Waiting for send to finish.... +Initializing CPU#1. Sending STARTUP #2.. After apic_write.. Startup point 1.. CPU#1 (phys ID: 0) waiting for CALLOUT. Waiting for send to finish.... +After Startup.. Before Callout 1.. After Callout 1.. CALLIN, before setup_local_APIC().. masked ExtINT on CPU#1. ESR value before enabling vector: 00000000. ESR value after enabling vector: 00000000. general protection fault: 0000. CPU: 1. EIP: 0010:[<c01a6dd1>]. EFLAGS: 00010087. eax: 06050207 ebx: 07070707 ecx: 00000250 edx: 07070703. esi: 00000002 edi: c01b1420 ebp: c01b1428 esp: c116df48. ds: 0018 es: 0018 ss: 0018. Process swapper (pid: 0, stackpage=c116d000). Stack: 00000008 00000002 c01b1420 c116df98 00000000 00000000 00000021 00000000 . c01a6f4f c01b1428 c116df98 00000000 00000000 00000000 c01a73c2 c01b1420 . c116df98 c116df98 00000000 00000000 00000286 00000000 00000000 00000690 . Call Trace: [<c0176460>] [<c016acb7>] [<c016acb7>] . Code: 0f 30 c7 44 24 1c 01 00 unexpected IRQ trap at vector 31. 00 00 c7 44 24 14 00 00 00 00 8d 7d . unexpected IRQ trap at vector 31. Kernel panic: Attempted to kill the idle task!. In idle task - not syncing. APIC error interrupt on CPU#1, should never happen.. ... APIC ESR0: 00000000. ... APIC ESR1: 00000040. ... bit 6: Received Illegal Vector.. unexpected IRQ trap at vector 31. unexpected IRQ trap at vector 31. unexpected IRQ trap at vector 31. unexpected IRQ trap at vector 31. unexpected IRQ trap at vector 31. unexpected IRQ trap at vector 31. unexpected IRQ trap at vector 31.
Actually, James made a typo: it's the MSR, not the MTRR. It's the part of the kernel that sets the range MSRs.
What we know: the rdmsr works. The MSR # is valid. The contents of the registers for the MSR look good; these contents have already been used by CPU #0 to set its own range MSRs. The processor is running with CS=10, and the DPL for the segment is 0.
Finally, if we skip the first WRMSR, the second will fail. If we skip the second, the third will fail. And so on. I *think* it's the wrmsr itself, rather than the values we're passing in; it's almost as though we're running with the wrong processor privileges.
Other data: CR0 and CR4 differ. On the failing CPU, CR4 is 0x618, which means DE is set. On the working CPU, CR4 is 0x690. CD (cache disable) is cleared on the working CPU and set on the CPU that fails. BUT, if we change these CRs to match the working CPU, the WRMSR still fails.
In other words, there is no reason we can see for the WRMSR to fail. You can get a GPF(0) for this, and we get it, but none of the conditions documented by the manual explain the GPF(0).
Any thoughts?
ron
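For reference, the two CR4 values quoted above can be decoded bit by bit. This is a minimal sketch, assuming the standard low CR4 bit assignments (VME at bit 0 through OSXMMEXCPT at bit 10); cr4_decode() is a helper invented for this example, not something from the kernel.

```c
#include <stdint.h>
#include <stdio.h>

/* Names of CR4 bits 0..10, indexed by bit position. */
static const char *const cr4_names[] = {
    "VME", "PVI", "TSD", "DE", "PSE", "PAE",
    "MCE", "PGE", "PCE", "OSFXSR", "OSXMMEXCPT"
};

/* Hypothetical helper: write the names of the set CR4 bits into buf,
 * separated by single spaces, and return buf. Output is truncated
 * (but still NUL-terminated) if buf is too small. */
static char *cr4_decode(uint32_t cr4, char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';
    for (unsigned bit = 0; bit < sizeof cr4_names / sizeof cr4_names[0]; bit++) {
        if (!(cr4 & (1u << bit)))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", cr4_names[bit]);
        if (n < 0 || (size_t)n >= len - used)
            break;
        used += (size_t)n;
    }
    return buf;
}
```

With a 64-byte buffer, 0x618 decodes to "DE PSE OSFXSR OSXMMEXCPT" and 0x690 to "PSE PGE OSFXSR OSXMMEXCPT". The XOR of the two, 0x088, decodes to "DE PGE", so besides DE being set on the failing CPU, PGE is set only on the working one.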
On Thu, 29 Jun 2000, James V Hendricks wrote:
> general protection fault: 0000
> CPU: 1
> EIP: 0010:[<c01a6dd1>]
> EFLAGS: 00010087
> eax: 06050207 ebx: 07070707 ecx: 00000250 edx: 07070703
^^^^^^^^ ^^^^^^^^
> esi: 00000002 edi: c01b1420 ebp: c01b1428 esp: c116df48
> ds: 0018 es: 0018 ss: 0018
> Process swapper (pid: 0, stackpage=c116d000)
> Stack: 00000008 00000002 c01b1420 c116df98 00000000 00000000 00000021 00000000
> c01a6f4f c01b1428 c116df98 00000000 00000000 00000000 c01a73c2 c01b1420
> c116df98 c116df98 00000000 00000000 00000286 00000000 00000000 00000690
> Call Trace: [<c0176460>] [<c016acb7>] [<c016acb7>]
> Code: 0f 30 c7 44 24 1c 01 00 unexpected IRQ trap at vector 31
> 00 00 c7 44 24 14 00 00 00 00 8d 7d
You are writing bogus data to an MTRR, hence a GPF.
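To make the "bogus data" point concrete, here is a minimal sketch that checks the edx:eax pair from the dump. It assumes ecx = 0x250 selects a fixed-range MTRR whose 64-bit value holds eight one-byte memory types, and that only the encodings 0 (UC), 1 (WC), 4 (WT), 5 (WP) and 6 (WB) are accepted, with everything else reserved. The function names are invented for this example.

```c
#include <stdint.h>

/* Assumed set of valid MTRR memory type encodings:
 * UC=0, WC=1, WT=4, WP=5, WB=6. All other values are treated as reserved. */
static int mtrr_type_valid(uint8_t type)
{
    return type == 0 || type == 1 || type == 4 || type == 5 || type == 6;
}

/* Hypothetical checker: return a bitmask in which bit i is set when
 * byte i of the 64-bit value edx:eax holds a reserved memory type. */
static unsigned mtrr_fixed_bad_bytes(uint32_t eax, uint32_t edx)
{
    uint64_t value = ((uint64_t)edx << 32) | eax;
    unsigned bad = 0;

    for (unsigned i = 0; i < 8; i++) {
        uint8_t type = (uint8_t)(value >> (8 * i));
        if (!mtrr_type_valid(type))
            bad |= 1u << i;
    }
    return bad;
}
```

For eax = 0x06050207 and edx = 0x07070703 the result is 0xF3: six of the eight bytes (the 0x07, 0x02 and 0x03 entries) fall outside the assumed valid set, and only the 0x05 and 0x06 bytes pass. Under that assumption, a WRMSR with these values would be expected to fault regardless of privilege level.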