My kernel crash issue is attached below. This one isn't an Exception 6 (that I can tell), but it's one I've seen before. This happened twice in a row, after the machine had been off for a while. As usual, flipping to the factory BIOS, seeing the "corrupt CMOS" message, and re-writing the CMOS fixed the issue. I immediately flipped back to LB, and it worked as expected.
I looked at the CMOS code in src/pc80/mc146818rtc.c. The code that prints "Invalid CMOS LB checksum" comes immediately after two calls to CMOS_WRITE, and is followed by a call that sets the checksum. Apart from the comments in the source, the two snippets below are contiguous; I've split them so I can comment on each.
The following works ok:
===
checksum_invalid = !rtc_checksum_valid(PC_CKS_RANGE_START,PC_CKS_RANGE_END,PC_CKS_LOC);

if (invalid || cmos_invalid || checksum_invalid) {
	printk_warning("RTC:%s%s%s zeroing cmos\n",
		invalid?" Clear requested":"",
		cmos_invalid?" Power Problem":"",
		checksum_invalid?" Checksum invalid":"");
===
The following fails. At the end of this block there is a call that sets the checksum. Here's my guess: the CMOS_WRITE calls below do not work correctly, causing the checksum error. A valid checksum is then computed and written, which prevents the error from showing up in the check above on the next reboot. OR, 'Invalid CMOS LB checksum' is perfectly normal because the checksum should be invalid after a write. In the latter case, the checksum test below is pointless, since the checksum is about to be rewritten anyway.
===
/* Setup the real time clock */
CMOS_WRITE(RTC_CONTROL_DEFAULT, RTC_CONTROL);
/* Setup the frequency it operates at */
CMOS_WRITE(RTC_FREQ_SELECT_DEFAULT, RTC_FREQ_SELECT);

/* See if there is a LB CMOS checksum error */
checksum_invalid = !rtc_checksum_valid(LB_CKS_RANGE_START,LB_CKS_RANGE_END,LB_CKS_LOC);
if(checksum_invalid)
	printk_debug("Invalid CMOS LB checksum\n");

/* Make certain we have a valid checksum */
rtc_set_checksum(PC_CKS_RANGE_START,PC_CKS_RANGE_END,PC_CKS_LOC);
===
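One detail worth noting in the second snippet: the test that prints the message uses the LB_CKS_* range, while the call after it rewrites the PC_CKS_* checksum. The sketch below models that on a simulated CMOS array. It is only a rough illustration under assumptions: the checksum is taken to be an inverted 16-bit byte sum stored high byte first (the real rtc_checksum_valid and rtc_set_checksum may differ), and every DEMO_* offset, the demo_* function names, and the byte values written are made up for the example rather than taken from the board's option table.

```c
#include <stdint.h>
#include <string.h>

/* Simulated 128-byte CMOS. Real code would go through the RTC I/O ports. */
static uint8_t cmos[128];

/* Hypothetical layout, for illustration only (not from the source tree). */
enum {
	DEMO_RTC_REG_A = 0x0a, DEMO_RTC_REG_B = 0x0b,
	DEMO_PC_START = 16, DEMO_PC_END = 45,  DEMO_PC_LOC = 46,
	DEMO_LB_START = 49, DEMO_LB_END = 125, DEMO_LB_LOC = 126,
};

/* Assumed algorithm: inverted sum of the bytes in [start, end], 16 bits. */
static unsigned demo_range_sum(int start, int end)
{
	unsigned sum = 0;
	for (int i = start; i <= end; i++)
		sum += cmos[i];
	return (~sum) & 0xffff;
}

/* Compare the computed sum with the two bytes stored at loc, loc + 1. */
static int demo_checksum_valid(int start, int end, int loc)
{
	unsigned stored = ((unsigned)cmos[loc] << 8) | cmos[loc + 1];
	return demo_range_sum(start, end) == stored;
}

/* Store the computed sum at loc (high byte) and loc + 1 (low byte). */
static void demo_set_checksum(int start, int end, int loc)
{
	unsigned sum = demo_range_sum(start, end);
	cmos[loc] = (sum >> 8) & 0xff;
	cmos[loc + 1] = sum & 0xff;
}

struct demo_result { int lb_before, lb_after, pc_after; };

/* Replays the order of operations from the second snippet. */
static struct demo_result demo_run(void)
{
	struct demo_result r;

	memset(cmos, 0, sizeof(cmos));
	cmos[DEMO_LB_START] = 0x5a;   /* stale option byte, no matching checksum */

	/* Stand-ins for the two CMOS_WRITE calls; values are arbitrary. */
	cmos[DEMO_RTC_REG_B] = 0x02;
	cmos[DEMO_RTC_REG_A] = 0x26;

	r.lb_before = demo_checksum_valid(DEMO_LB_START, DEMO_LB_END, DEMO_LB_LOC);
	demo_set_checksum(DEMO_PC_START, DEMO_PC_END, DEMO_PC_LOC);
	r.lb_after = demo_checksum_valid(DEMO_LB_START, DEMO_LB_END, DEMO_LB_LOC);
	r.pc_after = demo_checksum_valid(DEMO_PC_START, DEMO_PC_END, DEMO_PC_LOC);
	return r;
}
```

In this model, calling demo_run() leaves the PC checksum valid but the LB checksum invalid both before and after the set call, and the two register writes touch neither range. If the real code behaves the same way, the "Invalid CMOS LB checksum" message would keep appearing until something rewrites the LB checksum itself.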
-- Eric
Ronald G Minnich wrote:
Eric Poulsen wrote:
- Does LB store anything in the CMOS?
It can. It does store a few bits about the type of boot that occurred (fallback or normal).
- If yes, is there anything in there that could become corrupted and
cause "weird issues" as described above?
What's exception 6? I don't recall.
sometimes clocking info is stored in CMOS. There could be a collision here.
If you can tell more it would be good to know.
ron
LinuxBIOS-1.1.8.0Fallback Tue Apr 25 20:17:47 PDT 2006 starting...
Enabling mainboard devices
Enabling shadow ram
vt8623 init starting
Detecting Memory
Number of Banks 04
Number of Rows 0d
Priamry DRAM width08
No Columns 0a
MA type e0
Bank 0 (*16 Mb) 10
No Physical Banks 01
Total Memory (*16 Mb) 10
CAS Supported 2 2.5 3
Cycle time at CL X (nS)50
Cycle time at CL X-0.5 (nS)60
Cycle time at CL X-1 (nS)75
Starting at CAS 3
We can do CAS 2.5
We can do CAS 2
tRP 48
tRCD 48
tRAS 28
Low Bond 00
High Bondc7
Setting DQS delay84vt8623 done
00:06 11 23 31 06 00 30 22 00 00 00 06 00 00 00 00
10:08 00 00 d0 00 00 00 00 00 00 00 00 00 00 00 00
20:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30:00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
40:00 18 88 80 82 44 00 00 18 99 88 80 82 44 00 00
50:c8 de cf 88 e0 07 00 00 e0 00 10 10 10 10 00 00
60:02 ff 00 30 d6 32 01 20 42 2d 43 58 00 44 00 00
70:82 48 00 01 01 08 50 00 01 00 00 00 00 00 00 02
80:0f 65 00 00 80 00 00 00 02 00 00 00 00 00 00 00
90:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0:02 c0 20 00 07 02 00 1f 04 00 00 00 2f 02 04 00
b0:00 00 00 00 40 00 00 00 a8 00 00 00 00 00 00 00
c0:01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
d0:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0:00 dd 00 00 00 00 01 00 40 00 00 00 00 00 00 00
f0:00 00 00 00 00 00 12 13 00 00 00 00 00 00 00 00
AGP
Doing MTRR init.
Copying LinuxBIOS to ram.
Jumping to LinuxBIOS.
LinuxBIOS-1.1.8.0Fallback Tue Apr 25 20:17:47 PDT 2006 booting...
clocks_per_usec: 838
Enumerating buses...
Finding PCI configuration type.
PCI: Using configuration type 1
PCI_DOMAIN: 0000 enabled
APIC_CLUSTER: 0 enabled
PCI: pci_scan_bus for bus 0
PCI: 00:00.0 [1106/3123] enabled
PCI: 00:01.0 [1106/b091] enabled
Disabling static device: PCI: 00:0a.0
Disabling static device: PCI: 00:0a.1
In vt8235_enable 1106 3038.
PCI: 00:10.0 [1106/3038] enabled
In vt8235_enable 1106 3038.
PCI: 00:10.1 [1106/3038] enabled
In vt8235_enable 1106 3038.
PCI: 00:10.2 [1106/3038] enabled In vt8235_enable 1106 3104. PCI: 00:10.3 [1106/3104] enabled In vt8235_enable 1106 3177. Initialising Devices PCI: 00:11.0 [1106/3177] enabled In vt8235_enable 1106 0571. PCI: 00:11.1 [1106/0571] enabled In vt8235_enable 1106 3059. PCI: 00:11.5 [1106/3059] enabled In vt8235_enable ffff ffff. In vt8235_enable 1106 3065. PCI: 00:12.0 [1106/3065] enabled PCI: pci_scan_bus for bus 1 PCI: 01:00.0 [1106/3122] enabled PCI: pci_scan_bus returning with max=01 vt1211 enabling PNP devices. PNP: 002e.0 enabled vt1211 enabling PNP devices. PNP: 002e.1 enabled vt1211 enabling PNP devices. PNP: 002e.2 enabled vt1211 enabling PNP devices. PNP: 002e.3 enabled vt1211 enabling PNP devices. PNP: 002e.b enabled PCI: pci_scan_bus returning with max=01 done Allocating resources... Reading resources... Done reading resources. Setting resources... I would set ram size to 0x40000 Kbytes PCI: 00:10.0 20 <- [0x0000001800 - 0x000000181f] io PCI: 00:10.1 20 <- [0x0000001820 - 0x000000183f] io PCI: 00:10.2 20 <- [0x0000001840 - 0x000000185f] io PCI: 00:10.3 10 <- [0x00febff000 - 0x00febff0ff] mem PNP: 002e.0 60 <- [0x00000003f0 - 0x00000003f7] io PNP: 002e.0 70 <- [0x0000000006 - 0x0000000006] irq PNP: 002e.0 74 <- [0x0000000002 - 0x0000000002] drq PNP: 002e.1 60 <- [0x0000000378 - 0x000000037f] io PNP: 002e.1 70 <- [0x0000000007 - 0x0000000007] irq PNP: 002e.1 74 <- [0x0000000003 - 0x0000000003] drq PNP: 002e.2 60 <- [0x00000003f8 - 0x00000003ff] io PNP: 002e.2 70 <- [0x0000000004 - 0x0000000004] irq PNP: 002e.3 60 <- [0x00000002f8 - 0x00000002ff] io PNP: 002e.3 70 <- [0x0000000003 - 0x0000000003] irq PNP: 002e.b 60 <- [0x000000ec00 - 0x000000ecff] io PCI: 00:11.1 20 <- [0x0000001860 - 0x000000186f] io PCI: 00:11.5 10 <- [0x0000001000 - 0x00000010ff] io PCI: 00:12.0 10 <- [0x0000001400 - 0x00000014ff] io PCI: 00:12.0 14 <- [0x00fec00000 - 0x00fec000ff] mem Done setting resources. Done allocating resources. Enabling resources... 
PCI: 00:00.0 cmd <- 146
PCI: 00:01.0 bridge ctrl <- 000f
PCI: 00:01.0 cmd <- 147
PCI: 01:00.0 cmd <- 140
PCI: 00:10.0 subsystem <- 00/00
PCI: 00:10.0 cmd <- 141
PCI: 00:10.1 subsystem <- 00/00
PCI: 00:10.1 cmd <- 141
PCI: 00:10.2 subsystem <- 00/00
PCI: 00:10.2 cmd <- 141
PCI: 00:10.3 subsystem <- 00/00
PCI: 00:10.3 cmd <- 142
PCI: 00:11.0 cmd <- 147
PNP: 002e.0 - enabling
PNP: 002e.1 - enabling
PNP: 002e.2 - enabling
PNP: 002e.3 - enabling
PNP: 002e.b - enabling
PCI: 00:11.1 cmd <- 147
PCI: 00:11.5 subsystem <- 00/00
PCI: 00:11.5 cmd <- 141
PCI: 00:12.0 cmd <- 1c3
done.
Initializing devices...
Root Device init
PCI: 00:10.0 init
PCI: 00:10.1 init
PCI: 00:10.2 init
PCI: 00:10.3 init
PCI: 00:11.0 init
vt8235 init
RTC Init
Invalid CMOS LB checksum
pci_routing_fixup: dev is 00010fa0
setting firewire
setting usb
Assigning IRQ 5 to 0:10.0
Readback = 5
pci_level_irq: lower order bits are wrong: want 0x0, got 0x20
Assigning IRQ 9 to 0:10.1
Readback = 9
pci_level_irq: lower order bits are wrong: want 0x0, got 0x20
Assigning IRQ 9 to 0:10.2
Readback = 9
pci_level_irq: lower order bits are wrong: want 0x0, got 0x20
Assigning IRQ 5 to 0:10.3
Readback = 5
pci_level_irq: lower order bits are wrong: want 0x0, got 0x20
setting vt8235
Assigning IRQ 5 to 0:11.1
Readback = 5
pci_level_irq: lower order bits are wrong: want 0x0, got 0x20
Assigning IRQ 9 to 0:11.5
Readback = 9
pci_level_irq: lower order bits are wrong: want 0x0, got 0x20
setting ethernet
Assigning IRQ 5 to 0:12.0
Readback = 5
pci_level_irq: lower order bits are wrong: want 0x0, got 0x20
setting vga
Assigning IRQ 5 to 1:0.0
Readback = 5
pci_level_irq: lower order bits are wrong: want 0x0, got 0x20
setting pci slot
setting cardbus slot
setting riser slot
PNP: 002e.0 init
PNP: 002e.1 init
PNP: 002e.2 init
PNP: 002e.3 init
PNP: 002e.b init
PCI: 00:11.1 init
Enabling VIA IDE.
ide_init: enabling compatibility IDE addresses enables in reg 0x42 0x0 enables in reg 0x42 read back as 0x0 enables in reg 0x40 0x13 enables in reg 0x40 read back as 0x13 enables in reg 0x9 0x8a enables in reg 0x9 read back as 0x8a command in reg 0x4 0x7 command in reg 0x4 reads back as 0x7 PCI: 00:11.5 init PCI: 00:12.0 init Configuring VIA Rhine LAN APIC_CLUSTER: 0 init Initializing CPU #0 CPU: vendor Centaur device 673 Enabling cache
Setting fixed MTRRs(0-88) type: UC Setting fixed MTRRs(0-16) Type: WB Setting fixed MTRRs(24-88) Type: WB DONE fixed MTRRs Setting variable MTRR 0, base: 0MB, range: 128MB, type WB Setting variable MTRR 1, base: 128MB, range: 64MB, type WB Setting variable MTRR 2, base: 192MB, range: 32MB, type WB DONE variable MTRRs Clear out the extra MTRR's
MTRR check Fixed MTRRs : Enabled Variable MTRRs: Enabled
Disabling local apic...done. CPU #0 Initialized PCI: 00:00.0 init VT8623 random fixup ... Frame buffer at d0000000 PCI: 00:01.0 init VT8623 AGP random fixup ... PCI: 01:00.0 init VGA random fixup ... INSTALL REAL-MODE IDT DO THE VGA BIOS found VGA: vid=1106, did=3122 rom base, size: fffc0000 write_protect_vgabios bus/devfn = 0x100 biosint: INT# 0x15 biosint: eax 0x5f00 ebx 0x18538 ecx 0x17fa0 edx 0xa biosint: ebp 0x17f70 esp 0xff2 edi 0xecf0 esi 0x18538 biosint: ip 0x637f cs 0xc000 flags 0x46 biosint: INT# 0x1a biosint: eax 0xb108 ebx 0x10000 ecx 0x10000 edx 0x3d5 biosint: ebp 0x17f70 esp 0xfcc edi 0xf6 esi 0x155eb biosint: ip 0x40da cs 0xc000 flags 0x46 0xb108: bus 0 devfn 0x0 reg 0xf6 val 0x12 biosint: INT# 0x15 biosint: eax 0x5f0f ebx 0x18538 ecx 0x7fa0 edx 0x3d5 biosint: ebp 0x17f70 esp 0xfee edi 0x44 esi 0x18538 biosint: ip 0x647e cs 0xc000 flags 0x7 biosint: INT# 0x15 biosint: eax 0x5f02 ebx 0x18538 ecx 0x7f01 edx 0x3d5 biosint: ebp 0x17f70 esp 0xfdc edi 0x44 esi 0x18538 biosint: ip 0x63cb cs 0xc000 flags 0x46 biosint: INT# 0x15 biosint: eax 0x5f18 ebx 0x18501 ecx 0x7fa0 edx 0x3d5 biosint: ebp 0x17f70 esp 0xfde edi 0x44 esi 0x18538 biosint: ip 0x6496 cs 0xc000 flags 0x46 biosint: INT# 0x15 biosint: eax 0x5f06 ebx 0x18001 ecx 0x1 edx 0x0 biosint: ebp 0x10fd6 esp 0xfa4 edi 0x0 esi 0x14672 biosint: ip 0x63dc cs 0xc000 flags 0x246 biosint: INT# 0x15 biosint: eax 0x5f08 ebx 0x10d01 ecx 0x8301 edx 0xd4 biosint: ebp 0x10fd6 esp 0xfa4 edi 0x0 esi 0x10d0e biosint: ip 0x63e8 cs 0xc000 flags 0x246 Devices initialized Copying IRQ routing tables to 0xf0000...done. Verifing copy of IRQ routing tables at 0xf0000...done Checking IRQ routing table consistency... check_pirq_routing_table() - irq_routing_table located at: 0x000f0000 done. ACPI: Writing ACPI tables at f0400... ACPI: * FACS ACPI: * DSDT @ 000f049e Length 3f0 ACPI: * FADT ACPI: added table 1/5 Length now 40 ACPI: done. Moving GDT to 0x500...ok Wrote linuxbios table at: 00000530 - 00000b80 checksum 68b7
Welcome to elfboot, the open sourced starter. January 2002, Eric Biederman. Version 1.3
33:stream_init() - rom_stream: 0xfffd0000 - 0xfffeffff Found ELF candiate at offset 0 New segment addr 0x100000 size 0x23760 offset 0xc0 filesize 0x96e8 (cleaned up) New segment addr 0x100000 size 0x23760 offset 0xc0 filesize 0x96e8 New segment addr 0x123760 size 0x48 offset 0x97c0 filesize 0x48 (cleaned up) New segment addr 0x123760 size 0x48 offset 0x97c0 filesize 0x48 Dropping non PT_LOAD segment Dropping non PT_LOAD segment Loading Segment: addr: 0x0000000000100000 memsz: 0x0000000000023760 filesz: 0x00000000000096e8 Clearing Segment: addr: 0x00000000001096e8 memsz: 0x000000000001a078 Loading Segment: addr: 0x0000000000123760 memsz: 0x0000000000000048 filesz: 0x0000000000000048 Jumping to boot code at 0x107860 FILO version 0.4.2 (root@embedded) Tue Apr 25 20:15:07 PDT 2006 boot: hda1:/vmlinuz root=/dev/hda1 console=tty0 console=ttyS0,115200 hda: LBA 80GB: WDC WD800JB-00FMA0 Mounted ext2fs Found Linux version 2.6.16.5 (root@Proteus) #3 Sun Apr 16 21:06:34 PDT 2006 bzImage. Loading kernel... ok Jumping to entry point... [17179569.184000] Linux version 2.6.16.5 (root@Proteus) (gcc version 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu9)) #3 Sun Apr 16 21:06:34 PDT 2006 [17179569.184000] BIOS-provided physical RAM map: [17179569.184000] BIOS-e820: 0000000000000be0 - 00000000000a0000 (usable) [17179569.184000] BIOS-e820: 0000000000100000 - 000000000e000000 (usable) [17179569.184000] 0MB HIGHMEM available. [17179569.184000] 224MB LOWMEM available. [17179569.184000] DMI not present or invalid. [17179569.184000] ACPI: PM-Timer IO Port: 0x408 [17179569.184000] Allocating PCI resources starting at 10000000 (gap: 0e000000:f2000000) [17179569.184000] Built 1 zonelists [17179569.184000] Kernel command line: root=/dev/hda1 console=tty0 console=ttyS0,115200 [17179569.184000] No local APIC present or hardware disabled [17179569.184000] Initializing CPU#0 [17179569.184000] PID hash table entries: 1024 (order: 10, 16384 bytes) [17179569.184000] Detected 533.193 MHz processor. 
[17179569.184000] Using pmtmr for high-res timesource
[17179569.184000] Console: colour VGA+ 80x25
[17179571.984000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
[17179571.992000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
[17179572.052000] Memory: 223968k/229376k available (1628k kernel code, 4992k reserved, 604k data, 232k init, 0k highmem)
[17179572.064000] Checking if this processor honours the WP bit even in supervisor mode... Ok.
[17179572.152000] Calibrating delay using timer specific routine.. 1068.40 BogoMIPS (lpj=2136811)
[17179572.160000] Security Framework v1.0.0 initialized
[17179572.164000] SELinux: Disabled at boot.
[17179572.168000] Mount-cache hash table entries: 512
[17179572.172000] CPU: L1 I Cache: 64K (32 bytes/line), D cache 64K (32 bytes/line)
[17179572.180000] CPU: L2 Cache: 64K (32 bytes/line)
[17179572.184000] CPU: Centaur VIA Samuel 2 stepping 03
[17179572.192000] Checking 'hlt' instruction... OK.
[17179572.268000] ACPI: setting ELCR to 0020 (from 0220)
[17179572.276000] NET: Registered protocol family 16
[17179572.280000] EISA bus registered
[17179572.284000] ACPI: bus type pci registered
[17179572.340000] Unable to handle kernel paging request at virtual address 6de8d753
[17179572.340000] printing eip:
[17179572.340000] c00fab46
[17179572.340000] *pde = 00000000
[17179572.340000] Oops: 0002 [#1]
[17179572.340000] Modules linked in:
[17179572.340000] CPU: 0
[17179572.340000] EIP: 0060:[<c00fab46>] Not tainted VLI
[17179572.340000] EFLAGS: 00010087 (2.6.16.5 #3)
[17179572.340000] EIP is at 0xc00fab46
[17179572.340000] eax: 00008701 ebx: 000f0000 ecx: 0000d5b4 edx: 00000000
[17179572.340000] esi: 00000206 edi: c02f9874 ebp: 000fa960 esp: c121ffb6
[17179572.340000] ds: 007b es: 0000 ss: 0068
[17179572.340000] Process swapper (pid: 1, threadinfo=c121e000 task=c1217a70)
[17179572.340000] Stack: <0>c00fa99d 00000000 a2350006 0060c034 20440000 83c8c02b 0000c036 00000000
[17179572.340000] 00000000 03070000 0290c010 0000c010 10050000 0000c010 00000000 00000000
[17179572.340000] 00000000 00000000
[17179572.340000] Call Trace:
[17179572.340000] Code: 00 f8 c3 87 db 52 50 e8 95 01 00 00 72 12 66 51 66 8b df 8a eb e8 23 02 00 00 66 59 8a c8 b4 00 8a d4 58 8a e2 5a 0a e4 74 01 f5 <d3> 90 52 50 e8 6d 01 10 00 72 1a 66 f7 c7 01 00 74 04 b4 87 eb
[17179572.340000] <0>Kernel panic - not syncing: Attempted to kill init!
[17179572.340000]