Hello,
I'm not sure this is the right place for this question, but I don't know where else to send it.
I'm having trouble booting a Linux kernel/initrd image. My original goal was:
LinuxBIOS->Etherboot->Linux (single mkelfImage kernel/initrd)
In the course of troubleshooting this problem I'm working with one of two configurations:
Original BIOS->Etherboot via Floppy->Linux (single mkelfImage kernel/initrd).
and
Original BIOS->ISOLinux->Linux (separate kernel and initrd).
Booting the kernel/initrd via the CD-ROM works great. Booting via Etherboot doesn't. I get kernel panics indicating that the kernel can't mount its root fs.
I've modified the mkelfImage convert_params.c file to print out everything. Just before the "Jumping to Linux" line, I added code to print out the first 512 bytes of the initrd image. The values printed match perfectly with the first 512 bytes of my initrd image. The kernel table values for the initrd also look good: ramdisk at 0x800000.
I also added code in the Linux kernel do_mounts.c file (just after the "RAMDISK: could not find valid ramdisk image") to print out the first 512 bytes of the buf buffer. At this point, the values in no way match the values in the initrd image (nor to those printed out in convert_params.c).
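For readers following along: the check behind that RAMDISK message amounts to looking for a known signature at the start of the image. The sketch below shows that kind of test in Python. It is written from memory of how 2.4-era kernels recognise gzip, romfs, ext2 and minix images, so the function name `identify_ramdisk_image` and the exact set of formats are assumptions, not a copy of do_mounts.c. Note that a 512-byte dump is only enough to spot gzip or romfs; the ext2 and minix magics sit more than 1024 bytes into the image.

```python
import struct
from typing import Optional


def identify_ramdisk_image(buf: bytes) -> Optional[str]:
    """Guess the image type from its leading bytes (hypothetical helper)."""
    # gzip: magic bytes 0x1f 0x8b (0x1f 0x9e is the older variant).
    if len(buf) >= 2 and buf[0] == 0x1F and buf[1] in (0x8B, 0x9E):
        return "gzip"
    # romfs: the image starts with the ASCII tag "-rom1fs-".
    if buf[:8] == b"-rom1fs-":
        return "romfs"
    # ext2: superblock at byte 1024, 16-bit little-endian magic at offset 56.
    if len(buf) >= 1024 + 58:
        (magic,) = struct.unpack_from("<H", buf, 1024 + 56)
        if magic == 0xEF53:
            return "ext2"
    # minix: superblock at byte 1024, 16-bit little-endian magic at offset 16.
    if len(buf) >= 1024 + 18:
        (magic,) = struct.unpack_from("<H", buf, 1024 + 16)
        if magic in (0x137F, 0x138F):
            return "minix"
    return None
```

Running this over the bytes captured in convert_params.c and over the bytes captured in do_mounts.c would show whether the signature is still present at each point.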
Anyone got any insight on what might be happening here or how to continue debugging this?
Thanks,
-don
On Mon, 22 Mar 2004, Don Elwell wrote:
> LinuxBIOS->Etherboot->Linux (single mkelfImage kernel/initrd)
we do this now.
> Booting the kernel/initrd via the CD-ROM works great. Booting via Etherboot doesn't. I get kernel panics indicating that the kernel can't mount its root fs.
can you send the string?
I wonder if by some chance etherboot is doing something to IDE that makes it work "correctly" for the kernel in the CD-ROM case.
> I also added code in the Linux kernel do_mounts.c file (just after the "RAMDISK: could not find valid ramdisk image") to print out the first 512 bytes of the buf buffer. At this point, the values in no way match the values in the initrd image (nor to those printed out in convert_params.c).
hmm. etherboot problem? What kind of ethernet chip?
ron
Thanks Ron,
Attached is the serial dump of what is happening -- the text in double brackets [[ ]] is notes from me. From the attached logs and the additional debug prints in the kernel and in convert_params.c, I'm pretty convinced it's *not* mkelfImage. I think mkelfImage is doing exactly what it's supposed to do.
I think it could still be Etherboot, but I think it's more likely something with the kernel and/or the version of the kernel I'm using. I've come to this conclusion because the initrd data (the contents at 0x800000) just before the mkelfImage/convert_params.c handoff to Linux is perfect. I'm in the process of building new and different kernel versions to see if it makes a difference. What version of the kernel have you used successfully with Etherboot/single mkelfImage?
Perhaps you'll see something that I don't. I'm starting to have "can't see the forest for the trees" issues...
-don
ROM segment 0x0000 length 0x0000 reloc 0x00020000
Etherboot 5.2.2 (GPL) http://etherboot.org Tagged ELF (Multiboot) for [3C90X]
Relocating _text from: [00014240,00023bf0) to [0f6f0650,0f700000)
Boot from (N)etwork or (Q)uit? N
Probing pci nic... [3c9805]
3C90X Driver 2.00 Copyright 1999 LightSys Technology Services, Inc.
Portions Copyright 1999 Steve Smith
Provided with ABSOLUTELY NO WARRANTY.
-------------------------------------------------------------------------------
MAC Address = 00:01:03:D6:46:4A
Connectors present: 10Base-T / 100Base-TX.
Searching for server (DHCP)...
..Me: 192.168.0.230, Server: 192.168.0.13, Gateway 192.168.0.1
Loading 192.168.0.13:dir911t.sys ...(ELF)... [[dots removed]] .....done
Issuing RESET:
hello world
count_elf_notes 0f6fe2f0
elf_note = 0f6fe2fc elf_namesz = 0000000a elf_descsz = 00000004 elf_type = 00000005 elf_name = 0f6fe308 elf_desc = 0f6fe314
elf_note = 0f6fe318 elf_namesz = 0000000a elf_descsz = 00000004 elf_type = 00000006 elf_name = 0f6fe324 elf_desc = 0f6fe330
elf_note = 0f6fe334 elf_namesz = 0000000a elf_descsz = 0000028c elf_type = 00000008 elf_name = 0f6fe340 elf_desc = 0f6fe34c
elf_note = 0f6fe5d8 elf_namesz = 00000000 elf_descsz = 0000000a elf_type = 00000002 elf_name = 0f6fe5e4 elf_desc = 0f6fe5e4
elf_note = 0f6fe5f0 elf_namesz = 00000000 elf_descsz = 00000006 elf_type = 00000003 elf_name = 0f6fe5fc elf_desc = 0f6fe5fc
elf_note = 0f6fe604 elf_namesz = 00000000 elf_descsz = 00000007 elf_type = 00000001 elf_name = 0f6fe610 elf_desc = 0f6fe610
elf_note = 0f6fe618 elf_namesz = 00000000 elf_descsz = 00000080 elf_type = 00000006 elf_name = 0f6fe624 elf_desc = 0f6fe624
elf_note = 0f6fe6a4 elf_namesz = 00000000 elf_descsz = 00000001 elf_type = 00000004 elf_name = 0f6fe6b0 elf_desc = 0f6fe6b0
Firmware type: PCBIOS
EXT_MEM_K=00000156
ALT_MEM_K=0001013a
print_offsets
orig_x =00000000
orig_y =00000001
ext_mem_k =00000002
orig_video_page =00000004
orig_video_mode =00000006
orig_video_cols =00000007
unused2 =00000008
orig_video_ega_bx =0000000a
unused3 =0000000c
orig_video_lines =0000000e
orig_video_isVGA =0000000f
orig_video_points =00000010
lfb_width =00000012
lfb_height =00000014
lfb_depth =00000016
lfb_base =00000018
lfb_size =0000001c
cl_magic =00000020
cl_offset =00000022
lfb_linelength =00000024
red_size =00000026
red_pos =00000027
green_size =00000028
green_pos =00000029
blue_size =0000002a
blue_pos =0000002b
rsvd_size =0000002c
rsvd_pos =0000002d
vesapm_seg =0000002e
vesapm_off =00000030
pages =00000032
reserved4 =00000034
apm_bios_info =00000040
drive_info =00000080
sys_desc_table =000000a0
alt_mem_k =000001e0
reserved5 =000001e4
e820_map_nr =000001e8
reserved6 =000001e9
mount_root_rdonly =000001f2
reserved7 =000001f4
ramdisk_flags =000001f8
reserved8 =000001fa
orig_root_dev =000001fc
reserved9 =000001fe
aux_device_info =000001ff
reserved10 =00000200
param_block_signature=00000202
param_block_version =00000206
reserved11 =00000208
loader_type =00000210
loader_flags =00000211
reserved12 =00000212
kernel_start =00000214
initrd_start =00000218
initrd_size =0000021c
reserved13 =00000220
e820_map =000002d0
reserved16 =00000550
command_line =00000800
reserved17 =00000900
print_linux_params
orig_x =00000000
orig_y =00000019
orig_video_page =00000000
orig_video_mode =00000000
orig_video_cols =00000050
orig_video_lines =00000019
orig_video_ega_bx=00000000
orig_video_isVGA =00000001
orig_video_points=00000010
sys_dest_table_len=00000000
ext_mem_k =00000156
alt_mem_k =0001013a
e820_map_nr =00000007
addr[00000000] =0000000000000000 size[00000000] =000000000009fc00 type[00000000] =0001631400000001
addr[00000001] =000000000009fc00 size[00000001] =0000000000000400 type[00000001] =0001631400000002
addr[00000002] =00000000000f0000 size[00000002] =0000000000010000 type[00000002] =0001631400000002
addr[00000003] =0000000000100000 size[00000003] =000000000f6f0000 type[00000003] =0001631400000001
addr[00000004] =000000000f7f0000 size[00000004] =0000000000003000 type[00000004] =0001631400000004
addr[00000005] =000000000f7f3000 size[00000005] =000000000000d000 type[00000005] =0001631400000003
addr[00000006] =00000000fec00000 size[00000006] =0000000001400000 type[00000006] =0001631400000002
addr[00000007] =0000000000000000 size[00000007] =0000000000000000 type[00000007] =0001631400000000
addr[00000008] =0000000000000000 size[00000008] =0000000000000000 type[00000008] =0001631400000000
addr[00000009] =0000000000000000 size[00000009] =0000000000000000 type[00000009] =0001631400000000
addr[0000000a] =0000000000000000 size[0000000a] =0000000000000000 type[0000000a] =0001631400000000
addr[0000000b] =0000000000000000 size[0000000b] =0000000000000000 type[0000000b] =0001631400000000
addr[0000000c] =0000000000000000 size[0000000c] =0000000000000000 type[0000000c] =0001631400000000
addr[0000000d] =0000000000000000 size[0000000d] =0000000000000000 type[0000000d] =0001631400000000
addr[0000000e] =0000000000000000 size[0000000e] =0000000000000000 type[0000000e] =0001631400000000
addr[0000000f] =0000000000000000 size[0000000f] =0000000000000000 type[0000000f] =0001631400000000
addr[00000010] =0000000000000000 size[00000010] =0000000000000000 type[00000010] =0001631400000000
addr[00000011] =0000000000000000 size[00000011] =0000000000000000 type[00000011] =0001631400000000
addr[00000012] =0000000000000000 size[00000012] =0000000000000000 type[00000012] =0001631400000000
addr[00000013] =0000000000000000 size[00000013] =0000000000000000 type[00000013] =0001631400000000
addr[00000014] =0000000000000000 size[00000014] =0000000000000000 type[00000014] =0001631400000000
addr[00000015] =0000000000000000 size[00000015] =0000000000000000 type[00000015] =0001631400000000
addr[00000016] =0000000000000000 size[00000016] =0000000000000000 type[00000016] =0001631400000000
addr[00000017] =0000000000000000 size[00000017] =0000000000000000 type[00000017] =0001631400000000
addr[00000018] =0000000000000000 size[00000018] =0000000000000000 type[00000018] =0001631400000000
addr[00000019] =0000000000000000 size[00000019] =0000000000000000 type[00000019] =0001631400000000
addr[0000001a] =0000000000000000 size[0000001a] =0000000000000000 type[0000001a] =0001631400000000
addr[0000001b] =0000000000000000 size[0000001b] =0000000000000000 type[0000001b] =0001631400000000
addr[0000001c] =0000000000000000 size[0000001c] =0000000000000000 type[0000001c] =0001631400000000
addr[0000001d] =0000000000000000 size[0000001d] =0000000000000000 type[0000001d] =0001631400000000
addr[0000001e] =0000000000000000 size[0000001e] =0000000000000000 type[0000001e] =0001631400000000
addr[0000001f] =0000000000000000 size[0000001f] =0000000000000000 type[0000001f] =0001631400000000
mount_root_rdonly=0000ffff
ramdisk_flags =00000000
orig_root_dev =00000300
aux_device_info =00000000
param_block_signature=53726448
loader_type =00000050
loader_flags =00000000
initrd_start =00800000
initrd_size =0011c08e
cl_magic =0000a33f
cl_offset =00000800
command_line =ramdisk_size=40960 root=/dev/ram0 devfs=nomount console=ttyS0,115200 rw
[[ Here is where I dump the first 512 bytes at 0x800000. They match perfectly with the first 512 bytes of my initrd image ]]
Jumping to Linux
Linux version 2.4.25-lck1 (root@pzdev3) (gcc version 3.2 20020903 (Red Hat Linux 8.0 3.2-7)) #20 Mon Mar 22 17:22:41 EST 2004
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000000f7f0000 (usable)
BIOS-e820: 000000000f7f0000 - 000000000f7f3000 (ACPI NVS)
BIOS-e820: 000000000f7f3000 - 000000000f800000 (ACPI data)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
247MB LOWMEM available.
[[ The next 4 lines are debug statements inserted by me ]]
INITRD_START = 0x00800000
INITRD_SIZE = 0x0011C08E
initrd_start = 0xC0800000
initrd_end = 0xC091C08E
On node 0 totalpages: 63472
zone(0): 4096 pages.
zone(1): 59376 pages.
zone(2): 0 pages.
Kernel command line: ramdisk_size=40960 root=/dev/ram0 devfs=nomount console=ttyS0,115200 rw
Found and enabled local APIC!
Initializing CPU#0
Detected 2000.010 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 3973.12 BogoMIPS
Memory: 243820k/253888k available (3850k kernel code, 9680k reserved, 1526k data, 160k init, 0k highmem)
Dentry cache hash table entries: 32768 (order: 6, 262144 bytes)
Inode cache hash table entries: 16384 (order: 5, 131072 bytes)
Mount cache hash table entries: 512 (order: 0, 4096 bytes)
Buffer cache hash table entries: 16384 (order: 4, 65536 bytes)
Page-cache hash table entries: 65536 (order: 6, 262144 bytes)
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 128K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: Intel(R) Celeron(R) CPU 2.00GHz stepping 07
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 1999.1205 MHz.
..... host bus clock speed is 99.1960 MHz.
cpu: 0, clocks: 199960, slice: 99980
CPU0T0:199952,T1:99968,D:4,S:99980,C:199960
mtrr: v1.40 (20010327) Richard Gooch (rgooch@atnf.csiro.au)
mtrr: detected mtrr type: Intel
PCI: PCI BIOS revision 2.10 entry at 0xfb430, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
PCI: Ignoring BAR0-3 of IDE controller 00:1f.1
Transparent bridge - Intel Corp. 82801BA/CA/DB/EB PCI Bridge
PCI: Using IRQ router PIIX/ICH [8086/24c0] at 00:1f.0
PCI: Found IRQ 11 for device 00:1f.1
PCI: Sharing IRQ 11 with 00:02.0
PCI: Sharing IRQ 11 with 01:00.0
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
Starting kswapd
Journalled Block Device driver loaded
Coda Kernel/Venus communications, v5.3.18, coda@cs.cmu.edu
devfs: v1.12c (20020818) Richard Gooch (rgooch@atnf.csiro.au)
devfs: boot_options: 0x0
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
udf: registering filesystem
SGI XFS with no debug enabled
i2c-core.o: i2c core module version 2.6.1 (20010830)
i2c-dev.o: i2c /dev entries driver module version 2.6.1 (20010830)
i2c-proc.o version 2.6.1 (20010830)
Detected PS/2 Mouse Port.
pty: 256 Unix98 ptys configured
Serial driver version 5.05c (2001-07-08) with MANY_PORTS SHARE_IRQ SERIAL_PCI ISAPNP enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
Non-volatile memory driver v1.2
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
RAMDISK driver initialized: 16 RAM disks of 40960K size 1024 blocksize
loop: loaded (max 8 devices)
PCI: Found IRQ 11 for device 01:00.0
PCI: Sharing IRQ 11 with 00:02.0
PCI: Sharing IRQ 11 with 00:1f.1
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
See Documentation/networking/vortex.txt
01:00.0: 3Com PCI 3c980C Python-T at 0xc000. Vers LK1.1.18-ac
00:01:03:d6:46:4a, IRQ 11
product code 4b50 rev 00.14 date 02-16-01
8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface.
MII transceiver found at address 24, status 782d.
Enabling bus-master transmits and whole-frame receives.
01:00.0: scatter/gather enabled. h/w checksums enabled
Linux agpgart interface v0.99 (c) Jeff Hartmann
agpgart: Maximum main memory to use for agp memory: 196M
agpgart: Detected an Intel(R) 845G Chipset.
agpgart: Detected 8060K stolen memory.
agpgart: AGP aperture is 128M @ 0xd0000000
[drm] Initialized tdfx 1.0.0 20010216 on minor 0
[drm] AGP 0.99 Aperture @ 0xd0000000 128MB
[drm] Initialized radeon 1.7.0 20020828 on minor 1
[drm] AGP 0.99 Aperture @ 0xd0000000 128MB
[drm] Initialized i810 1.2.1 20020211 on minor 2
Uniform Multi-Platform E-IDE driver Revision: 7.00beta4-2.4
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH4: IDE controller at PCI slot 00:1f.1
PCI: Found IRQ 11 for device 00:1f.1
PCI: Sharing IRQ 11 with 00:02.0
PCI: Sharing IRQ 11 with 01:00.0
ICH4: chipset revision 2
ICH4: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:pio, hdb:pio
ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:pio
hdc: AOpen 12X DVD-ROM/ASH 01112001, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hdc: attached ide-cdrom driver.
hdc: ATAPI 40X DVD-ROM drive, 512kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.12
SCSI subsystem driver Revision: 1.00
scsi0 : SCSI host adapter emulation for IDE ATAPI devices
raw1394: /dev/raw1394 device initialized
sbp2: $Rev: 1074 $ Ben Collins bcollins@debian.org
ALI 5455 + AC97 Audio, version 0.03-ac, 23:36:08 Mar 18 2004
Intel 810 + AC97 Audio, version 0.24, 23:36:11 Mar 18 2004
PCI: Found IRQ 10 for device 00:1f.5
PCI: Sharing IRQ 10 with 00:1f.3
i810: Intel ICH4 found at IO 0xe400 and 0xe000, MEM 0xde081000 and 0xde082000, IRQ 10
i810: Intel ICH4 mmio at 0xd00e5000 and 0xd00e7000
i810_audio: Primary codec has ID 2
i810_audio: Audio Controller supports 6 channels.
i810_audio: Defaulting to base 2 channel mode.
i810_audio: Resetting connection 0
i810_audio: Connection 0 with codec id 2
ac97_codec: AC97 Audio codec, id: ALG48 (Unknown)
i810_audio: only 48Khz playback available.
i810_audio: AC'97 codec 2 supports AMAP, total channels = 2
es1371: version v0.32 time 23:36:13 Mar 18 2004
Linux Kernel Card Services 3.1.22
options: [pci] [cardbus]
usb.c: registered new driver usbdevfs
usb.c: registered new driver hub
host/uhci.c: USB Universal Host Controller Interface driver v1.1
host/uhci.c: USB UHCI at I/O 0xd800, IRQ 11
usb.c: new USB bus registered, assigned bus number 1
hub.c: USB hub found
hub.c: 2 ports detected
host/uhci.c: USB UHCI at I/O 0xd000, IRQ 11
usb.c: new USB bus registered, assigned bus number 2
hub.c: USB hub found
hub.c: 2 ports detected
host/uhci.c: USB UHCI at I/O 0xd400, IRQ 9
usb.c: new USB bus registered, assigned bus number 3
hub.c: USB hub found
hub.c: 2 ports detected
usb.c: registered new driver hid
hid-core.c: v1.8.1 Andreas Gal, Vojtech Pavlik vojtech@suse.cz
hid-core.c: USB HID support drivers
usb.c: registered new driver audio
audio.c: v1.0.0:USB Audio Class driver
pegasus.c: v0.4.32 (2003/06/06):Pegasus/Pegasus II USB Ethernet driver
usb.c: registered new driver pegasus
rtl8150.c: rtl8150 based usb-ethernet driver v0.4.3 (2002/12/31)
usb.c: registered new driver rtl8150
catc.c: v2.8 CATC EL1210A NetMate USB Ethernet driver
usb.c: registered new driver catc
usb.c: registered new driver kaweth
usb.c: registered new driver CDCEther
bluetooth.c: USB Bluetooth support registered
usb.c: registered new driver bluetty
bluetooth.c: USB Bluetooth tty driver v0.13
usb.c: registered new driver serial
usbserial.c: USB Serial support registered for Generic
usbserial.c: USB Serial Driver core v1.4
usbserial.c: USB Serial support registered for PL-2303
pl2303.c: Prolific PL2303 USB to serial adaptor driver v0.10
Initializing USB Mass Storage driver...
usb.c: registered new driver usb-storage
USB Mass Storage support registered.
mice: PS/2 mouse device common for all mice
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
IP: routing cache hash table of 2048 buckets, 16Kbytes
TCP: Hash tables configured (established 16384 bind 32768)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
ds: no socket drivers loaded!
RAMDISK: Couldn't find valid RAM disk image starting at 0.
[[ This line was added by me to find out where in VM the kernel thinks the initrd resides ]].
RAMDISK: the ramdisk image starts at 0xC0800000 (-1065353216)
[[ Here is where I print out the first 512 bytes of the ramdisk (i.e. location 0xC0800000 -> 0xC08001FF). The data in no way looks like data in the initrd image ]]
Freeing initrd memory: 1136k freed
FAT: bogus logical sector size 0
UMSDOS: msdos_read_super failed, mount aborted.
FAT: bogus logical sector size 0
FAT: bogus logical sector size 0
ufs was compiled with read-only support, can't be mounted as read-write
UDF-fs: No partition found (1)
sh-2021: reiserfs_read_super: can not find reiserfs on ramdisk(1,0)
XFS: bad magic number
XFS: SB validate failed
Kernel panic: VFS: Unable to mount root fs on 01:00
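The address figures in the log above can be cross-checked with a little arithmetic. The Python sketch below does so under two assumptions that are not stated in the log: that the kernel uses the default i386 3G/1G split (PAGE_OFFSET of 0xC0000000), and that only the low 32 bits of each printed e820 type field are meaningful. The helper names are hypothetical; the numbers are copied from the dump.

```python
PAGE_OFFSET = 0xC0000000  # assumption: default i386 3G/1G memory split
E820_RAM = 1              # e820 type for usable RAM

INITRD_START = 0x00800000  # physical address from the parameter block
INITRD_SIZE = 0x0011C08E   # size in bytes from the parameter block

# (start, size, type) for e820 entries 0..6 in the log; type is taken as
# the low 32 bits of the printed value (assumption).
E820_MAP = [
    (0x00000000, 0x0009FC00, 1),
    (0x0009FC00, 0x00000400, 2),
    (0x000F0000, 0x00010000, 2),
    (0x00100000, 0x0F6F0000, 1),
    (0x0F7F0000, 0x00003000, 4),
    (0x0F7F3000, 0x0000D000, 3),
    (0xFEC00000, 0x01400000, 2),
]


def phys_to_virt(phys):
    """Map a low-memory physical address to its kernel virtual address."""
    return (phys + PAGE_OFFSET) & 0xFFFFFFFF


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed one (like %d)."""
    return value - (1 << 32) if value & 0x80000000 else value


def in_usable_ram(start, size, e820):
    """True if [start, start + size) lies inside a single usable region."""
    end = start + size
    return any(t == E820_RAM and s <= start and end <= s + n
               for s, n, t in e820)


print(hex(phys_to_virt(INITRD_START)))                # initrd_start
print(hex(phys_to_virt(INITRD_START + INITRD_SIZE)))  # initrd_end
print(as_signed32(phys_to_virt(INITRD_START)))        # signed form
print(in_usable_ram(INITRD_START, INITRD_SIZE, E820_MAP))
print(INITRD_SIZE >> 10, "k")                         # size in KiB
```

Under those assumptions the computed values agree with initrd_start, initrd_end, the signed number in the RAMDISK debug line, and the "1136k freed" figure, and the initrd range sits entirely inside the large usable region starting at 0x100000. That is consistent with the kernel looking at the right address and the contents having changed.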
well your output is very puzzling. We do this all the time, although I have not yet tried 2.4.25. I think the last time I did this was 2.4.22
It's almost as though something in the kernel is trashing the ramdisk when it starts up.
But you say it did work from cdrom, which makes this even weirder.
I hope it is not some random DMA from ethernet landing sometime after kernel starts, but it sounds too reproducible.
I think you're stuck with some more etherboot debugging. Do you have to use net boot?
ron
ron minnich wrote:
> well your output is very puzzling. We do this all the time, although I have not yet tried 2.4.25. I think the last time I did this was 2.4.22
The exact same setup (identical initrd image) works perfectly with 2.4.22 *and* 2.6.0-test3 kernels. I'll be building and trying a plain (i.e. from www.kernel.org) 2.4.25 kernel this evening (the version I've been testing with is the -lck1 performance patches -- which themselves could be causing the problem).
> It's almost as though something in the kernel is trashing the ramdisk when it starts up.
> But you say it did work from cdrom, which makes this even weirder.
Yes, the errant kernel (2.4.25) will boot from CD-ROM using ISOLINUX (separate kernel/initrd images). That identical kernel will not boot using Etherboot (single kernel/initrd image made with mkelfImage). Simply replacing the kernel in the mkelfImage image with either 2.4.22 or 2.6.0-test3 makes the system boot normally.
> I hope it is not some random DMA from ethernet landing sometime after kernel starts, but it sounds too reproducible.
I agree. This happens exactly the same time, every time. Coupled with the fact that the other kernels work, I think Etherboot and mkelfImage are doing what they are supposed to do.
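One way to narrow down when the ramdisk contents change is to capture the same 512-byte window at several points during boot and find the first byte that differs from the image on disk. The Python sketch below assumes the serial dump is plain whitespace-separated hex byte pairs, which may not match the actual debug print format; both function names are hypothetical.

```python
from typing import Optional


def parse_hex_dump(text: str) -> bytes:
    """Turn whitespace-separated hex byte pairs into bytes."""
    # Assumption: the dump has no address column or ASCII gutter.
    return bytes.fromhex("".join(text.split()))


def first_mismatch(expected: bytes, actual: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if equal."""
    for offset, (a, b) in enumerate(zip(expected, actual)):
        if a != b:
            return offset
    # One buffer is a prefix of the other: they differ where the shorter ends.
    if len(expected) != len(actual):
        return min(len(expected), len(actual))
    return None
```

A mismatch at offset 0 would suggest the whole region was overwritten or the wrong address is being read, while a mismatch partway in would point at a partial overwrite.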
> I think you're stuck with some more etherboot debugging. Do you have to use net boot?
The systems we're building have only Ethernet and serial connections to the outside world (they do have an internal CF card). My thought was to have them net boot during production test; the booted image would run the production tests and also put the correct application image(s) on the CF. Also, I didn't want the techs to have to take the units apart to re-image the CF.
But, PROGRESS!!!!! I'm not married to the 2.4.25 kernel in any way. I could just as easily use 2.4.22 (which we use in other systems now anyway). Unfortunately :-) my curiosity has been piqued -- I gotta know what's causing this!
More to come.
-don
The saga continues:
I have more information on this problem. A stock 2.4.25 kernel from www.kernel.org does not work. A stock 2.4.24 kernel from www.kernel.org does work. They were compiled with identical .config files (I'd be happy to send them along for anyone curious).
Here is where I'm at:
STD BIOS->Etherboot->Linux + Initrd (single kernel from mkelfImage) works great for all kernels I've tried except 2.4.25. This includes 2.4.21 and 2.4.22. It also includes 2.6.0-test3.
STD BIOS->ISOLINUX->Linux + Initrd (separate files) works for all kernels, even 2.4.25.
I'm going to continue down this path, assuming it's a difference between the two kernels (2.4.24 and 2.4.25). Anyone care to venture a guess on where to start looking?
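Since 2.4.24 works and 2.4.25 does not with identical .config files, one starting point is to list which source files changed between the two trees in the areas that touch initrd handling and early memory setup. The Python sketch below does that with the standard filecmp module. The directory list in WATCHED is only a guess at where to look, not a diagnosis, and the tree paths in the usage note are placeholders.

```python
import filecmp
import os

# Assumption: directories that plausibly affect initrd handling on i386.
WATCHED = ("init", "arch/i386/kernel", "arch/i386/mm", "drivers/block", "mm")


def changed_files(old_tree, new_tree, subdirs=WATCHED):
    """List files that differ or exist in only one tree, under subdirs."""
    changed = []
    for sub in subdirs:
        old_dir = os.path.join(old_tree, sub)
        new_dir = os.path.join(new_tree, sub)
        if not (os.path.isdir(old_dir) and os.path.isdir(new_dir)):
            continue
        # Walk the comparison tree iteratively, tracking the relative path.
        stack = [(filecmp.dircmp(old_dir, new_dir), sub)]
        while stack:
            node, rel = stack.pop()
            for name in node.diff_files + node.left_only + node.right_only:
                changed.append(rel + "/" + name)
            for name, child in node.subdirs.items():
                stack.append((child, rel + "/" + name))
    return sorted(changed)
```

For example, `changed_files("linux-2.4.24", "linux-2.4.25")` would return the candidate files to diff by hand, assuming both trees are unpacked side by side under those names.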
-don