The saga continues:
I have more information on this problem. A stock 2.4.25 kernel from www.kernel.org does not work. A stock 2.4.24 kernel from www.kernel.org does work. They were compiled with identical .config files (I'd be happy to send them along for anyone curious).
Here is where I'm at:
STD BIOS->Etherboot->Linux + Initrd (single kernel from mkelfImage) works great for all kernels I've tried except 2.4.25. This includes 2.4.21 and 2.4.22. It also includes 2.6.0-test3.
STD BIOS->ISOLINUX->Linux + Initrd (separate files) works for all kernels, even 2.4.25.
I'm going to continue down this path on the assumption that the cause is a difference between the two kernels (2.4.24 and 2.4.25). Anyone care to venture a guess on where to start looking?
-don
Don Elwell wrote:
ron minnich wrote:
Well, your output is very puzzling. We do this all the time, although I have not yet tried 2.4.25. I think the last time I did this was with 2.4.22.
The exact same setup (identical initrd image) works perfectly with 2.4.22 *and* 2.6.0-test3 kernels. I'll be building and trying a plain (i.e. from www.kernel.org) 2.4.25 kernel this evening (the version I've been testing with has the -lck1 performance patches applied, which could themselves be causing the problem).
It's almost as though something in the kernel is trashing the ramdisk when it starts up. But you say it did work from cdrom, which makes this even weirder.
Yes, the errant kernel (2.4.25) will boot from CD-ROM using ISOLINUX (separate kernel/initrd images). That identical kernel and initrd will not boot using Etherboot (single kernel/initrd image made with mkelfImage). Simply replacing the kernel in the mkelfImage image with either 2.4.22 or 2.6.0-test3 makes the system boot normally.
I hope it is not some random DMA from ethernet landing sometime after the kernel starts, but it sounds too reproducible.
I agree. This happens at exactly the same point, every time. Coupled with the fact that the other kernels work, I think Etherboot and mkelfImage are doing what they are supposed to do.
I think you're stuck with some more etherboot debugging. Do you have to use net boot?
The systems we're building only have Ethernet/serial to the outside world (they do have a CF internal). My thought was to have them net boot during production test into an image that, in addition to running production tests on the system, also puts the correct application image(s) on the CF. Also, I didn't want the techs to have to take apart the units to re-image the CF.
But, PROGRESS!!!!! I'm not married to the 2.4.25 kernel in any way. I could just as easily use 2.4.22 (which we use in other systems now anyway). Unfortunately :-) my curiosity has been piqued -- I gotta know what's causing this!
More to come.
-don