Joshua Wise <joshua@joshuawise.com> writes:
On Friday 19 December 2003 7:15 pm, Eric W. Biederman wrote:
Yes, we are reaching the point where we can converge on some of these things. LAB might be the right framework. And if it is something good it will save me the trouble of starting my own project. But it takes more than a hyperactive 2 year old to convince me. It might take a hyperactive 2 year old to remind me about interesting ideas though.
Right, well then you should see it in action. If you're in the Boston area sometime soon I can give you a demo on an iPAQ, perhaps.
Salt Lake City, and Illinois with my family for Christmas. Though a serial console logfile might be interesting.
512k is with a few ARM-specific drivers, and jffs2. It does not have networking. This is with kernel 2.6.
Hmm. I am pretty certain I have gotten 2.6 down somewhat smaller. Our practical limit with LinuxBIOS etc. is in the neighborhood of 384KB.
I've done 2.4 in 256k, but it's rather useless like that. If you do not plan to load modules at runtime, you can shave a good bit more off of it. If you write bzip2 compression support (or port the stuff from kernel 2.4), you can shave even more off of it. I've pulled off 50k with bzip2 (I have not actually written the code, I just did a bzip2 -9 < piggy > piggy.bz2).
The problem is that the bzip2 decompressor is huge, so bzip2 is usually a net loss once you count the decompressor. But it may be possible to write a tuned version. The cases I have typically worried about are much smaller, and I have made huge gains by switching to nrv2b from upx, because the decompressor is something like 100 bytes and the compression is roughly as good as gzip.
If you don't plan to have a framebuffer, you can shave some off of it. If you don't plan to have jffs2 you can shave a lot more off. Little tidbits here and there make the world go 'round.
Quite true.
Well, I think I have finally been convinced to use the MTD drivers... Mostly I prefer to flash from a production kernel rather than a bootloader, since there are more recovery options, but anyway.
Ah yes, the ancient problem. Instead of read/modify/erase/write, it often turns into read/modify/erase/poweroff. That's Bad.
:)
I will see. Does LAB restrict its kernel to a very small subset of memory? Or do you use something like kexec?
To boot a secondary kernel I use some code I wrote called armboot, although it's not very ARM-specific. It does something like this:
0) Load the new kernel into a contiguous vmalloced block.
1) We allocate 64k for a list of things that need to be relocated. We call this a pointer of type "struct physlist", which is 32 bytes. It has four ints: the new address, the old address, the block size, and whether this is the last block.
2) In blocks of the maximum kmalloc size (these blocks have to be contiguous), we kmalloc space for the kernel, and memcpy the kernel into those blocks. We then fill in a struct physlist, and move on to the next struct physlist. We can do this because kmalloc is always contiguous, and we can always map it with virt_to_phys().
3) We set up another kmalloced block for the tagged list of boot parameters that you need on ARM.
4) We set up one more kmalloced block and copy an assembler function into it, to make sure we don't wipe ourselves out while relocating.
5) We flush our data caches.
6) We call the relocated assembler function, which turns off the MMU, jumps into the relocated assembler function's physical address, and does the actual relocating. Then we jump into our newly moved zImage.
Confused yet?
Nope. Having implemented something similar it sounds sane.
- If at any point we failed, the system could be in an inconsistent state.
You will want to panic() if you fail, because you're leaking memory like a sieve, and if you failed there's probably something bigger wrong.
Ouch.
This looks more difficult than it actually is. The C segment is only about 170 lines, and the assembler bit is 90 lines.
The reason that this works is that kmalloc should allocate from the top of memory down. You need a fair bit of RAM - say, 8MB - to prevent the tail from running over other important structures, such as the list of addresses to relocate. But it seems to work well enough, and it looks like it should be fairly portable. The important code is in handhelds.org CVS, module linux/kernel26, files drivers/bootldr/armboot.c and drivers/bootldr/armboot-asm.S.
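The relocation list in the steps above can be sketched in user space roughly as follows. This is only an illustration, not the armboot code: the field names, build_physlist() and relocate() are hypothetical, a byte array stands in for physical memory, and memmove() stands in for the assembler routine. The message gives the struct as 32 bytes with four ints; four 32-bit fields occupy 16 bytes, so the sketch does not try to reproduce the size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; names are illustrative, not taken from armboot.c. */
struct physlist {
    uint32_t new_addr;  /* where the block must end up */
    uint32_t old_addr;  /* where the block was staged */
    uint32_t size;      /* block size in bytes */
    uint32_t last;      /* nonzero on the final entry */
};

/*
 * Describe an image of `len` bytes that was staged in blocks of at most
 * `block` bytes. staged[i] is where block i currently lives; the blocks
 * are to be laid out back to back starting at `dest`.
 * Returns the number of entries written, or 0 if the list is too small.
 */
static size_t build_physlist(struct physlist *list, size_t max_entries,
                             const uint32_t *staged, uint32_t dest,
                             uint32_t len, uint32_t block)
{
    size_t n = 0;
    uint32_t off = 0;

    assert(block > 0);
    while (off < len) {
        uint32_t chunk = (len - off < block) ? len - off : block;

        if (n == max_entries)
            return 0;
        list[n].new_addr = dest + off;
        list[n].old_addr = staged[n];
        list[n].size = chunk;
        list[n].last = (off + chunk == len);
        off += chunk;
        n++;
    }
    return n;
}

/* Stand-in for the assembler routine: move each block until `last` is set. */
static void relocate(uint8_t *mem, const struct physlist *list)
{
    for (;;) {
        memmove(mem + list->new_addr, mem + list->old_addr, list->size);
        if (list->last)
            break;
        list++;
    }
}
```

Note that nothing here stops an early copy from overwriting a block that has not been moved yet; the sketch relies on the staged blocks sitting well above the destination, which is the same assumption described above.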
Ok. I need to get into that kernel tree and take a look. But it sounds similar to my kexec stuff, which I discuss at least part of the time on fastboot@osdl.org. It sounds compatible enough that we could productively merge implementations. That, plus the fact that my kexec stuff is still on Andrew Morton's todo list (ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/must-fix/should-fix-7.txt), means it has a fair shot of getting into the stock kernel.
On the practical side, I think I can boost its priority high enough after I get back to actually do something.
A recent version of the kexec patch is at: http://developer.osdl.org/rddunlap/kexec/
Kexec as it is currently structured is actually two system calls callable from user space.
sys_kexec_load() loads the kernel into a linked list of pages, making certain that when those pages are copied to their final destination nothing will be stomped. And it allocates a chunk of memory with kmalloc for the bit of code that copies the kernel to its final resting place. This can fail at any time and the system remains in a consistent state.
sys_reboot(LINUX_REBOOT_CMD_KEXEC) initiates the transfer to the new kernel.
The new kernel is started in physical mode.
sys_kexec_load() is passed an entry point to jump to, and an array of regions to load, each given as a physical destination address, a virtual process space address, and a length. This allows us to load arbitrary things.
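As a rough user-space sketch of that interface shape: the struct, its field names, and both functions below are hypothetical and are not the kexec patch's actual definitions. It also collapses the load and the later copy into one call, where kexec stages pages first and copies them at reboot time. What it does try to show is the property described above: every region is validated before anything is written, so a failed load leaves memory untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical region descriptor; not the kexec patch's real layout. */
struct load_region {
    size_t dest;      /* physical destination (an offset into phys[] here) */
    const void *src;  /* address of the data in the calling process */
    size_t len;       /* length in bytes */
};

/* Return 1 if every region fits in memory and no two destinations overlap. */
static int regions_ok(const struct load_region *r, size_t n, size_t mem_size)
{
    for (size_t i = 0; i < n; i++) {
        if (r[i].len > mem_size || r[i].dest > mem_size - r[i].len)
            return 0;
        for (size_t j = 0; j < i; j++) {
            if (r[i].dest < r[j].dest + r[j].len &&
                r[j].dest < r[i].dest + r[i].len)
                return 0;
        }
    }
    return 1;
}

/* Validate first, then copy. Returns 0 on success, -1 with phys untouched. */
static int load_regions(uint8_t *phys, size_t phys_size,
                        const struct load_region *r, size_t n)
{
    assert(phys != NULL);
    if (!regions_ok(r, n, phys_size))
        return -1;
    for (size_t i = 0; i < n; i++)
        memcpy(phys + r[i].dest, r[i].src, r[i].len);
    return 0;
}
```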
The only requirement is that you have enough memory for both kernels simultaneously. For truly high end machines there are some other restrictions because physical mode does not allow access to all of their memory but anyway...
From the descriptions, my kexec stuff is a little more general and a little more robust than your armboot, so I'd like to merge your stuff into mine if possible. Now that 2.6.0 is out I can start sending patches again.
- I deliberately avoid vmalloc. The vmalloc area is a limited resource, and I support loading large kernel and ramdisk combinations.
- I avoid greater than page size memory allocations, because random memory fragmentation can make them fail.
- I don't leak memory.
- I have an implementation that works on SMP machines.
The worst part is getting all of the drivers to shut themselves down properly. But I have the appropriate hooks, and it doesn't take an extension of the kernel API, just lots of driver bug fixes.
If even the ARM guys are up to using a kernel, it should be easier to make progress in this area. On several projects, the people I have talked to are too size constrained to even try using a kernel.
Eric