Eric wrote:
Take a look at the values in the segment list in elfboot.c and see if you can track down how it is getting corrupted. My hunch says it is most likely a mis-setup of your DRAM controller.
I've tracked the problem to the function get_bounce_buffer in src/lib/elfboot.c, it is returning 5f646e8 for some reason. I added some printk's to that function and get this:
zzz mem_entries = 3, lb_size=9b918
zzz 1: mstart=6a8, msize=9f958, mend=a0000, tbuffer=46e8
zzz 2: mstart=100000, msize=5f00000, mend=6000000, tbuffer=5f646e8
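The printed values are consistent with each candidate being placed at the very end of its memory range (tbuffer = mend - lb_size) and the highest candidate winning. The sketch below reproduces that arithmetic. It is inferred from the printk output only, not copied from elfboot.c, and the name pick_bounce_buffer is made up for illustration.

```python
# Values copied from the printk output (all hex).
LB_SIZE = 0x9b918
RANGES = [
    (0x6a8, 0x9f958),       # entry 1: mstart, msize
    (0x100000, 0x5f00000),  # entry 2: mstart, msize
]

def pick_bounce_buffer(ranges, lb_size):
    """Assumed logic: put the buffer at the end of each range that is
    large enough, and keep the highest resulting address."""
    buffer = 0
    for mstart, msize in ranges:
        if msize < lb_size:
            continue  # range too small to hold the buffer
        tbuffer = mstart + msize - lb_size  # mend - lb_size
        if tbuffer > buffer:
            buffer = tbuffer
    return buffer

print(hex(pick_bounce_buffer(RANGES, LB_SIZE)))  # 0x5f646e8
```

If that reading is right, 5f646e8 is simply the top of the second range minus lb_size, which is what "top of memory" placement would produce.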
Here is the debug output. I also changed printk_spew to printk_debug:
Found ELF candiate at offset 0
header_offset is 0
Try to load at offset 0x0
zzz mem_entries = 3, lb_size=9b918
zzz 1: mstart=6a8, msize=9f958, mend=a0000, tbuffer=46e8
zzz 2: mstart=100000, msize=5f00000, mend=6000000, tbuffer=5f646e8
n_type: 00000001 n_name(8): ELFBoot n_desc(10): Etherboot
n_type: 00000002 n_name(8): ELFBoot n_desc(6): 5.2.2
Loading Etherboot version: 5.2.2
Dropping non PT_LOAD segment
New segment addr 0x20000 size 0xefc0 offset 0xb0 filesize 0x5abc
(cleaned up) New segment addr 0x20000 size 0xefc0 offset 0xb0 filesize 0x5abc
lb: [0x0000000000004000, 0x0000000000051c8c)
segment: [0x0000000000020000, 0x0000000000025abc, 0x000000000002efc0)
buffer=5f646e8, seg->s_addr=20000, lb_start=4000
bounce: [0x0000000005f806e8, 0x0000000005f861a4, 0x0000000005f8f6a8)
Loading Segment: addr: 0x0000000005f806e8 memsz: 0x000000000000efc0 filesz: 0x0c
[ 0x0000000005f806e8, 0000000005f861a4, 0x0000000005f8f6a8) <- 00000000000000b0
Clearing Segment: addr: 0x0000000005f861a4 memsz: 0x0000000000009504
Loaded segments
verified segments
closed down stream
Jumping to boot code at 0x20000
ROM segment 0x0004 length 0x0000 reloc 0x00020000
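The "bounce:" line in this log can be checked by hand. The numbers fit a mapping where the segment keeps its offset from lb_start and is rebased onto the bounce buffer. A small sketch of that check follows. The formula is an assumption read off the logged values, and bounce_range is a hypothetical helper, not a function in elfboot.c.

```python
def bounce_range(buffer, lb_start, s_addr, filesz, memsz):
    """Assumed mapping: keep the segment's offset from lb_start and
    rebase it onto the bounce buffer.
    Returns (start, end of file data, end of segment)."""
    start = buffer + (s_addr - lb_start)
    return start, start + filesz, start + memsz

# Values from the 5.2.2 log: buffer=5f646e8, seg->s_addr=20000, lb_start=4000,
# filesize 0x5abc, size 0xefc0.
start, middle, end = bounce_range(0x5f646e8, 0x4000, 0x20000, 0x5abc, 0xefc0)
print(hex(start), hex(middle), hex(end))  # 0x5f806e8 0x5f861a4 0x5f8f6a8
print(hex(end - middle))                  # 0x9504, the "Clearing Segment" size
```

All three addresses and the cleared size match the log, so the load into the bounce buffer looks internally consistent at least up to this point.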
Here is the same debug output but with the 5.0.10 etherboot payload:
Found ELF candiate at offset 0
header_offset is 0
Try to load at offset 0x0
zzz mem_entries = 3, lb_size=9b918
zzz 1: mstart=6a8, msize=9f958, mend=a0000, tbuffer=46e8
zzz 2: mstart=100000, msize=5f00000, mend=6000000, tbuffer=5f646e8
New segment addr 0x94000 size 0x6f68 offset 0x60 filesize 0x3460
(cleaned up) New segment addr 0x94000 size 0x6f68 offset 0x60 filesize 0x3460
lb: [0x0000000000004000, 0x0000000000051c8c)
Loading Segment: addr: 0x0000000000094000 memsz: 0x0000000000006f68 filesz: 0x00
[ 0x0000000000094000, 0000000000097460, 0x000000000009af68) <- 0000000000000060
Clearing Segment: addr: 0x0000000000097460 memsz: 0x0000000000003b08
Loaded segments
verified segments
closed down stream
Jumping to boot code at 0x94000
ROM segment 0x0004 length 0x0000 reloc 0x9400
Etherboot 5.0.10 (GPL) ELF for [VIA 86C100]
The 5.2.2 elf file exercises more of the elfboot.c. The bounce buffer should be in valid memory, there is 128M total minus 32M for video, which now that I think about it is way more than we need. So 0x6000000 total memory.
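As a quick sanity check of that memory math, the sketch below confirms that 128M minus 32M is 0x6000000 and that a buffer of lb_size bytes at 5f646e8 ends exactly at that limit. It only checks the address arithmetic; it says nothing about whether the DRAM behind those addresses is actually working. The helper name buffer_fits is made up for illustration.

```python
MIB = 1024 * 1024

def buffer_fits(buffer, size, top):
    """True if [buffer, buffer + size) lies at or below top."""
    return buffer + size <= top

usable_top = (128 - 32) * MIB  # 128M total minus 32M for video
print(hex(usable_top))                               # 0x6000000
print(buffer_fits(0x5f646e8, 0x9b918, usable_top))   # True
print(hex(0x5f646e8 + 0x9b918))                      # 0x6000000
```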
I'll keep digging...
-Dave
Dave Ashley linuxbios@xdr.com writes:
Eric wrote:
Take a look at the values in the segment list in elfboot.c and see if you can track down how it is getting corrupted. My hunch says it is most likely a mis-setup of your DRAM controller.
I've tracked the problem to the function get_bounce_buffer in src/lib/elfboot.c, it is returning 5f646e8 for some reason. I added some printk's to that function and get this:
zzz mem_entries = 3, lb_size=9b918
zzz 1: mstart=6a8, msize=9f958, mend=a0000, tbuffer=46e8
zzz 2: mstart=100000, msize=5f00000, mend=6000000, tbuffer=5f646e8
Duh. Etherboot is wanting the same addresses that linuxBIOS is using, so I allocate a bounce buffer at the top of memory.
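The overlap can be seen directly from the logged ranges. The sketch below compares each payload's load range against the "lb:" range, assuming the bounce buffer is used whenever a segment intersects the region LinuxBIOS occupies. That trigger condition is an assumption based on the description here, and the overlaps helper is illustrative only.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open ranges [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end

LB = (0x4000, 0x51c8c)        # lb: range from the logs
SEG_522 = (0x20000, 0x2efc0)  # etherboot 5.2.2 segment
SEG_5010 = (0x94000, 0x9af68) # etherboot 5.0.10 segment

print(overlaps(*SEG_522, *LB))   # True: 5.2.2 lands inside the lb range
print(overlaps(*SEG_5010, *LB))  # False: 5.0.10 loads clear of it
```

That matches the logs: only the 5.2.2 load prints a "bounce:" line, while 5.0.10 is loaded straight to 0x94000.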
The 5.2.2 elf file exercises more of the elfboot.c. The bounce buffer should be in valid memory, there is 128M total minus 32M for video, which now that I think about it is way more than we need. So 0x6000000 total memory.
On my box that address does not trigger the bounce buffer code. But I know I tested it ages ago when I added it.
So either there is a bug in the bounce buffer code, or in the hand off to etherboot.
Easy things to try.
1) load the new etherboot with the old etherboot, over the network. That should confirm that the code actually works.
2) Play with RELOCADDR in etherboot so we don't trigger the bounce buffer case.
Eric