Hello all,
I could use some of your insight into the elfboot stage. Appended to this message are two printouts of the elfboot process for 128MB RAM (which comes to a halt), and for 64MB RAM (which succeeds). It is jmp_to_elf_entry() in
src/arch/i386/boot/boot.c
that is unable to complete the set task for 128MB RAM. I suspect that the problem lies in the value of 'bounce_buffer' that is determined in src/boot/elfboot.c, but I fail to identify why a large value is causing any problem at all. The informational "Candidate bounce_buffer" is my addition at the loop end in get_bounce_buffer().
Since FILO never starts with 128MB, I cannot verify that the linuxbios_table indeed gets the expected last entry
"convert_memmap: 0x00000000100000 0x00000007f00000 1",
which the debugging output for FILO for 64MB would suggest.
A slightly provocative question:
Is the elfboot process implicitly hardcoded to depend on no more than 64MB RAM? My setup makes sure that PAM#, RPS, and PGPOL are correct (to my understanding) for distribution of memory: 64MB, 64MB+64MB, 64MB+0MB+64MB, etc.
I will be much obliged for any pointers. They will all help me understand the code base better than I do at present.
Best regards,
Mats E Andersson
## 64MB + 64MB RAM ##
Moving GDT to 0x500...ok
Adjust low_table_end from 0x00000530 to 0x00001000
Adjust rom_table_end from 0x000f0400 to 0x00100000
Wrote coreboot table at: 00000530 - 00000734 checksum ed60
elfboot: Attempting to load payload.
rom_stream: 0xfffe0000 - 0xfffeffff
Found ELF candidate at offset 0
header_offset is 0
Try to load at offset 0x0
Candidate bounce_buffer: 0x0000054000.
Candidate bounce_buffer: 0x0007fb4000.
malloc Enter, size 32, free_mem_ptr 0002757c
malloc 0x0002757c
New segment addr 0x100000 size 0x36000 offset 0xc0 filesize 0xcb88
(cleaned up) New segment addr 0x100000 size 0x36000 offset 0xc0 filesize 0xcb88
lb: [0x0000000000004000, 0x000000000002a000)
malloc Enter, size 32, free_mem_ptr 0002759c
malloc 0x0002759c
New segment addr 0x136000 size 0x48 offset 0xcc60 filesize 0x48
(cleaned up) New segment addr 0x136000 size 0x48 offset 0xcc60 filesize 0x48
lb: [0x0000000000004000, 0x000000000002a000)
Dropping non PT_LOAD segment
Dropping non PT_LOAD segment
Loading Segment: addr: 0x0000000000100000 memsz: 0x0000000000036000 filesz: 0x000000000000cb88
[ 0x0000000000100000, 000000000010cb88, 0x0000000000136000) <- 00000000000000c0
Clearing Segment: addr: 0x000000000010cb88 memsz: 0x0000000000029478
Loading Segment: addr: 0x0000000000136000 memsz: 0x0000000000000048 filesz: 0x0000000000000048
[ 0x0000000000136000, 0000000000136048, 0x0000000000136048) <- 000000000000cc60
Loaded segments
verified segments
closed down stream
Jumping to boot code at 0x100078
entry = 0x00100078
lb_start = 0x00004000
lb_size = 0x00026000
buffer = 0x07fb4000
adjust = 0x07fd6000
elf_boot_notes = 0x0001f640
adjusted_boot_notes = 0x07ff5640
## The system is stuck here! ##
## Next case: 64MB ##
Moving GDT to 0x500...ok
Adjust low_table_end from 0x00000530 to 0x00001000
Adjust rom_table_end from 0x000f0400 to 0x00100000
Wrote coreboot table at: 00000530 - 00000734 checksum ab60
elfboot: Attempting to load payload.
rom_stream: 0xfffe0000 - 0xfffeffff
Found ELF candidate at offset 0
header_offset is 0
Try to load at offset 0x0
Candidate bounce_buffer: 0x0000054000.
Candidate bounce_buffer: 0x0003fb4000.
malloc Enter, size 32, free_mem_ptr 0002757c
malloc 0x0002757c
New segment addr 0x100000 size 0x36000 offset 0xc0 filesize 0xcb88
(cleaned up) New segment addr 0x100000 size 0x36000 offset 0xc0 filesize 0xcb88
lb: [0x0000000000004000, 0x000000000002a000)
malloc Enter, size 32, free_mem_ptr 0002759c
malloc 0x0002759c
New segment addr 0x136000 size 0x48 offset 0xcc60 filesize 0x48
(cleaned up) New segment addr 0x136000 size 0x48 offset 0xcc60 filesize 0x48
lb: [0x0000000000004000, 0x000000000002a000)
Dropping non PT_LOAD segment
Dropping non PT_LOAD segment
Loading Segment: addr: 0x0000000000100000 memsz: 0x0000000000036000 filesz: 0x000000000000cb88
[ 0x0000000000100000, 000000000010cb88, 0x0000000000136000) <- 00000000000000c0
Clearing Segment: addr: 0x000000000010cb88 memsz: 0x0000000000029478
Loading Segment: addr: 0x0000000000136000 memsz: 0x0000000000000048 filesz: 0x0000000000000048
[ 0x0000000000136000, 0000000000136048, 0x0000000000136048) <- 000000000000cc60
Loaded segments
verified segments
closed down stream
Jumping to boot code at 0x100078
entry = 0x00100078
lb_start = 0x00004000
lb_size = 0x00026000
buffer = 0x03fb4000
adjust = 0x03fd6000
elf_boot_notes = 0x0001f640
adjusted_boot_notes = 0x03ff5640
FILO version 0.5.6 (mats@asus) Thu Sep 4 22:51:36 CEST 2008
collect_linuxbios_info: Searching for LinuxBIOS tables...
find_lb_table: Found candidate at: 00000530
find_lb_table: header checksum o.k.
find_lb_table: table checksum o.k.
find_lb_table: record count o.k.
collect_linuxbios_info: Found LinuxBIOS table at: 00000530
convert_memmap: 0x00000000000000 0x00000000001000 16
convert_memmap: 0x00000000001000 0x0000000009f000 1
convert_memmap: 0x000000000c0000 0x00000000030000 1
convert_memmap: 0x000000000f0000 0x00000000010000 16
convert_memmap: 0x00000000100000 0x00000003f00000 1
Press <Enter> for default boot, or <Esc> for boot prompt... 2 1 timed out
boot: hda1:/vmlinuz root=/dev/hda1 console=tty0 console=ttyS0,115200 usr=flash
IDE time out
reset failed, but we may be on SATA
Drive 0 does not exist
boot: hda1:/vmlinuz root=/dev/hda1 console=tty0 console=ttyS0,115200 usr=flash
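The bounce-buffer numbers printed in the two logs follow the same arithmetic, and that can be checked by hand. The sketch below is inferred only from the printed values, not taken from elfboot.c: it assumes that a candidate buffer is the end of a memory region minus twice lb_size, and that adjust is buffer + lb_size - lb_start. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three values printed before the jump. */
struct bounce_layout {
    uint32_t buffer; /* "buffer" in the log */
    uint32_t adjust; /* "adjust" in the log */
    uint32_t notes;  /* "adjusted_boot_notes" in the log */
};

/* Assumption (inferred from the logs): the bounce buffer sits
 * 2 * lb_size below the end of the chosen memory region. */
struct bounce_layout layout_for(uint32_t region_end, uint32_t lb_start,
                                uint32_t lb_size, uint32_t boot_notes)
{
    struct bounce_layout l;

    assert(region_end >= 2 * lb_size);
    l.buffer = region_end - 2 * lb_size;
    l.adjust = l.buffer + lb_size - lb_start;
    l.notes = boot_notes + l.adjust;
    return l;
}
```

Calling layout_for(0x08000000, 0x4000, 0x26000, 0x1f640) yields buffer 0x07fb4000, adjust 0x07fd6000 and notes 0x07ff5640, which are the values of the 128MB log; a region end of 0x04000000 yields the 64MB values, and 0x000a0000 yields the low candidate 0x54000. Under these assumptions the 128MB buffer is computed exactly like the 64MB one; it merely lies above the 64MB boundary.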
Mats Erik Andersson wrote:
> Since FILO never starts with 128MB, I cannot verify that the linuxbios_table indeed gets the expected last entry
Are you sure your second 64M are working correctly?
This could well be a ram init problem.
Which board?
Hello Stefan, and all interested people,
Stefan Reinauer stepan@coresystems.de wrote:
> Are you sure your second 64M are working correctly?
> This could well be a ram init problem.
> Which board?
It is a port to msi/ms6147 of the code Uwe Hammer developed for msi/ms6119. Essentially, I have so far made two contributions:
1) the generic /src/sdram/generic_dump_spd.c has been tailored and moved to the mainboard source directory,
2) an almost complete spd-detection mechanism for sdram has been incorporated into raminit.c.
Presently, I
a) boot successfully with a single sided 64MB card in either DIMM-slot. "Success" meaning that I can get a Debian Sarge router to run with /usr on an IDE-CF disk, and the rest loaded into a ramdisk.
b) The three cases of a double sided 128MB card in either slot, or two single sided 64MB cards in both slots, are correctly detected, and the RAM is verified in 0x07ffff00 to 0x07fffff0 in all these cases. However, the execution halts after the jump into filo.elf. I once got an error message from malloc() inside Filo!
c) When populating with two double sided 32MB cards, even the jump into Coreboot is unsuccessful after attempted RAM initialisation. This probably depends on the timing parameters, which I have not yet implemented dynamically.
Thus, I am now trying to understand why a large amount of memory, i.e., 128MB instead of a mere 64MB, can prevent elfboot from succeeding, even though the memory is indisputably well initialised.
I will report in due time. Best regards,
Mats E Andersson
> Thus, I am now trying to understand why a large amount of memory, i.e., 128MB instead of a mere 64MB, can prevent elfboot from succeeding, even though the memory is indisputably well initialised.
Hmm, I still have a feeling your memory is not initializing correctly. Have you tried ram_check() (from auto.c) on various chunks of memory? I would try:
    ram_check(0, 640 * 1024);                 // first 640 KB (beginning of first side)
    ram_check(64512 * 1024, 65536 * 1024);    // 63 - 64MB (end of first side)
    ram_check(65536 * 1024, 66560 * 1024);    // 64 - 65MB (beginning of second side)
    ram_check(130048 * 1024, 131072 * 1024);  // 127 - 128MB (end of second side)
You may have to do these separately, one per build. Post back the results. Hope that helps.
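For readers unfamiliar with this kind of test, the idea behind a range check can be sketched in hosted C. This is not the coreboot ram_check() implementation: the function name is hypothetical, and an ordinary array stands in for physical memory. Each word receives its own byte offset as the test pattern, and a second pass counts the words that read back wrong.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hosted stand-in for a RAM range check.
 * `mem` plays the role of physical memory; `start` and `stop` are
 * byte offsets into it. Returns the number of mismatching words. */
size_t check_range(volatile uint32_t *mem, size_t start, size_t stop)
{
    size_t errors = 0;
    size_t off;

    assert(start % sizeof(uint32_t) == 0 && start <= stop);

    /* Pass 1: write each word's own byte offset as its pattern. */
    for (off = start; off < stop; off += sizeof(uint32_t))
        mem[off / sizeof(uint32_t)] = (uint32_t)off;

    /* Pass 2: read everything back and count mismatches. */
    for (off = start; off < stop; off += sizeof(uint32_t))
        if (mem[off / sizeof(uint32_t)] != (uint32_t)off)
            errors++;

    return errors;
}
```

Because the pattern depends on the address, two addresses that alias to the same cell would show up as mismatches, but only if both fall inside the checked range. That is one reason to check ranges on both sides of a row boundary rather than a single small window.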
Joseph Smith joe@settoplinux.org kindly wrote:
> Hmm, I still have a feeling your memory is not initializing correctly. Have you tried ram_check() (from auto.c) on various chunks of memory? I would try...
I did use these ram checks (last line was active for my last message):
    /* DOS-area */
    //ram_check(0, 640 * 1024);
    /* 1MB to 4MB */
    //ram_check(0x00100000, 0x00400000);
    /* Across 64MB boundary */
    //ram_check(0x03fff000, 0x04000010);
    /* Just below 128MB */
    ram_check(0x07ffff00, 0x07fffff0);
Now that I have prepared accurate SPD-dumps for seven SDRAM cards, I will return on Monday to perform more extended verifications. Possibly large memory chunks could tell me more than strategic sampling does. The bad thing is that the code space prevents more than one verification range in each build, so the pace is indeed slow.
Best regards
Mats E Andersson
On Sat, 06 Sep 2008 14:19:37 +0200, Mats Erik Andersson mats.andersson@gisladisker.se wrote:
> Now that I have prepared accurate SPD-dumps for seven SDRAM cards, I will return on Monday to perform more extended verifications. Possibly large memory chunks could tell me more than strategic sampling does. The bad thing is that the code space prevents more than one verification range in each build, so the pace is indeed slow.
Another thought... have you tried booting to memtest? If you can, memtest will absolutely be able to tell you what is going on here.
Hello Mats and others,
Like you, I am trying to get Coreboot-v2 up and running on a 440BX-based board (Abit AB-BM6), with more than the currently hardcoded 64 MB of RAM. I think I'm also running into the same problem.
My current setup involves 384 MB of RAM, with the appropriate changes to the hardcoded values in raminit.c (I haven't tried auto-detecting them yet, but I'm curious for your patch), and uses Coreinfo as the payload for now.
So long as I didn't change the values hardcoding the 64 MB limit, I could boot fine into the payload, even with all three DIMMs present. With the changes in raminit.c, elfboot/Coreinfo stops with an exception 6 (illegal opcode), as per the log below.
The curious thing is though, that when I first boot Linux with the original BIOS, flash the chip to Coreboot, and then do a warm reboot into my freshly burned Coreboot, it does succeed. Because of this, I'm guessing there's something that Coreboot neglects to initialize, but luckily also doesn't reset.
I've compared the dumps from "Northbridge following SDRAM init:" for the cold boot and the warm reboot from Linux, and there are indeed some differences. My prime suspect for now is the Memory Buffer Strength Control Register (0x69-0x6e), but I haven't had the chance to test that yet, and am not certain by a long shot that it's it. I'm including the dumps below, in case you can use them to better understand the problem with your own board.
With a bit of luck I can test my theory this evening, I'll keep you posted.
Kind regards, Tim.
*** Coreinfo crash:
elfboot: Attempting to load payload.
rom_stream: 0xfffe0000 - 0xfffeffff
Found ELF candidate at offset 0
header_offset is 0
Try to load at offset 0x0
malloc Enter, size 32, free_mem_ptr 00024ce4
malloc 0x00024ce4
New segment addr 0x100000 size 0x36310 offset 0x1000 filesize 0x9460
(cleaned up) New segment addr 0x100000 size 0x36310 offset 0x1000 filesize 0x9460
lb: [0x0000000000004000, 0x0000000000028000)
Dropping non PT_LOAD segment
Dropping non PT_LOAD segment
Loading Segment: addr: 0x0000000000100000 memsz: 0x0000000000036310 filesz: 0x0000000000009460
[ 0x0000000000100000, 0000000000109460, 0x0000000000136310) <- 0000000000001000
Clearing Segment: addr: 0x0000000000109460 memsz: 0x000000000002ceb0
Loaded segments
verified segments
closed down stream
Jumping to boot code at 0x101990
entry = 0x00101990
lb_start = 0x00004000
lb_size = 0x00024000
adjust = 0x17fd8000
buffer = 0x17fb8000
elf_boot_notes = 0x0001ed58
adjusted_boot_notes = 0x17ff6d58
Unexpected Exception: 6 @ 10:a54ea13a - Halting
Code: 0 eflags: 00010006
eax: 00023eb4 ebx: 17fd8000 ecx: 00000000 edx: 00024000
edi: 18000000 esi: 00028000 ebp: 00023ebc esp: 17ffbe66
*** Warm reboot from original BIOS and Linux (seems to work):
Northbridge following SDRAM init:
PCI: 00:00.00
00: 86 80 90 71 06 00 10 22 03 00 00 06 00 40 00 00
10: 08 00 00 e8 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 0c a0 00 ff 00 00 00 09 03 30 33 33 33 33 33 33
60: 08 10 18 20 30 30 30 30 00 03 28 f0 03 ca 00 00
70: 20 1f 0a 78 55 02 03 00 27 ff 10 38 00 00 00 00
80: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 98 88 00 00 04 61 00 00 00 05 00 00 00 00 00 00
a0: 02 00 10 00 03 02 00 1f 00 00 00 00 00 00 00 00
b0: 80 20 00 00 30 00 00 00 00 00 4d 17 20 10 00 00
c0: 00 00 00 00 00 00 00 00 18 0c 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 0c 00 00 00 00 00 00 00
e0: 4c ad ff bb 8a 3e 00 80 2c d3 f7 cf 9d 3e 00 00
f0: 40 01 00 00 00 f8 00 60 20 0f 00 00 00 00 00 00
*** cold reboot with Coreboot (crashes):
Northbridge following SDRAM init:
PCI: 00:00.00
00: 86 80 90 71 06 00 10 22 03 00 00 06 00 40 00 00
10: 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 0c a0 00 ff 00 00 00 09 03 30 33 33 33 33 33 33
60: 08 10 18 20 30 30 30 30 00 03 00 00 00 00 00 00
70: 00 1f 02 38 55 02 03 00 27 ff 10 38 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 80 00 00 00 04 61 00 00 00 05 00 00 00 00 00 00
a0: 02 00 10 00 03 02 00 1f 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 18 0c 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 0c 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 f8 00 00 20 0f 00 00 00 00 00 00
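Comparing two such dumps by eye is error-prone, so a small helper that lists the differing offsets may be useful. The sketch below is an illustration with hypothetical names; only the row at offset 0x60 of each dump above is included as data, and the same function works for the full 256 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Row 0x60-0x6f of the warm-reboot dump (copied from above). */
const uint8_t warm_row60[16] = {
    0x08, 0x10, 0x18, 0x20, 0x30, 0x30, 0x30, 0x30,
    0x00, 0x03, 0x28, 0xf0, 0x03, 0xca, 0x00, 0x00
};

/* Row 0x60-0x6f of the cold-boot dump (copied from above). */
const uint8_t cold_row60[16] = {
    0x08, 0x10, 0x18, 0x20, 0x30, 0x30, 0x30, 0x30,
    0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};

/* Hypothetical helper: store the register offsets at which the two
 * dumps differ in `out` (which must hold `len` entries) and return
 * how many were found. `base` is the offset of the first byte. */
size_t diff_offsets(const uint8_t *a, const uint8_t *b, size_t len,
                    uint8_t base, uint8_t *out)
{
    size_t n = 0;
    size_t i;

    assert(a != NULL && b != NULL && out != NULL);
    for (i = 0; i < len; i++)
        if (a[i] != b[i])
            out[n++] = (uint8_t)(base + i);
    return n;
}
```

For this row the helper reports four offsets, 0x6a through 0x6d, all of which lie inside the 0x69-0x6e range named above as the Memory Buffer Strength Control Register. The DRB bytes at 0x60-0x67 are identical in both dumps.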
On Wed, Sep 10, 2008 at 8:56 AM, Tim ter Laak timl@scintilla.utwente.nl wrote:
> I've compared the dumps from "Northbridge following SDRAM init:" for the cold boot and the warm reboot from Linux, and there are indeed some differences. My prime suspect for now is the Memory Buffer Strength Control Register (0x69-0x6e), but I haven't had the chance to test that yet, and am not certain by a long shot that it's it. I'm including the dumps below, in case you can use them to better understand the problem with your own board.
Can you also include lspci -xxx -s 0:0.0 from the stock bios?
Thanks, Corey
On Wed, 10 Sep 2008, Corey Osgood wrote:
> Can you also include lspci -xxx -s 0:0.0 from the stock bios?
> Thanks, Corey
The dump below is certainly from my Abit AB-BM6 system, and >90% probable with the same DIMMs as I'm currently using. I'll double check tonight when I have access to the system again.
Thanks, Tim.
00:00.0 Host bridge [0600]: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge [8086:7190] (rev 03)
00: 86 80 90 71 06 00 10 a2 03 00 00 06 00 40 00 00
10: 08 00 00 e8 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 0c aa 00 ef 00 00 00 09 03 10 11 11 00 00 00 00
60: 08 10 18 20 30 30 30 30 00 2f 28 f0 03 ca 00 00
70: 20 1f 0a 78 55 02 03 01 27 ff 10 38 00 00 00 00
80: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 98 88 00 00 04 61 00 00 00 05 00 00 00 00 00 00
a0: 02 00 10 00 03 02 00 1f 00 00 00 00 00 00 00 00
b0: 80 20 00 00 30 00 00 00 00 00 4d 17 20 10 00 00
c0: 00 00 00 00 00 00 00 00 18 0c 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 0c 00 00 00 00 00 00 00
e0: 4c ad ff bb 8a 3e 00 80 2c d3 f7 cf 9d 3e 00 00
f0: 40 01 00 00 00 f8 00 60 20 0f 00 00 00 00 00 00
On Wed, 10 Sep 2008, Tim ter Laak wrote:
> The dump below is certainly from my Abit AB-BM6 system, and >90% probable with the same DIMMs as I'm currently using. I'll double check tonight when I have access to the system again.
That dump was indeed the right one.
Unfortunately, setting the MBSC Registers to values that work in the warm reboot doesn't get me any further. I have noticed however that I don't always get exception 6, but also 13 (general protection error), or just a hang with no exception message. I'll continue to look further.
Tim.
Dear Tim ter Laak,
I certainly appreciate your company in our quest for knowledge!
> So long as I didn't change the values hardcoding the 64 MB limit, I could boot fine into the payload, even with all three DIMMs present. With the changes in raminit.c, elfboot/Coreinfo stops with an exception 6 (illegal opcode), as per the log below.
Here the higher memory devices were simply ignored. Thus no interference.
> My prime suspect for now is the Memory Buffer Strength Control Register (0x69-0x6e), but I haven't had the chance to test that yet, and am not certain by a long shot that it's it. I'm including the dumps below, in case you can use them to better understand the problem with your own board.
>
> Unfortunately, setting the MBSC Registers to values that work in the warm reboot doesn't get me any further. I have noticed however that I don't always get exception 6, but also 13 (general protection error), or just a hang with no exception message. I'll continue to look further.
My logging reveals results identical to yours: there is either a blind halt, exception 6 or 13, or, as of late, a reproducible exception -1.
At present I am experimenting with these settings:
    pci_write_config16(ctrl->d0, RPS, 0x0055); // Always 4kB page size.
    pci_write_config16(ctrl->d0, SDRAMC, 0x0103);
    /* Only devices with two banks, until tested to be four banks. */
    data = 0x00;
    if ( spd_read_byte(ctrl->channel0[0], 17) == 0x04 )
        data |= 0x03;
    if ( spd_read_byte(ctrl->channel0[1], 17) == 0x04 )
        data |= (0x03 << 2);
    /* The value (data << 8) produces correct bank flags for PGPOL. */
    pci_write_config16(ctrl->d0, PGPOL, 0x07 | (data << 8));
    pci_write_config32(ctrl->d0, MBSC, 0x0003c003);
Apart from the dynamic bank/row detection, these are the only settings that I am manipulating at present.
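The bank-flag computation in these settings can be tried out on a host without any hardware. The function below is a hypothetical stand-in that takes the two SPD byte 17 values (number of banks on each SDRAM device) as plain arguments instead of calling spd_read_byte(); the bit layout is copied from the snippet above and is not checked against the datasheet here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hosted version of the PGPOL computation above.
 * spd_banks0 and spd_banks1 are SPD byte 17 of the DIMMs in the
 * first and second slot. */
uint16_t pgpol_from_spd(uint8_t spd_banks0, uint8_t spd_banks1)
{
    uint8_t data = 0x00;

    /* Two flag bits per DIMM, set only for four-bank devices. */
    if (spd_banks0 == 0x04)
        data |= 0x03;
    if (spd_banks1 == 0x04)
        data |= (uint8_t)(0x03 << 2);

    /* Only the low four flag bits are ever set here. */
    assert((data >> 4) == 0);

    return (uint16_t)(0x07 | (data << 8));
}
```

Two four-bank DIMMs give 0x0f07, a four-bank DIMM in the first slot only gives 0x0307, in the second slot only 0x0c07, and two two-bank DIMMs give 0x0007.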
One remarkable observation is that with these and similar settings, two double-sided 128MB cards with identical SPD-data 0x00 to 0x3d produce disparate results: one of them verifies RAM from 1MB to 128MB--, whereas the other verifies only 1MB to 64MB--. Explanation anyone? They support latency CL3 and CL2. Another 128MB double-sided card which only supports CL3, is still verifying all 128MB of RAM.
I suspect that the burst length must be understood better, as well as the strobe signals, but I find my present references to be insufficient at the moment.
Best regards for now,
Mats E Andersson
Hello Tim,
I copy your original text to the mailing list, in order that you get proper credit for the solution I found, based on your comment.
> Hello Mats,
>
> Regarding the Coreboot issue, I was wondering whether you already had a look at the do_ram_command() function in src/northbridge/intel/i440bx/raminit.c?
>
> Frankly, the comment there saying "TODO: Support for multiple DIMMs." raises an alarm here. A cursory look at the current implementation suggests to me that the command has to be sent to each separate DIMM module, probably even each "side" for double-sided DIMMs. At a guess, I think the read32() has to be done to all of the starting addresses for each side, i.e. the addresses corresponding to the values stored in the DRB registers 0x60-0x67.
>
> It is unlikely I can start testing this before Saturday, so I'd like to know what you think of it and possibly save you some more of this goose chase.
>
> Kind regards, Tim.
Yes, I have shared your irritation concerning that remark in do_ram_command(), but I thought the read32-statement to be harmless without deeper sense.
However, I just found that it is instrumental. Starting from one of the code structures that I mentioned in my previous posting, I made sure that the end of do_ram_command() contains
    /* Read from (DIMM start address + addr_offset). */
    read32(0 + addr_offset); // FIXME
  + read32(0x04000000 + addr_offset);
The third and last line is the new addition, and it reflects the fact that I wanted to perform a test with a double sided 128MB DIMM card. Thus there is a read from both banks/rows after each ram-command.
The outcome is that Coreboot is indeed able to leave the elfboot stage, enter Filo, and finally get my Debian router going. For the sake of testing I made sure the router ran its complete file system in RAM, which means slightly over 70MB usage. It was a success!
As a next test run I used two double sided 128MB SDRAM cards. As expected the RAM memory verified on 1M -- 128MB--, Coreboot reported 256MB available, but elfboot got stuck before entering Filo. This fits very well with the idea that somehow (seemingly magically) the northbridge does not acknowledge memory banks 2 and 3 as alive, it only has access to banks 0 and 1. It would thus be necessary to incorporate also
  + read32(0x08000000 + addr_offset);
  + read32(0x0c000000 + addr_offset);
at the end of do_ram_command(). Case closed!
I will now take time (a few days) to beautify my changes to the file src/northbridge/intel/i440bx/raminit.c, in order that I may submit it to Coreboot. The lesson to be learned here, is that Uwe Hammer did good work in the original implementation, but that naked TODO/FIXME are not enough to pinpoint the problems or shortcomings of temporary code chunks. Any fellow developer is better off with some explicit remark that helps to indicate some presumption behind a particular piece of code.
Many thanks for your help Tim! Best regards for now,
Mats E Andersson
On Thu, 11 Sep 2008 15:53:30 +0200, Mats Erik Andersson mats.andersson@gisladisker.se wrote:
> As a next test run I used two double sided 128MB SDRAM cards. As expected the RAM memory verified on 1M -- 128MB--, Coreboot reported 256MB available, but elfboot got stuck before entering Filo. This fits very well with the idea that somehow (seemingly magically) the northbridge does not acknowledge memory banks 2 and 3 as alive, it only has access to banks 0 and 1. It would thus be necessary to incorporate also
>
>   + read32(0x08000000 + addr_offset);
>   + read32(0x0c000000 + addr_offset);
>
> at the end of do_ram_command(). Case closed!
That's what I thought. That read32() is what initializes each side/bank/row of memory. I was having a similar problem with the i830 and this is what I came up with:
    /* Send the ram command to each row of memory.
     * (DIMM_SOCKETS * 2) is the maximum number of rows possible.
     * Note: Each DRB defines the upper boundary address of
     * each SDRAM row in 32-MB granularity.
     */
    dimm_start = 0;

    for (i = 0; i < (DIMM_SOCKETS * 2); i++) {
        dimm_end = pci_read_config8(ctrl->d0, DRB + i);
        if (dimm_end > dimm_start) {
            PRINT_DEBUG("    Sending RAM command 0x");
            PRINT_DEBUG_HEX32(reg32);
            PRINT_DEBUG(" to 0x");
            PRINT_DEBUG_HEX32((dimm_start * 32 * 1024 * 1024) + addr_offset);
            PRINT_DEBUG("\r\n");
            read32((dimm_start * 32 * 1024 * 1024) + addr_offset);
        }
        /* Set the start of the next DIMM. */
        dimm_start = dimm_end;
    }
And it works beautifully. Maybe you would want to implement something like this? Of course you would have to make some minor adjustments.
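As a rough idea of what those adjustments might look like for the 440BX, the row walk can be tried on a host with the DRB values from the dumps earlier in this thread. This is a sketch under two assumptions: there are eight DRB registers (0x60-0x67, as mentioned above), and each unit is 8 MB, which is inferred from the 384 MB dump where the last DRB reads 0x30. The function and macro names are hypothetical, and the function only computes the addresses; it issues no read32().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DRB_COUNT 8                              /* DRB0..DRB7 at 0x60-0x67 */
#define DRB_GRANULARITY (8u * 1024u * 1024u)     /* assumed: 8 MB per unit */

/* Hypothetical helper: from the eight DRB values, compute the start
 * address of every populated row. Returns the number of rows found;
 * `starts` must hold DRB_COUNT entries. */
size_t row_start_addresses(const uint8_t *drb, uint32_t *starts)
{
    size_t n = 0;
    size_t i;
    uint8_t row_start = 0;

    assert(drb != NULL && starts != NULL);
    for (i = 0; i < DRB_COUNT; i++) {
        uint8_t row_end = drb[i];

        /* A row is populated if its upper boundary moved up. */
        if (row_end > row_start)
            starts[n++] = (uint32_t)row_start * DRB_GRANULARITY;

        /* Set the start of the next row. */
        row_start = row_end;
    }
    return n;
}
```

With the DRB bytes 08 10 18 20 30 30 30 30 from the 384 MB dump this yields five rows starting at 0, 0x04000000, 0x08000000, 0x0c000000 and 0x10000000. The first four are the addresses that were added by hand to do_ram_command() above.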