Below is the end of the output. I have no idea what coreboot_ram.map should look like, so I've attached that file.
before scan bus of PCI: 00:18.0 In scan bus of PCI: 00:18.0 scan_bus() = 0010131c amdk8_scan_chains: PCI: 00:18.0, node 0 amdk8_scan_chains: PCI: 00:18.0, node 0, link 0 amdk8_scan_chain: PCI: 00:18.0, node 0, link 0 connected Init Complete non coherent scan chain
On Wed, Oct 21, 2009 at 11:37 AM, Hugh Greenberg hng@lanl.gov wrote:
Below is the end of the output. I have no idea what coreboot_ram.map should look like, so I've attached that file.
Thanks. That was just in case amdk8_scan_chains wasn't being called, so we could see why. 0010131c matches the value for amdk8_scan_chains, and it got called, so we're good.
before scan bus of PCI: 00:18.0 In scan bus of PCI: 00:18.0 scan_bus() = 0010131c amdk8_scan_chains: PCI: 00:18.0, node 0 amdk8_scan_chains: PCI: 00:18.0, node 0, link 0 amdk8_scan_chain: PCI: 00:18.0, node 0, link 0 connected Init Complete non coherent scan chain
Here's the next step. I took out a few of the print statements that we got past.
Thanks, Myles
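For anyone following along, the address check above can be sketched in a few lines. This is only an illustration: it assumes coreboot_ram.map holds nm-style lines of the form "address type symbol", and apart from the amdk8_scan_chains address printed in the log, the sample entry and the helper names are hypothetical.

```python
def parse_map(text):
    """Return {address: symbol} from nm-style lines: '<hex addr> <type> <symbol>'.

    The three-column layout is an assumption about the map file format.
    """
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not a three-column entry
        addr, _kind, name = parts
        try:
            symbols[int(addr, 16)] = name
        except ValueError:
            continue  # first column was not a hex address
    return symbols


def symbol_at(symbols, printed):
    """Look up the symbol for a hex value as printed by a debug statement."""
    return symbols.get(int(printed, 16))


# Only the second entry comes from the log; the first is a made-up neighbour.
SAMPLE_MAP = """\
00101200 T hypothetical_other_function
0010131c t amdk8_scan_chains
"""

symbols = parse_map(SAMPLE_MAP)
print(symbol_at(symbols, "0010131c"))
```

If the printed scan_bus() value resolved to some other symbol, or to nothing, that would suggest the wrong device operations were attached.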
Here is the new output:
before scan bus of PCI: 00:18.0 amdk8_scan_chains: PCI: 00:18.0, node 0 amdk8_scan_chains: PCI: 00:18.0, node 0, link 0 amdk8_scan_chain: PCI: 00:18.0, node 0, link 0 connected Init Complete non coherent scan chain ht_collapse_early_enumeration: PCI: 00:18.0 children PCI: 00:18.0 offset_unitid0 ht_collapse_early_enumeration: ctrl = 20
On Wed, Oct 21, 2009 at 12:23 PM, Hugh Greenberg hng@lanl.gov wrote:
Here is the new output:
before scan bus of PCI: 00:18.0 amdk8_scan_chains: PCI: 00:18.0, node 0 amdk8_scan_chains: PCI: 00:18.0, node 0, link 0 amdk8_scan_chain: PCI: 00:18.0, node 0, link 0 connected Init Complete non coherent scan chain ht_collapse_early_enumeration: PCI: 00:18.0 children PCI: 00:18.0 offset_unitid0 ht_collapse_early_enumeration: ctrl = 20
And here's the next.
Thanks, Myles
Here is the output:
before scan bus of PCI: 00:18.0 amdk8_scan_chains: PCI: 00:18.0, node 0 amdk8_scan_chains: PCI: 00:18.0, node 0, link 0 amdk8_scan_chain: PCI: 00:18.0, node 0, link 0 connected Init Complete non coherent scan chain ht_collapse_early_enumeration: PCI: 00:18.0 children PCI: 00:18.0 offset_unitid0 ht_collapse_early_enumeration: ctrl = 20 Check collapse state Check collapse state pci_read pci_remember_direct: pci_bus_fallback_ops=00000000 pci_check_direct
On Wed, Oct 21, 2009 at 2:21 PM, Hugh Greenberg hng@lanl.gov wrote:
Here is the output:
before scan bus of PCI: 00:18.0 amdk8_scan_chains: PCI: 00:18.0, node 0 amdk8_scan_chains: PCI: 00:18.0, node 0, link 0 amdk8_scan_chain: PCI: 00:18.0, node 0, link 0 connected Init Complete non coherent scan chain ht_collapse_early_enumeration: PCI: 00:18.0 children PCI: 00:18.0 offset_unitid0 ht_collapse_early_enumeration: ctrl = 20 Check collapse state Check collapse state pci_read pci_remember_direct: pci_bus_fallback_ops=00000000 pci_check_direct
It looks like it's dying in pci_check_direct. Hopefully we're getting close. Here's another patch.
Thanks, Myles
Alright! It booted and loaded seabios. Attached is the entire output. One thing though, it failed to load gpxe. The only thing related to this that looks like an error is:
File pci14e4,16a6.rom is of type 63000000 instead oftype 30 and this PCI Expansion ROM, signature 0x0000, INIT size 0x0000, data ptr 0x0000 Incorrect Expansion ROM Header Signature 0000
I got this file from gpxe's rom O matic. I tried getting a new one just in case and got the same error.
On Wed, Oct 21, 2009 at 3:03 PM, Hugh Greenberg hng@lanl.gov wrote:
Alright! It booted and loaded seabios. Attached is the entire output.
Great. So the two problems were init_timer and pci_check_direct, right?
I've attached the two patches.
Signed-off-by: Myles Watson mylesgw@gmail.com
One thing though, it failed to load gpxe. The only thing related to this that looks like an error is:
File pci14e4,16a6.rom is of type 63000000 instead oftype 30 and this PCI Expansion ROM, signature 0x0000, INIT size 0x0000, data ptr 0x0000 Incorrect Expansion ROM Header Signature 0000
This should be OK because you don't want it to run before Coreboot is through. I think SeaBIOS should run it, right?
I don't know anything about gpxe. I'm assuming you followed the instructions here: http://www.coreboot.org/SeaBIOS#Adding_gpxe_support
Did you try pushing f12 to see the boot menu?
Thanks, Myles
Yeah, those two patches solved the problem. Thank you very much for your help with this. I followed those directions, but still no luck. I'll keep working at it.
While I was just trying to get seabios to boot gpxe, coreboot hung at the same spot. It seems to happen randomly now, not every time it boots like before.
Too bad.
I guess put all the debugging back in and see if it still hangs randomly, and if it is exactly the same spot.
Thanks, Myles
I'm having a hard time reproducing it now. I'll post something when/if I have the output with the debug info.
It froze again. Here is some of the output:
SMBus controller enabled Ram1.00 setting up CPU00 northbridge registers done. Ram1.01 setting up CPU01 northbridge registers done. Ram2.00 Enabling dual channel memory Registered 166Mhz RAM end at 0x00100000 kB Lower RAM end at 0x00100000 kB Ram2.01 Enabling dual channel memory Registered 166Mhz RAM end at 0x00200000 kB Lower RAM end at 0x00200000 kB Ram3 Before starting clocks: Before memreset: cpu is pre_c0 after first udelay
RAM end at 0x00200000 kB Lower RAM end at 0x00200000 kB Ram3 Before starting clocks: Before memreset: cpu is pre_c0 after first udelay
OK. So the timer worked for the first udelay...
Does it only freeze when you have both CPUs enabled? Have you tried it with the no_smp patch again? I'm grasping at straws.
Thanks, Myles
On Thu, Oct 22, 2009 at 09:16:14AM -0600, Myles Watson wrote:
RAM end at 0x00200000 kB Lower RAM end at 0x00200000 kB Ram3 Before starting clocks: Before memreset: cpu is pre_c0 after first udelay
OK. So the timer worked for the first udelay...
Does it only freeze when you have both CPUs enabled? Have you tried it with the no_smp patch again? I'm grasping at straws.
This is starting to sound like all the weirdness I was seeing when working on the h8dmr fam10 port a few months ago.
Are you sure it hangs? I thought so at first as well, but it turned out that things were running extremely slowly when compiling with gcc 4.3 (32 bit). If I waited 5 minutes or so eventually the board would boot.
Can you reproduce a hang when changing CONSOLE_LOGLEVEL? In my case the board would just hang if I lowered the default loglevel to something less than 8.
I never did figure out what was going on there. Ron thought perhaps there was a cache issue. I put a file in the tree with the issues I ran into:
src/mainboard/supermicro/h8dmr_fam10/README
I haven't been able to revisit yet as that particular box is in production.
What toolchain are you using?
Thanks, Ward.
I was using gcc 4.3. Without the patches, I still see a problem even with gcc 3.4. With the patches and gcc 3.4, things seem to be better so far. Thanks.
Hugh Greenberg wrote:
File pci14e4,16a6.rom is of type 63000000 instead oftype 30 and this PCI Expansion ROM, signature 0x0000, INIT size 0x0000, data ptr 0x0000 Incorrect Expansion ROM Header Signature 0000
I got this file from gpxe's rom O matic. I tried getting a new one just in case and got the same error.
IIRC this is a known problem (to us), but there hasn't been much effort in getting it fixed in GPXE. There is a script in the SeaBIOS source tree which can fix up checksums; maybe you can try using that, and if it works, send a hint to GPXE?
//Peter
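A quick way to sanity-check an option ROM image before adding it to CBFS is sketched below. It is not the SeaBIOS script mentioned above. It relies only on the standard legacy layout (0x55 0xAA signature, length in 512-byte units at offset 2, a 16-bit pointer at offset 0x18 to the "PCIR" structure, and bytes summing to zero modulo 256); the function names and the demo image are made up for illustration.

```python
import struct


def check_option_rom(image):
    """Return a list of problems found in a legacy PCI option ROM image."""
    if len(image) < 0x1A:
        return ["image too short for a ROM header"]
    if image[0:2] != b"\x55\xaa":
        return ["bad header signature (expected 55 AA)"]
    problems = []
    size = image[2] * 512  # image length is stored in 512-byte units
    if size == 0 or size > len(image):
        problems.append("bad image size field")
    elif sum(image[:size]) % 256 != 0:
        problems.append("bad checksum")
    (pcir,) = struct.unpack_from("<H", image, 0x18)
    if image[pcir:pcir + 4] != b"PCIR":
        problems.append("missing PCIR data structure")
    return problems


def build_demo_rom(vendor, device):
    """Build a minimal 512-byte image that passes the checks (demo only)."""
    rom = bytearray(512)
    rom[0:2] = b"\x55\xaa"
    rom[2] = 1                                   # one 512-byte block
    struct.pack_into("<H", rom, 0x18, 0x1C)      # pointer to PCIR structure
    rom[0x1C:0x20] = b"PCIR"
    struct.pack_into("<HH", rom, 0x20, vendor, device)
    rom[-1] = (-sum(rom[:-1])) % 256             # make the byte sum zero mod 256
    return bytes(rom)


good = build_demo_rom(0x14E4, 0x16A6)
print(check_option_rom(good))
print(check_option_rom(bytes(512)))  # all zeros, like "signature 0x0000"
```

To check a real file, read it with open(path, "rb").read() and pass the bytes to check_option_rom.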
The script didn't help. Seabios just doesn't see the rom. Is this a seabios problem? I didn't have this problem for the tyan s2881. Here is the output again:
PCI Expansion ROM, signature 0x0000, INIT size 0x0000, data ptr 0x0000 Incorrect Expansion ROM Header Signature 0000 Devices initialized Show all devs...After init. Root Device: enabled 1, 0 resources APIC_CLUSTER: 0: enabled 1, 0 resources APIC: 00: enabled 1, 0 resources PCI_DOMAIN: 0000: enabled 1, 5 resources PCI: 00:18.0: enabled 1, 4 resources PCI: 01:01.0: enabled 1, 3 resources PCI: 01:01.1: enabled 1, 1 resources PCI: 01:02.0: enabled 0, 0 resources PCI: 01:02.1: enabled 1, 1 resources PCI: 01:03.0: enabled 1, 3 resources PCI: 04:00.0: enabled 1, 1 resources PCI: 04:00.1: enabled 1, 1 resources PCI: 04:00.2: enabled 0, 0 resources PCI: 04:01.0: enabled 0, 0 resources PCI: 04:06.0: enabled 1, 4 resources PCI: 01:04.0: enabled 1, 3 resources PNP: 002e.0: enabled 0, 3 resources PNP: 002e.1: enabled 0, 3 resources PNP: 002e.2: enabled 0, 4 resources PNP: 002e.3: enabled 1, 2 resources PNP: 002e.4: enabled 0, 2 resources PNP: 002e.5: enabled 0, 1 resources PNP: 002e.6: enabled 1, 3 resources PNP: 002e.7: enabled 0, 2 resources PNP: 002e.8: enabled 0, 2 resources PNP: 002e.9: enabled 0, 2 resources PNP: 002e.a: enabled 0, 2 resources PCI: 01:04.1: enabled 1, 1 resources PCI: 01:04.2: enabled 1, 1 resources PCI: 01:04.3: enabled 1, 1 resources I2C: 01:70: enabled 1, 0 resources I2C: 00:2c: enabled 1, 0 resources I2C: 01:50: enabled 1, 0 resources I2C: 01:51: enabled 1, 0 resources I2C: 01:52: enabled 1, 0 resources I2C: 01:53: enabled 1, 0 resources I2C: 01:54: enabled 1, 0 resources I2C: 01:55: enabled 1, 0 resources I2C: 01:56: enabled 1, 0 resources I2C: 01:57: enabled 1, 0 resources PCI: 01:04.5: enabled 0, 0 resources PCI: 01:04.6: enabled 1, 2 resources PCI: 00:18.1: enabled 1, 0 resources PCI: 00:18.2: enabled 1, 0 resources PCI: 00:18.3: enabled 1, 1 resources PCI: 00:19.0: enabled 1, 0 resources PCI: 00:19.1: enabled 1, 0 resources PCI: 00:19.2: enabled 1, 0 resources PCI: 00:19.3: enabled 1, 0 resources APIC: 01: enabled 1, 0 resources PCI: 02:03.0: 
enabled 1, 2 resources PCI: 02:04.0: enabled 1, 2 resources Initializing CBMEM area to 0x3fff0000 (65536 bytes) Adding CBMEM entry as no. 1 Moving GDT to 3fff0200...ok High Tables Base is 3fff0000. Copying Interrupt Routing Table to 0x000f0000... done. Adding CBMEM entry as no. 2 Copying Interrupt Routing Table to 0x3fff0400... done. PIRQ table: 176 bytes. Looking for bad PCIX MHz input Looking for bad Hot Swap Enable OK 133MHz & Hot Swap is off Wrote the mp table end at: 000f0410 - 000f0614 Adding CBMEM entry as no. 3 Looking for bad PCIX MHz input Looking for bad Hot Swap Enable OK 133MHz & Hot Swap is off Wrote the mp table end at: 3fff1410 - 3fff1614 MP table: 532 bytes. Adding CBMEM entry as no. 4 Writing high table forward entry at 0x00000500 Wrote coreboot table at: 00000500 - 00000518 checksum 9bdf New low_table_end: 0x00000518 Now going to write high coreboot table at 0x3fff2400 rom_table_end = 0x3fff2400 Adjust low_table_end from 0x00000518 to 0x00001000 Adjust rom_table_end from 0x3fff2400 to 0x40000000 Adding high table area Wrote coreboot table at: 3fff2400 - 3fff2ca4 checksum b5b9 coreboot table: 2212 bytes. 0. FREE SPACE 3fff4400 0000bc00 1. GDT 3fff0200 00000200 2. IRQ TABLE 3fff0400 00001000 3. SMP TABLE 3fff1400 00001000 4. 
COREBOOT 3fff2400 00002000 Check CBFS header at fffeffe0 magic is 4f524243 Found CBFS header at fffeffe0 Check fallback/coreboot_ram CBFS: follow chain: fff80000 + 38 + a590 + align -> fff8a600 Check fallback/payload Got a payload Loading segment from rom address 0xfff8a638 data (compression=1) malloc Enter, size 36, free_mem_ptr 00160ce4 malloc 00160ce4 New segment dstaddr 0xf0000 memsize 0x10000 srcaddr 0xfff8a670 filesize 0x7350 (cleaned up) New segment addr 0xf0000 size 0x10000 offset 0xfff8a670 filesize0 Loading segment from rom address 0xfff8a654 Entry Point 0x000fc4f3 Loading Segment: addr: 0x00000000000f0000 memsz: 0x0000000000010000 filesz: 0x00 lb: [0x0000000000100000, 0x0000000000164000) Post relocation: addr: 0x00000000000f0000 memsz: 0x0000000000010000 filesz: 0x00 using LZMA [ 0x00000000000f0000, 0000000000100000, 0x0000000000100000) <- 00000000fff8a670 dest f0000, end 100000, bouncebuffer 7ff38000 Loaded segments Jumping to boot code at fc4f3 entry = 0x000fc4f3 lb_start = 0x00100000 lb_size = 0x00064000 adjust = 0x7fe9c000 buffer = 0x7ff38000 elf_boot_notes = 0x00113a20 adjusted_boot_notes = 0x7ffafa20 Start bios (version 0.4.2-20090908_202836-morn.localdomain) CPU Mhz=1804 Found mainboard Arima HDAMA Found CBFS header at 0xfffeffe0 Ram Size=0x80000000 Found 2 cpu(s) Scan for VGA option rom Got ps2 nak (status=51); continuing ps2_recvbyte timeout Found 0 lpt ports Found 1 serial ports Copying PIR from 0x3fff0400 to 0x000fdc50 Copying MPTABLE from 0x3fff1400/3fff1410 to 0x000fda30 SMBIOS ptr=0x000fda10 table=0x7ffff800
Scan for option roms Press F12 for boot menu.
Returned 61440 bytes of ZoneHigh e820 map has 7 items: 0: 0000000000000000 - 000000000009f400 = 1 1: 000000000009f400 - 00000000000a0000 = 2 2: 00000000000f0000 - 0000000000100000 = 2 3: 0000000000100000 - 000000003fff0000 = 1 4: 000000003fff0000 - 0000000040000000 = 2 5: 0000000040000000 - 000000007ffff000 = 1 6: 000000007ffff000 - 0000000080000000 = 2 enter handle_19: NULL Booting from Floppy... fail handle_legacy_disk:845(1): a=00000201 b=00000000 c=00000001 d=00000000 ds=0000 es=07c0 ss=0000 si=00000000 di=00000000 bp=00000000 sp=00007b18 cs=f000 ip=e82d f=0002 Boot failed: could not read the boot disk
enter handle_18: NULL Booting from CD-Rom... Boot failed: Could not read from CDROM (code 0001) enter handle_18: NULL Booting from Hard Disk... fail handle_legacy_disk:845(1): a=00000201 b=00000000 c=00000001 d=00000080 ds=0000 es=07c0 ss=0000 si=00000000 di=00000000 bp=00000000 sp=00007b18 cs=f000 ip=e82d f=0002 Boot failed: could not read the boot disk
enter handle_18: NULL Booting from CBFS... enter handle_18: NULL No bootable device.
On Thu, Oct 29, 2009 at 12:21 PM, Hugh Greenberg hng@lanl.gov wrote:
The script didn't help. Seabios just doesn't see the rom. Is this a seabios problem? I didn't have this problem for the tyan s2881. Here is the output again:
Found mainboard Arima HDAMA Found CBFS header at 0xfffeffe0 Ram Size=0x80000000 Found 2 cpu(s) Scan for VGA option rom
Here's a snippet from mine:
Scan for VGA option rom
Attempting to init PCI bdf 02:00.0 (dev/ven 014110de)
Searching CBFS for prefix pci10de,0141.rom
Found CBFS file fallback/coreboot_ram
Found CBFS file fallback/payload
Found CBFS file
... Much later ...
Scan for option roms
Attempting to init PCI bdf 00:00.0 (dev/ven 005e10de)
Searching CBFS for prefix pci10de,005e.rom
Found CBFS file fallback/coreboot_ram
Found CBFS file fallback/payload
Found CBFS file
Attempting to map option rom on dev 00:00.0
Option rom sizing returned 0 0
Attempting to init PCI bdf 00:01.0 (dev/ven 005110de)
Searching CBFS for prefix pci10de,0051.rom
Found CBFS file fallback/coreboot_ram
Found CBFS file fallback/payload
Found CBFS file
Is your debug level >= 6 for SeaBIOS? It looks like it is.
Do you have CONFIG_OPTION_ROMS_DEPLOYED set to 0?
Thanks, Myles
I was using the pre-built one mentioned in the wiki: http://www.coreboot.org/SeaBIOS . I don't know what the debug option is set to. This worked for me for the tyan s2881, so I assume that the correct options are set. I tried compiling my own and I had that option set along with the others mentioned in the wiki, but I also had the same problem. I can try compiling my own again and making sure the debug option is set to 6.
Here is the relevant output with the debug option set to 6:
Searching CBFS for prefix floppyimg/ Found CBFS file fallback/coreboot_ram Found CBFS file fallback/payload Found CBFS file pci14e4,16a6.rom Found CBFS file Scan for option roms Attempting to init PCI bdf 00:18.0 (dev/ven 11001022) Searching CBFS for prefix pci1022,1100.rom Found CBFS file fallback/coreboot_ram Found CBFS file fallback/payload Found CBFS file pci14e4,16a6.rom Found CBFS file Attempting to map option rom on dev 00:18.0 Option rom sizing returned 0 0 Attempting to init PCI bdf 00:18.1 (dev/ven 11011022) Searching CBFS for prefix pci1022,1101.rom Found CBFS file fallback/coreboot_ram Found CBFS file fallback/payload Found CBFS file pci14e4,16a6.rom Found CBFS file Attempting to map option rom on dev 00:18.1 Option rom sizing returned 0 0 Attempting to init PCI bdf 00:18.2 (dev/ven 11021022) Searching CBFS for prefix pci1022,1102.rom Found CBFS file fallback/coreboot_ram Found CBFS file fallback/payload Found CBFS file pci14e4,16a6.rom Found CBFS file Attempting to map option rom on dev 00:18.2 Option rom sizing returned 0 0 Attempting to init PCI bdf 00:18.3 (dev/ven 11031022) Searching CBFS for prefix pci1022,1103.rom Found CBFS file fallback/coreboot_ram Found CBFS file fallback/payload Found CBFS file pci14e4,16a6.rom Found CBFS file Attempting to map option rom on dev 00:18.3 Option rom sizing returned 0 0 Attempting to init PCI bdf 00:19.0 (dev/ven 11001022) Searching CBFS for prefix pci1022,1100.rom Found CBFS file fallback/coreboot_ram Found CBFS file fallback/payload Found CBFS file pci14e4,16a6.rom Found CBFS file Attempting to map option rom on dev 00:19.0 Option rom sizing returned 0 0 Attempting to init PCI bdf 00:19.1 (dev/ven 11011022) Searching CBFS for prefix pci1022,1101.rom Found CBFS file fallback/coreboot_ram Found CBFS file fallback/payload Found CBFS file pci14e4,16a6.rom
Los Alamos National Laboratory, CCS-1 Email: hng@lanl.gov Phone: (505) 665-6471
Found CBFS file Attempting to map option rom on dev 00:19.1 Option rom sizing returned 0 0 Attempting to init PCI bdf 00:19.2 (dev/ven 11021022) Searching CBFS for prefix pci1022,1102.rom Found CBFS file fallback/coreboot_ram Found CBFS file fallback/payload Found CBFS file pci14e4,16a6.rom Found CBFS file Attempting to map option rom on dev 00:19.2 Option rom sizing returned 0 0 Attempting to init PCI bdf 00:19.3 (dev/ven 11031022) Searching CBFS for prefix pci1022,1103.rom Found CBFS file fallback/coreboot_ram Found CBFS file fallback/payload Found CBFS file pci14e4,16a6.rom Found CBFS file Attempting to map option rom on dev 00:19.3 Option rom sizing returned 0 0 Searching CBFS for prefix genroms/ Found CBFS file fallback/coreboot_ram Found CBFS file fallback/payload Found CBFS file pci14e4,16a6.rom Found CBFS file Press F12 for boot menu.
finalize PMM malloc finalize zone 0: 00007c00-00090000 used=0 (0%) zone 1: 000a0000-000a0000 used=0 (0%) zone 2: 000fd440-000fdc40 used=752 (36%) zone 3: 40000000-7fff0000 used=128 (0%) zone 4: 7fff0000-80000000 used=2048 (3%) Returned 61440 bytes of ZoneHigh e820 map has 7 items: 0: 0000000000000000 - 000000000009f400 = 1 1: 000000000009f400 - 00000000000a0000 = 2 2: 00000000000f0000 - 0000000000100000 = 2 3: 0000000000100000 - 000000003fff0000 = 1 4: 000000003fff0000 - 0000000040000000 = 2 5: 0000000040000000 - 000000007ffff000 = 1 6: 000000007ffff000 - 0000000080000000 = 2 Jump to int19 enter handle_19: NULL Booting from Floppy... fail handle_legacy_disk:801(1): a=00000201 b=00000000 c=00000001 d=00000000 ds=0000 es=07c0 ss=0000 si=00000000 di=00000000 bp=00000000 sp=00007b00 cs=f000 ip=efc4 f=0202 Boot failed: could not read the boot disk
enter handle_18: NULL Booting from CD-Rom... Boot failed: Could not read from CDROM (code 0001) enter handle_18: NULL Booting from Hard Disk... fail handle_legacy_disk:801(1): a=00000201 b=00000000 c=00000001 d=00000080 ds=0000 es=07c0 ss=0000 si=00000000 di=00000000 bp=00000000 sp=00007b00 cs=f000 ip=efc4 f=0202 Boot failed: could not read the boot disk
enter handle_18: NULL Booting from CBFS... Searching CBFS for prefix img/ Found CBFS file fallback/coreboot_ram Found CBFS file fallback/payload Found CBFS file pci14e4,16a6.rom Found CBFS file enter handle_18: NULL No bootable device.
It seems, though, that seabios does not detect my network cards. I then tried adding the rom as pci1022,1101.rom; gpxe loads, but it doesn't detect my network devices. Any suggestions? Thanks.
On Thu, Oct 29, 2009 at 2:31 PM, Hugh Greenberg hng@lanl.gov wrote:
Here is the relevant output with the debug option set to 6:
Searching CBFS for prefix floppyimg/
Found CBFS file fallback/coreboot_ram
Found CBFS file fallback/payload
Found CBFS file pci14e4,16a6.rom
Found CBFS file
Scan for option roms
Attempting to init PCI bdf 00:18.0 (dev/ven 11001022)
...
Attempting to init PCI bdf 00:18.1 (dev/ven 11011022)
...
Attempting to init PCI bdf 00:18.2 (dev/ven 11021022)
Attempting to init PCI bdf 00:18.3 (dev/ven 11031022)
Attempting to init PCI bdf 00:19.0 (dev/ven 11001022)
Attempting to init PCI bdf 00:19.1 (dev/ven 11011022)
Attempting to init PCI bdf 00:19.2 (dev/ven 11021022)
Attempting to init PCI bdf 00:19.3 (dev/ven 11031022)
It's only seeing your CPUs. I was looking for "Attempting to init..." lines with other devices, such as 02:03.0 (from your coreboot log, this looks like one of your NICs).
Maybe you can cheat and set CONFIG_PCI_ROOT1 to 2. I don't know why it's not finding your devices.
Thanks, Myles
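The check described above (which devices does SeaBIOS actually probe?) can be done mechanically on a captured log. The sketch below assumes the "Attempting to init PCI bdf" line format shown in this thread, with the device ID printed before the vendor ID; the function and variable names are hypothetical.

```python
import re

# Matches e.g. "Attempting to init PCI bdf 00:18.0 (dev/ven 11001022)".
ATTEMPT_RE = re.compile(
    r"Attempting to init PCI bdf ([0-9a-f]{2}):([0-9a-f]{2})\.([0-7]) "
    r"\(dev/ven ([0-9a-f]{4})([0-9a-f]{4})\)"
)


def probed_devices(log):
    """Return (bus, device, function, vendor_id, device_id) for each probe."""
    found = []
    for bus, dev, fn, device_id, vendor_id in ATTEMPT_RE.findall(log):
        found.append((int(bus, 16), int(dev, 16), int(fn),
                      int(vendor_id, 16), int(device_id, 16)))
    return found


# Three lines taken from the log in this thread.
LOG = (
    "Attempting to init PCI bdf 00:18.0 (dev/ven 11001022) "
    "Attempting to init PCI bdf 00:18.1 (dev/ven 11011022) "
    "Attempting to init PCI bdf 00:19.3 (dev/ven 11031022)"
)

devices = probed_devices(LOG)
buses = sorted({d[0] for d in devices})
vendors = sorted({d[3] for d in devices})
print("buses probed:", buses)
print("vendor IDs seen:", [hex(v) for v in vendors])
```

If the only bus is 0 and the only vendor ID is 0x1022 (AMD), SeaBIOS is seeing nothing but the processors' northbridge functions.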
Would you try this patch?
I'm having problems where PCI config writes aren't working correctly, and I'm wondering if you are too.
Thanks, Myles
Attached is the output after applying this patch.
On Thu, 2009-10-29 at 16:55 -0600, Myles Watson wrote:
Would you try this patch?
I'm having problems where PCI config writes aren't working correctly, and I'm wondering if you are too.
Thanks, Myles
Attached is the output after applying this patch.
Thanks. Yours looks good, so I'll have to dig deeper to see why I've got those problems.
Myles
It's really strange that SeaBIOS can't find your devices. Could you try this patch? I'm wondering what your PCI routing table looks like. I revived some of Ron's debug code from v3.
Thanks, Myles
Attached is the output. I tried your idea with changing PCI_ROOT and it worked. You can see gpxe booting in the output.
On Fri, Oct 30, 2009 at 5:07 PM, Hugh Greenberg hng@lanl.gov wrote:
Attached is the output. I tried your idea with changing PCI_ROOT and it worked. You can see gpxe booting in the output.
CONFIG(e0)01-04 ->(0,0),R W (bus numbers)
That's not good. That means "If you get a config space read or write for bus 1-4, send it down link 0 of node 0."
Unfortunately, it left out bus 0, which is why you only see the CPUs. It also explains why you can't see bus 2 unless you specifically point it out to SeaBIOS.
I guess we need to look at the code that sets that. It worked fine for SimNOW...
Thanks, Myles
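To make the reading of that line concrete, here is a rough decoder for a K8 configuration map register. The bit layout (read/write enables in bits 0 and 1, destination node in bits 6:4, destination link in bits 9:8, bus base in bits 23:16, bus limit in bits 31:24) is my understanding of the register and should be checked against the BKDG. The raw value is hypothetical, chosen only to match "01-04 ->(0,0),R W".

```python
def decode_config_map(reg):
    """Split a 32-bit configuration map register into its fields.

    Bit positions are an assumption; verify them against the K8 BKDG.
    """
    return {
        "read_enable": bool(reg & 0x1),
        "write_enable": bool(reg & 0x2),
        "dst_node": (reg >> 4) & 0x7,
        "dst_link": (reg >> 8) & 0x3,
        "bus_base": (reg >> 16) & 0xFF,
        "bus_limit": (reg >> 24) & 0xFF,
    }


def routes_bus(reg, bus):
    """True if config reads for `bus` fall inside this register's range."""
    fields = decode_config_map(reg)
    return fields["read_enable"] and fields["bus_base"] <= bus <= fields["bus_limit"]


# Hypothetical raw value: buses 01-04 -> node 0, link 0, read and write enabled.
REG_01_04 = 0x04010003

print(decode_config_map(REG_01_04))
for bus in range(0, 6):
    print("bus", bus, "routed:", routes_bus(REG_01_04, bus))
```

With a range of 01-04, bus 0 falls outside the window, which matches the symptom of SeaBIOS finding only the CPU functions.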
It turns out that SB_HT_CHAIN_ON_BUS_0 was set incorrectly. I should have had you test if it should have been 2, but at least setting it to 1 should let SeaBIOS find your devices.
Thanks, Myles
Myles,
Setting it to 1 or 2 gives what looks like the same output. It causes coreboot to fail with the following error:
Initializing CBMEM area to 0x3fff0000 (65536 bytes)
Adding CBMEM entry as no. 1
Moving GDT to 3fff0200...ok
High Tables Base is 3fff0000.
Copying Interrupt Routing Table to 0x000f0000... done.
Adding CBMEM entry as no. 2
Copying Interrupt Routing Table to 0x3fff0400... done.
PIRQ table: 176 bytes.
Looking for bad PCIX MHz input
get_pbus: dev is NULL!
Setting it to 1 or 2 gives what looks like the same output. It causes coreboot to fail with the following error:
I guess I shouldn't have committed it until it worked, but 0 was the wrong value.
Initializing CBMEM area to 0x3fff0000 (65536 bytes)
Adding CBMEM entry as no. 1
Moving GDT to 3fff0200...ok
High Tables Base is 3fff0000.
Copying Interrupt Routing Table to 0x000f0000... done.
Adding CBMEM entry as no. 2
Copying Interrupt Routing Table to 0x3fff0400... done.
PIRQ table: 176 bytes.
Looking for bad PCIX MHz input
That message comes from mainboard/arima/hdama/mptable.c
The bus numbers are hard-coded. The easiest thing to do would be to:
1. Choose 1 or 2 for that config value
2. Find the bus and device numbers in the output
3. Change the hard coded values
Thanks, Myles
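Step 2 can be done by eye, but a small script helps when the device listing is long. The sketch below assumes the "PCI: bb:dd.f: enabled N, M resources" lines from the coreboot device dump earlier in this thread; the helper name is hypothetical, and the bus numbers will differ once the config value changes.

```python
import re

# Matches e.g. "PCI: 01:01.0: enabled 1, 3 resources" (whitespace may wrap).
DEV_RE = re.compile(
    r"PCI: ([0-9a-f]{2}):([0-9a-f]{2})\.([0-7]):\s+enabled ([01])"
)


def enabled_pci_devices(log):
    """Return (bus, device, function) for every enabled PCI device listed."""
    return [
        (int(bus, 16), int(dev, 16), int(fn))
        for bus, dev, fn, enabled in DEV_RE.findall(log)
        if enabled == "1"
    ]


# Entries copied from the device listing posted earlier in the thread.
LOG = (
    "PCI: 00:18.0: enabled 1, 4 resources "
    "PCI: 01:01.0: enabled 1, 3 resources "
    "PCI: 01:02.0: enabled 0, 0 resources "
    "PCI: 02:03.0: enabled 1, 2 resources"
)

for bus, dev, fn in enabled_pci_devices(LOG):
    print("bus %02x device %02x function %d" % (bus, dev, fn))
```

The resulting bus and device numbers are the ones to compare against the hard-coded values in mptable.c.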
Myles,
I'm not sure what you mean by: "Change the hard coded values." Where should I change them?
Send me the latest output and I'll send you a patch to test.
Thanks, Myles
Myles,
Attached is the latest output. Thanks.
Here's a patch that will either work or help us figure out what's going wrong.
Signed-off-by: Myles Watson mylesgw@gmail.com
Thanks, Myles
Myles,
It worked. Coreboot boots with SB_HT_CHAIN_ON_BUS_0 set to one and the previous patch. Seabios also detects the network devices without setting CONFIG_PCI_ROOT1 to 0x02. Attached is the output. Seabios was compiled with DEBUG_LEVEL set to 8.
On Wed, Nov 4, 2009 at 12:06 PM, Hugh Greenberg hng@lanl.gov wrote:
Myles,
It worked. Coreboot boots with SB_HT_CHAIN_ON_BUS_0 set to one and the previous patch. Seabios also detects the network devices without setting CONFIG_PCI_ROOT1 to 0x02. Attached is the output. Seabios was compiled with DEBUG_LEVEL set to 8.
ERROR - cound not find bus for node 0 chain 0, using defaults
ERROR - could not find PCI 1:01.0, using defaults
ERROR - could not find PCI 1:02.0, using defaults
Wrote the mp table end at: 000f0410 - 000f05e4
Adding CBMEM entry as no. 3
Looking for bad PCIX MHz input
Looking for bad Hot Swap Enable
OK 133MHz & Hot Swap is off
ERROR - cound not find bus for node 0 chain 0, using defaults
ERROR - could not find PCI 1:01.0, using defaults
ERROR - could not find PCI 1:02.0, using defaults
Wrote the mp table end at: 3fff1410 - 3fff15e4
It almost worked. Can you try the attached patch?
Is it actually booting into Linux?
Thanks, Myles
Attached is the latest output. I don't see the errors anymore. I just tried booting Linux with the latest rom and it booted.
On Wed, Nov 4, 2009 at 2:48 PM, Hugh Greenberg hng@lanl.gov wrote:
Attached is the latest output. I don't see the errors anymore. I just tried booting Linux with the latest rom and it booted.
Great! So is everything finally working, or are you still fighting sporadic hangs?
We should commit the fixes so they don't get lost.
Thanks, Myles
Myles,
Everything seems to be working well. I had one sporadic hang while testing, at what looked to be the same place as the initial hang. That was the only instance since I moved to gcc 3.4. We were testing so many things that I don't remember the exact scenario. I'm satisfied though. If I see it happening again, I'll try to capture it and send it to you.
You deserve a medal for this. Thanks a lot.
On Wed, Nov 4, 2009 at 4:23 PM, Hugh Greenberg hng@lanl.gov wrote:
Myles,
Everything seems to be working well. I had one sporadic hang while testing, at what looked to be the same place as the initial hang. That was the only instance since I moved to gcc 3.4. We were testing so many things that I don't remember the exact scenario. I'm satisfied though. If I see it happening again, I'll try to capture it and send it to you.
Sounds good. I'd be interested in whether or not it hangs with the newer gcc still, too. Since everything else is working, it would be nice to know if that's really an issue.
You deserve a medal for this. Thanks a lot.
No problem. Would you mind sending svn diff from your working tree so I can make a patch? I've lost track of which patches you're using.
Thanks, Myles
Attached is the output from svn diff. I'll test it with gcc 4.4 and let you know if it hangs or not.
Yes, it is still hanging occasionally with a newer gcc. I thought I put all the debug stuff back in, but I apparently didn't get it all. Below is the output. I will put the remaining debug statements back in for this hang again and send the output.
coreboot-2.3 Wed Nov 4 16:40:23 MST 2009 starting... Enabling routing table for node 00 done. Enabling SMP settings (0,1) link=01 (1,0) link=01 setup_remote_node: done Renaming current temporary node to 01 done. Enabling routing table for node 01 done. 02 nodes initialized. coherent_ht_finalize done started ap apicid: SBLink=00 NC node|link=00 entering ht_optimize_link pos=0x8a, unfiltered freq_cap=0x8075 pos=0x8a, filtered freq_cap=0x35 pos=0xce, unfiltered freq_cap=0x35 freq_cap1=0x35, freq_cap2=0x15 dev1 old_freq=0x0, freq=0x4, needs_reset=0x1 dev2 old_freq=0x0, freq=0x4, needs_reset=0x1 width_cap1=0x11, width_cap2=0x11 dev1 input ln_width1=0x4, ln_width2=0x4 dev1 input width=0x1 dev1 output ln_width1=0x4, ln_width2=0x4 dev1 input|output width=0x11 old dev1 input|output width=0x11 dev2 input|output width=0x11 old dev2 input|output width=0x11 entering ht_optimize_link pos=0xd2, unfiltered freq_cap=0x35 pos=0xce, unfiltered freq_cap=0x1 pos=0xce, filtered freq_cap=0x1 freq_cap1=0x15, freq_cap2=0x1 dev1 old_freq=0x0, freq=0x0, needs_reset=0x0 dev2 old_freq=0x0, freq=0x0, needs_reset=0x0 width_cap1=0x0, width_cap2=0x0 dev1 input ln_width1=0x3, ln_width2=0x3 dev1 input width=0x0 dev1 output ln_width1=0x3, ln_width2=0x3 dev1 input|output width=0x0 old dev1 input|output width=0x0 dev2 input|output width=0x0 old dev2 input|output width=0x0 ht reset -
coreboot-2.3 Wed Nov 4 16:40:23 MST 2009 starting... Enabling routing table for node 00 done. Enabling SMP settings (0,1) link=01 (1,0) link=01 setup_remote_node: done Renaming current temporary node to 01 done. Enabling routing table for node 01 done. 02 nodes initialized. coherent_ht_finalize done started ap apicid: SBLink=00 NC node|link=00 entering ht_optimize_link pos=0x8a, unfiltered freq_cap=0x8075 pos=0x8a, filtered freq_cap=0x35 pos=0xce, unfiltered freq_cap=0x35 freq_cap1=0x35, freq_cap2=0x15 dev1 old_freq=0x4, freq=0x4, needs_reset=0x0 dev2 old_freq=0x4, freq=0x4, needs_reset=0x0 width_cap1=0x11, width_cap2=0x11 dev1 input ln_width1=0x4, ln_width2=0x4 dev1 input width=0x1 dev1 output ln_width1=0x4, ln_width2=0x4 dev1 input|output width=0x11 old dev1 input|output width=0x11 dev2 input|output width=0x11 old dev2 input|output width=0x11 entering ht_optimize_link pos=0xd2, unfiltered freq_cap=0x35 pos=0xce, unfiltered freq_cap=0x1 pos=0xce, filtered freq_cap=0x1 freq_cap1=0x15, freq_cap2=0x1 dev1 old_freq=0x0, freq=0x0, needs_reset=0x0 dev2 old_freq=0x0, freq=0x0, needs_reset=0x0 width_cap1=0x0, width_cap2=0x0 dev1 input ln_width1=0x3, ln_width2=0x3 dev1 input width=0x0 dev1 output ln_width1=0x3, ln_width2=0x3 dev1 input|output width=0x0 old dev1 input|output width=0x0 dev2 input|output width=0x0 old dev2 input|output width=0x0 SMBus controller enabled Ram1.00 setting up CPU00 northbridge registers done. Ram1.01 setting up CPU01 northbridge registers done. Ram2.00 Enabling dual channel memory Registered 166Mhz RAM end at 0x00100000 kB Lower RAM end at 0x00100000 kB Ram2.01 Enabling dual channel memory Registered 166Mhz RAM end at 0x00200000 kB Lower RAM end at 0x00200000 kB Ram3
Yes, it is still hanging occasionally with a newer gcc.
Too bad. Occasionally like 1-in-3? I don't have any idea what the compiler would have to do with non-deterministic hangs.
I thought I put all the debug stuff back in, but I apparently didn't get it all. Below is the output. I will put the remaining debug statements back in for this hang again and send the output.
OK. The other thing to find out is if you still need the "Forcing Type 1" hack. Everything else is nearly trivial since it only affects your board, and you've tested it.
Thanks, Myles
Sometimes it takes a very long time to reproduce. It is very random. I just tried about 30 times and I was not able to reproduce it. Ok, I will see if I still need that.
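As a rough guide to what 30 clean boots says about the hang rate, here is a back-of-the-envelope calculation. It assumes each boot hangs independently with the same probability, which may well not hold for a timing-related hang; the function names are made up.

```python
def prob_no_hang(p_hang, boots):
    """Chance of seeing zero hangs in `boots` independent boots."""
    return (1.0 - p_hang) ** boots


def upper_bound_hang_rate(boots, confidence=0.95):
    """Largest hang rate still consistent with zero hangs at this confidence."""
    return 1.0 - (1.0 - confidence) ** (1.0 / boots)


# If the hang really happened 1 time in 3, 30 clean boots would be very unlikely.
print("P(30 clean boots | 1-in-3):", prob_no_hang(1.0 / 3.0, 30))

# What 30 clean boots can rule out: rates above roughly 1 in 10.
print("95% upper bound after 30 clean boots:", upper_bound_hang_rate(30))
```

So 30 clean boots makes a 1-in-3 rate implausible, but cannot distinguish "fixed" from a hang that shows up once in a few dozen boots.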
It doesn't seem like I need the forcing type 1 hack anymore.
I reproduced the hang with some of the debugging statements back in. I hope that is all you need. Here is the output:
coreboot-2.3 Wed Nov 4 17:28:13 MST 2009 starting... Enabling routing table for node 00 done. Enabling SMP settings (0,1) link=01 (1,0) link=01 setup_remote_node: done Renaming current temporary node to 01 done. Enabling routing table for node 01 done. 02 nodes initialized. coherent_ht_finalize done started ap apicid: SBLink=00 NC node|link=00 entering ht_optimize_link pos=0x8a, unfiltered freq_cap=0x8075 pos=0x8a, filtered freq_cap=0x35 pos=0xce, unfiltered freq_cap=0x35 freq_cap1=0x35, freq_cap2=0x15 dev1 old_freq=0x0, freq=0x4, needs_reset=0x1 dev2 old_freq=0x0, freq=0x4, needs_reset=0x1 width_cap1=0x11, width_cap2=0x11 dev1 input ln_width1=0x4, ln_width2=0x4 dev1 input width=0x1 dev1 output ln_width1=0x4, ln_width2=0x4 dev1 input|output width=0x11 old dev1 input|output width=0x11 dev2 input|output width=0x11 old dev2 input|output width=0x11 entering ht_optimize_link pos=0xd2, unfiltered freq_cap=0x35 pos=0xce, unfiltered freq_cap=0x1 pos=0xce, filtered freq_cap=0x1 freq_cap1=0x15, freq_cap2=0x1 dev1 old_freq=0x0, freq=0x0, needs_reset=0x0 dev2 old_freq=0x0, freq=0x0, needs_reset=0x0 width_cap1=0x0, width_cap2=0x0 dev1 input ln_width1=0x3, ln_width2=0x3 dev1 input width=0x0 dev1 output ln_width1=0x3, ln_width2=0x3 dev1 input|output width=0x0 old dev1 input|output width=0x0 dev2 input|output width=0x0 old dev2 input|output width=0x0 ht reset -
coreboot-2.3 Wed Nov 4 17:28:13 MST 2009 starting... Enabling routing table for node 00 done. Enabling SMP settings (0,1) link=01 (1,0) link=01 setup_remote_node: done Renaming current temporary node to 01 done. Enabling routing table for node 01 done. 02 nodes initialized. coherent_ht_finalize done started ap apicid: SBLink=00 NC node|link=00 entering ht_optimize_link pos=0x8a, unfiltered freq_cap=0x8075 pos=0x8a, filtered freq_cap=0x35 pos=0xce, unfiltered freq_cap=0x35 freq_cap1=0x35, freq_cap2=0x15 dev1 old_freq=0x4, freq=0x4, needs_reset=0x0 dev2 old_freq=0x4, freq=0x4, needs_reset=0x0 width_cap1=0x11, width_cap2=0x11 dev1 input ln_width1=0x4, ln_width2=0x4 dev1 input width=0x1 dev1 output ln_width1=0x4, ln_width2=0x4 dev1 input|output width=0x11 old dev1 input|output width=0x11 dev2 input|output width=0x11 old dev2 input|output width=0x11 entering ht_optimize_link pos=0xd2, unfiltered freq_cap=0x35 pos=0xce, unfiltered freq_cap=0x1 pos=0xce, filtered freq_cap=0x1 freq_cap1=0x15, freq_cap2=0x1 dev1 old_freq=0x0, freq=0x0, needs_reset=0x0 dev2 old_freq=0x0, freq=0x0, needs_reset=0x0 width_cap1=0x0, width_cap2=0x0 dev1 input ln_width1=0x3, ln_width2=0x3 dev1 input width=0x0 dev1 output ln_width1=0x3, ln_width2=0x3 dev1 input|output width=0x0 old dev1 input|output width=0x0 dev2 input|output width=0x0 old dev2 input|output width=0x0 SMBus controller enabled Ram1.00 setting up CPU00 northbridge registers done. Ram1.01 setting up CPU01 northbridge registers done. Ram2.00 Enabling dual channel memory Registered 166Mhz RAM end at 0x00100000 kB Lower RAM end at 0x00100000 kB Ram2.01 Enabling dual channel memory Registered 166Mhz RAM end at 0x00200000 kB Lower RAM end at 0x00200000 kB Ram3 Before starting clocks: Before memreset:
It doesn't seem like I need the forcing type 1 hack anymore.
Good.
I reproduced the hang with some of the debugging statements back in.
Thanks.
I hope that is all you need.
Yeah. It took a little too long to figure that one out :) I think we'll call it good for now.
Here is the output: Ram3 Before starting clocks: Before memreset:
I guess we should put the timeout back in so that it doesn't hang. I really don't understand why that would be compiler dependent.
I'll send the patch to the list tomorrow.
Thanks, Myles
On Wed, Oct 21, 2009 at 03:03:44PM -0600, Hugh Greenberg wrote:
Alright! It booted and loaded seabios. Attached is the entire output. One thing though, it failed to load gpxe. The only thing related to this that looks like an error is:
File pci14e4,16a6.rom is of type 63000000 instead oftype 30 and this PCI Expansion ROM, signature 0x0000, INIT size 0x0000, data ptr 0x0000 Incorrect Expansion ROM Header Signature 0000
I got this file from gpxe's rom O matic. I tried getting a new one just in case and got the same error.
Please compile SeaBIOS with the debug level set to 8 (#define CONFIG_DEBUG_LEVEL 8 in src/config.h) and then resend the full output.
-Kevin