I can't figure out how to track this down. I get serial corruption and missing output lines. The difference in the setup is that one boot has a module installed and the other one doesn't. The broken one doesn't :(
Look at the first two lines of the logs...
How would I go about tracking this down? What could eat my serial output? I'm on a single processor machine.
Full logs available if anyone wants them.
BTW: I've never gotten scan-build to work right on my machine. Patrick or Stefan, do you have a new run available somewhere?
Thanks, Myles
Broken log:
PCI: 00:0d.0 compute_allocate_resource 0000a0 gran: 20 done mputeemem: base: fff00000 size: 00000000 align: 20 gran: 20
PCI: 00:0d.0 read_resources bCI00e:0:01.00.0 1820x00002087] io
PCI: 00:07.0 14 * [0x00002090 - 0x00002093] io
PCI: 00:07.0 1c * [0x000020a0 - 0x000020a3] io
PCI: 00:08.0 14 * [0x000020b0 - 0x000020b3] io
PCI: 00:08.0 1c * [0x000020c0 - 0x000020c3] io
OK log:
PCI: 00:0d.0 compute_allocate_resource prefmem: base: 00000000 size: 00000000 align: 20 gran: 20 done
PCI: 00:0d.0 compute_allocate_resource prefmem: base: fff00000 size: 00000000 align: 20 gran: 20
PCI: 00:0d.0 read_resources bus 2 link: 0
PCI: 00:0d.0 read_resources bus 2 link: 0 done
PCI: 00:0d.0 compute_allocate_resource prefmem: base: fff00000 size: 00000000 align: 20 gran: 20 done
PCI: 00:0d.0 24 <- [0x00fff00000 - 0x00ffefffff] size 0x00000000 gran 0x14 bus 02 prefmem
PCI: 00:0d.0 compute_allocate_resource mem: base: 00000000 size: 00000000 align: 20 gran: 20
PCI: 00:0d.0 read_resources bus 2 link: 0
PCI: 00:0d.0 read_resources bus 2 link: 0 done
PCI: 00:0d.0 compute_allocate_resource mem: base: 00000000 size: 00000000 align: 20 gran: 20 done
PCI: 00:0d.0 compute_allocate_resource mem: base: fff00000 size: 00000000 align: 20 gran: 20
PCI: 00:0d.0 read_resources bus 2 link: 0
PCI: 00:0d.0 read_resources bus 2 link: 0 done
PCI: 00:0d.0 compute_allocate_resource mem: base: fff00000 size: 00000000 align: 20 gran: 20 done
PCI: 00:0d.0 20 <- [0x00fff00000 - 0x00ffefffff] size 0x00000000 gran 0x14 bus 02 mem
PCI: 00:0e.0 compute_allocate_resource io: base: 00000000 size: 00000000 align: 12 gran: 12
PCI: 00:0e.0 read_resources bus 3 link: 0
PCI: 00:0e.0 read_resources bus 3 link: 0 done
PCI: 00:0e.0 compute_allocate_resource io: base: 00000000 size: 00000000 align: 12 gran: 12 done
PCI: 00:0e.0 compute_allocate_resource io: base: fffff000 size: 00000000 align: 12 gran: 12
PCI: 00:0e.0 read_resources bus 3 link: 0
PCI: 00:0e.0 read_resources bus 3 link: 0 done
PCI: 00:0e.0 compute_allocate_resource io: base: fffff000 size: 00000000 align: 12 gran: 12 done
PCI: 00:0e.0 1c <- [0x00fffff000 - 0x00ffffefff] size 0x00000000 gran 0x0c bus 03 io
PCI: 00:0e.0 compute_allocate_resource prefmem: base: 00000000 size: 00000000 align: 20 gran: 20
PCI: 00:0e.0 read_resources bus 3 link: 0
PCI: 00:0e.0 read_resources bus 3 link: 0 done
PCI: 00:0e.0 compute_allocate_resource prefmem: base: 00000000 size: 00000000 align: 20 gran: 20 done
PCI: 00:0e.0 compute_allocate_resource prefmem: base: fff00000 size: 00000000 align: 20 gran: 20
PCI: 00:0e.0 read_resources bus 3 link: 0
PCI: 00:0e.0 read_resources bus 3 link: 0 done
PCI: 00:0e.0 compute_allocate_resource prefmem: base: fff00000 size: 00000000 align: 20 gran: 20 done
PCI: 00:0e.0 24 <- [0x00fff00000 - 0x00ffefffff] size 0x00000000 gran 0x14 bus 03 prefmem
PCI: 00:0e.0 compute_allocate_resource mem: base: 00000000 size: 00000000 align: 20 gran: 20
PCI: 00:0e.0 read_resources bus 3 link: 0
PCI: 00:0e.0 read_resources bus 3 link: 0 done
PCI: 00:0e.0 compute_allocate_resource mem: base: 00000000 size: 00000000 align: 20 gran: 20 done
PCI: 00:0e.0 compute_allocate_resource mem: base: fff00000 size: 00000000 align: 20 gran: 20
PCI: 00:0e.0 read_resources bus 3 link: 0
PCI: 00:0e.0 read_resources bus 3 link: 0 done
PCI: 00:0e.0 compute_allocate_resource mem: base: fff00000 size: 00000000 align: 20 gran: 20 done
PCI: 00:0e.0 20 <- [0x00fff00000 - 0x00ffefffff] size 0x00000000 gran 0x14 bus 03 mem
PCI: 00:18.0 read_resources bus 0 link: 0 done
PCI: 00:09.0 1c * [0x00000000 - 0x00000fff] io
PCI: 00:01.0 60 * [0x00001000 - 0x000010ff] io
PCI: 00:01.0 64 * [0x00001400 - 0x000014ff] io
PCI: 00:01.0 68 * [0x00001800 - 0x000018ff] io
PCI: 00:01.0 10 * [0x00001c00 - 0x00001c7f] io
PCI: 00:01.1 20 * [0x00001c80 - 0x00001cbf] io
PCI: 00:01.1 24 * [0x00001cc0 - 0x00001cff] io
PCI: 00:01.1 10 * [0x00002000 - 0x0000201f] io
PCI: 00:06.0 20 * [0x00002020 - 0x0000202f] io
PCI: 00:07.0 20 * [0x00002030 - 0x0000203f] io
PCI: 00:08.0 20 * [0x00002040 - 0x0000204f] io
PCI: 00:07.0 10 * [0x00002050 - 0x00002057] io
PCI: 00:07.0 18 * [0x00002060 - 0x00002067] io
PCI: 00:08.0 10 * [0x00002070 - 0x00002077] io
PCI: 00:08.0 18 * [0x00002080 - 0x00002087] io
PCI: 00:07.0 14 * [0x00002090 - 0x00002093] io
PCI: 00:07.0 1c * [0x000020a0 - 0x000020a3] io
PCI: 00:08.0 14 * [0x000020b0 - 0x000020b3] io
PCI: 00:08.0 1c * [0x000020c0 - 0x000020c3] io
Is this new breakage? What do you mean by 'module'?
ron
-----Original Message-----
From: ron minnich [mailto:rminnich@gmail.com]
Sent: Friday, May 08, 2009 4:57 PM
To: Myles Watson
Cc: coreboot; Patrick Georgi; Stefan Reinauer
Subject: Re: [coreboot] Debug help
Is this new breakage? What do you mean by 'module'?
An FPGA instead of one of the processors. I've seen similar breakage before but it's been a long time. I thought it was because of format string problems. That was one of the reasons I was motivated to take out the compiler warnings.
Thanks, Myles
On Fri, May 8, 2009 at 7:35 PM, Myles Watson mylesgw@gmail.com wrote:
An FPGA instead of one of the processors. I've seen similar breakage before but it's been a long time. I thought it was because of format string problems. That was one of the reasons I was motivated to take out the compiler warnings.
So the status is
1. opteron in second socket -- it works
2. fpga in second socket -- it works
3. empty second socket -- it fails
is that it?
ron
An FPGA instead of one of the processors. I've seen similar breakage before but it's been a long time. I thought it was because of format string problems. That was one of the reasons I was motivated to take out the compiler warnings.
So the status is
- opteron in second socket -- it works
Works fine.
- fpga in second socket -- it works
I have resource allocation problems, but the debug messages look all right.
- empty second socket -- it fails
No resource allocation problems, but I have garbled serial port output.
Thanks, Myles
The only thing that occurs to me right off is that the configurations are just different on the one missing an Opteron.
How is HT wired up on this board? Some boards have a completely independent IO bus on each socket.
ron
On Fri, May 8, 2009 at 9:08 PM, ron minnich rminnich@gmail.com wrote:
The only thing that occurs to me right off is that the configurations are just different on the one missing an Opteron.
I should have been clearer. It's the same board. They both have the same module installed. In one case the module doesn't respond to configuration cycles and is disabled in hardware (recognized as an open socket.)
How is HT wired up on this board? Some boards have a completely independent IO bus on each socket.
The Opteron is connected to:
Link 0: Nvidia ck804
Link 1: Opteron socket / FPGA
Link 2: AMD 8132
The strange thing is that the missing/garbled messages are during resource allocation on Link 0. I'll probably ignore it for a while longer if there's no way to track it down. I'm just worried that something is trashing memory, and that some time it will be a message that I need to see :)
Thanks, Myles
On Fri, May 8, 2009 at 8:14 PM, Myles Watson mylesgw@gmail.com wrote:
On Fri, May 8, 2009 at 9:08 PM, ron minnich rminnich@gmail.com wrote:
The only thing that occurs to me right off is that the configurations are just different on the one missing an Opteron.
I should have been clearer. It's the same board. They both have the same module installed. In one case the module doesn't respond to configuration cycles and is disabled in hardware (recognized as an open socket.)
understood. But one board I worked on had a full I/O bus on each opteron, which meant the pci config space changed radically with one removed.
How is HT wired up on this board? Some boards have a completely independent IO bus on each socket.
The Opteron is connected to:
Link 0: Nvidia ck804
Link 1: Opteron socket / FPGA
Link 2: AMD 8132
The strange thing is that the missing/garbled messages are during resource allocation on Link 0. I'll probably ignore it for a while longer if there's no way to track it down. I'm just worried that something is trashing memory, and that some time it will be a message that I need to see :)
Is it possible that the act of allocating resources puts some config space registers into a state such that 0x3f8 is inaccessible or pointing to some other device? I hope this question even makes sense -- it's been a long day.
thanks
ron
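To make the 0x3f8 question concrete, here is a rough sketch (not coreboot code) that checks whether any I/O window from a resource-allocation log overlaps the legacy COM1 ports. The struct and function names are made up for illustration, the table holds only a handful of the "io" entries from the OK log, and COM1 at 0x3f8-0x3ff is the usual PC convention rather than something confirmed for this board.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One allocated I/O window, as printed in the resource-allocation log. */
struct io_range {
	const char *dev;	/* PCI device, e.g. "00:09.0" */
	uint16_t base;		/* first port in the window */
	uint16_t limit;		/* last port in the window (inclusive) */
};

/* Assumption: the serial console is legacy COM1 at 0x3f8..0x3ff. */
enum { COM1_BASE = 0x3f8, COM1_LIMIT = 0x3ff };

/* A few of the "io" windows copied from the OK log (not the full list). */
const struct io_range log_ranges[] = {
	{ "00:09.0", 0x0000, 0x0fff },
	{ "00:01.0", 0x1000, 0x10ff },
	{ "00:01.0", 0x1400, 0x14ff },
	{ "00:01.1", 0x1c80, 0x1cbf },
	{ "00:06.0", 0x2020, 0x202f },
};

/* Two inclusive ranges overlap if each one starts before the other ends. */
bool ranges_overlap(uint16_t a_base, uint16_t a_limit,
		    uint16_t b_base, uint16_t b_limit)
{
	return a_base <= b_limit && b_base <= a_limit;
}

/* Return the first window that overlaps [base, limit], or NULL if none. */
const struct io_range *find_overlap(const struct io_range *r, size_t n,
				    uint16_t base, uint16_t limit)
{
	for (size_t i = 0; i < n; i++) {
		if (ranges_overlap(r[i].base, r[i].limit, base, limit))
			return &r[i];
	}
	return NULL;
}
```

With these entries, the only window containing 0x3f8 is the one assigned to 00:09.0 (0x0000 - 0x0fff). An overlap like this is not by itself a bug, since it depends on how that device actually decodes the window; it only marks a place to look.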
-----Original Message-----
From: ron minnich [mailto:rminnich@gmail.com]
Sent: Friday, May 08, 2009 9:18 PM
To: Myles Watson
Cc: coreboot
Subject: Re: [coreboot] Debug help
On Fri, May 8, 2009 at 8:14 PM, Myles Watson mylesgw@gmail.com wrote:
On Fri, May 8, 2009 at 9:08 PM, ron minnich rminnich@gmail.com wrote:
The only thing that occurs to me right off is that the configurations are just different on the one missing an Opteron.
I should have been clearer. It's the same board. They both have the same module installed. In one case the module doesn't respond to configuration cycles and is disabled in hardware (recognized as an open socket.)
understood. But one board I worked on had a full I/O bus on each opteron, which meant the pci config space changed radically with one removed.
Sorry. I don't know what a full I/O bus means. Do you mean each one had its own 16-bit I/O space? In that case no. This one just uses the normal k8 allocation code.
How is HT wired up on this board? Some boards have a completely independent IO bus on each socket.
The Opteron is connected to:
Link 0: Nvidia ck804
Link 1: Opteron socket / FPGA
Link 2: AMD 8132
The strange thing is that the missing/garbled messages are during resource allocation on Link 0. I'll probably ignore it for a while longer if there's no way to track it down. I'm just worried that something is trashing memory, and that some time it will be a message that I need to see :)
Is it possible that the act of allocating resources puts some config space registers into a state such that 0x3f8 is inaccessible or pointing to some other device? I hope this question even makes sense -- it's been a long day.
It makes sense and it's possible, but if it's happening there is a really nasty bug somewhere. There are config-space registers in the Opteron that direct configuration accesses to specific links.
I don't think I'm seeing that here because at the end it is all set up correctly. There is just data corruption at the serial port. I guess it's possible that it is something much less sinister. Is there some way that the serial port could be misconfigured so that if you write to it too quickly you overflow the buffer and lose output?
Thanks, Myles
On Fri, May 8, 2009 at 8:26 PM, Myles Watson mylesgw@gmail.com wrote:
Is there some way that the serial port could be misconfigured so that if you write to it too quickly you overflow the buffer and lose output?
it could be a hardware glitch actually. The code tests the tx ready bit and won't send until it is set.
It's an interesting mystery.
ron
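For reference, the polled transmit that ron describes can be sketched roughly as below. This is an illustration and not coreboot's actual code: the uart_ops indirection, the sim_uart model and the poll limit are all assumptions added so the loop can run without hardware. The register offsets (THR at 0, LSR at 5) and the THRE bit (0x20) are the standard 16550 ones.

```c
#include <stdint.h>

#define UART_THR	0	/* transmit holding register (write) */
#define UART_LSR	5	/* line status register (read) */
#define UART_LSR_THRE	0x20	/* transmit holding register empty */

/* Hypothetical accessors, so the same loop works on port I/O or a model. */
struct uart_ops {
	uint8_t (*read_reg)(void *ctx, unsigned reg);
	void (*write_reg)(void *ctx, unsigned reg, uint8_t val);
	void *ctx;
};

/*
 * Poll the tx ready bit and send only once it is set.
 * Returns 0 on success, -1 if the bit never came up within max_polls reads.
 */
int uart_tx_byte(const struct uart_ops *ops, uint8_t c, unsigned max_polls)
{
	for (unsigned i = 0; i < max_polls; i++) {
		if (ops->read_reg(ops->ctx, UART_LSR) & UART_LSR_THRE) {
			ops->write_reg(ops->ctx, UART_THR, c);
			return 0;
		}
	}
	return -1;
}

/* Toy UART model: after each write it stays busy for three status reads. */
struct sim_uart {
	unsigned busy_polls;	/* status reads left until THRE is set again */
	unsigned writes;	/* bytes accepted so far */
	unsigned overruns;	/* writes that arrived while still busy */
	uint8_t last;		/* most recent byte written */
};

uint8_t sim_read(void *ctx, unsigned reg)
{
	struct sim_uart *u = ctx;

	if (reg != UART_LSR)
		return 0;
	if (u->busy_polls > 0) {
		u->busy_polls--;
		return 0;
	}
	return UART_LSR_THRE;
}

void sim_write(void *ctx, unsigned reg, uint8_t val)
{
	struct sim_uart *u = ctx;

	if (reg != UART_THR)
		return;
	if (u->busy_polls > 0)
		u->overruns++;	/* would be a lost character on real hardware */
	u->last = val;
	u->writes++;
	u->busy_polls = 3;
}
```

With a loop like this the writer can't outrun the transmitter as long as the status bit is truthful, so lost characters would have to come from somewhere else, e.g. a hardware glitch or the receiving end.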
ron minnich wrote:
Is there some way that the serial port could be misconfigured so that if you write to it too quickly you overflow the buffer and lose output?
it could be a hardware glitch actually. The code tests the tx ready bit and won't send until it is set.
All UARTs these days have a FIFO, which could in theory fail, if coreboot is using it. It's not so likely, though.
//Peter
On 09.05.2009 0:45, Myles Watson wrote:
I can't figure out how to track this down. I get serial corruption and missing output lines. The difference in the setup is that one boot has a module installed and the other one doesn't. The broken one doesn't :(
What module is that?
Look at the first two lines of the logs...
How would I go about tracking this down? What could eat my serial output? I'm on a single processor machine.
Full logs available if anyone wants them.
If it still applies...
BTW: I've never gotten scan-build to work right on my machine. Patrick or Stefan, do you have a new run available somewhere?
I started a new run yesterday evening and put it at http://www.coreboot.org/~stepan/coreboot-scanbuild and moved the old one to http://www.coreboot.org/~stepan/coreboot-scanbuild.old but I'm going to wipe it in a day or two, as these reports are quite big (11G each, while the coreboot.org SAS disks are clearly undersized)
Thanks, Myles
On 09.05.2009 0:45, Myles Watson wrote:
I can't figure out how to track this down. I get serial corruption and missing output lines. The difference in the setup is that one boot has a module installed and the other one doesn't. The broken one doesn't :(
What module is that?
XD1000.
Look at the first two lines of the logs...
How would I go about tracking this down? What could eat my serial output? I'm on a single processor machine.
I've decided that it's probably the fault of the box I'm running PuTTY on. The output disappears in different places, but never affects the boot.
BTW: I've never gotten scan-build to work right on my machine. Patrick or Stefan, do you have a new run available somewhere?
I started a new run yesterday evening and put it at http://www.coreboot.org/~stepan/coreboot-scanbuild
Thanks!
moved the old one to http://www.coreboot.org/~stepan/coreboot-scanbuild.old but I'm going to wipe it in a day or two, as these reports are quite big (11G each, while the coreboot.org SAS disks are clearly undersized)
Wow.
Is there some way to just keep the HTML error reports? They shouldn't be that big.
Thanks, Myles