Ron,
Have you ever tested a Myrinet card on a K8 motherboard under LinuxBIOS?
I have tested the Mellanox IB card and ran into a problem too. It also uses prefmem.
Regards
YH
-----Original Message----- From: YhLu Sent: February 26, 2004 20:38 To: ebiederman@lnxi.com; LinuxBIOS Subject: Prefmem of bus 3
Eric,
I'm trying a Quadrics card in an S2882. When the card is plugged in, the system cannot get through Etherboot. It gets stuck there.
I found that the prefmem calculation for the bus seems to have a problem.
PCI: 01:02.0 1c <- [0x00002000 - 0x00001fff] bus 3 io
PCI: 01:02.0 24 <- [0xe0000000 - 0xffffffff] bus 3 prefmem
PCI: 01:02.0 20 <- [0xf5200000 - 0xf51fffff] bus 3 mem
ASSIGN RESOURCES, bus 3
PCI: 03:03.0 10 <- [0xe0000000 - 0xefffffff] prefmem
PCI: 03:03.0 18 <- [0xf0000000 - 0xf3ffffff] prefmem
ASSIGNED RESOURCES, bus 3
The bus 3 prefmem should be:
PCI: 01:02.0 24 <- [0xe0000000 - 0xf3ffffff] bus 3 prefmem
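The expected window can be derived from the two prefmem ranges assigned on bus 3. The following is a minimal sketch of that arithmetic, not LinuxBIOS code; `bridge_window` is a hypothetical helper, and it assumes the bridge window only has to cover the ranges of the devices behind it.

```python
# Hypothetical helper, not taken from LinuxBIOS: derive the window a
# bridge would need from the ranges assigned to the devices behind it.
def bridge_window(child_ranges):
    """Return the (base, limit) pair that covers every (base, limit) given."""
    base = min(b for b, _ in child_ranges)
    limit = max(l for _, l in child_ranges)
    return base, limit

# The two prefmem ranges of device 03:03.0 as reported in the log.
children = [
    (0xe0000000, 0xefffffff),  # register 0x10, 256M
    (0xf0000000, 0xf3ffffff),  # register 0x18, 64M
]

base, limit = bridge_window(children)
print(f"[0x{base:08x} - 0x{limit:08x}]")  # prints [0xe0000000 - 0xf3ffffff]
```

The limit printed by the log for the bridge, 0xffffffff, is larger than the value this calculation gives.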
Regards
YH
Allocating resources...
PCI: 04:01.0 missing read_resources
PCI: 04:01.0 missing read_resources
PCI: 04:01.0 missing read_resources
PCI: 01:04.5 missing read_resources
PCI: 01:04.6 missing read_resources
PCI: 01:04.5 missing read_resources
PCI: 01:04.6 missing read_resources
ASSIGN RESOURCES, bus 0
PCI: 01:04.5 missing read_resources
PCI: 01:04.6 missing read_resources
PCI: 00:18.0 c0 <- [0x00001000 - 0x00002fff] node 0 link 0 io
PCI: 01:04.5 missing read_resources
PCI: 01:04.6 missing read_resources
PCI: 00:18.0 b8 <- [0xe0000000 - 0xf52fffff] node 0 link 0 mem
ASSIGN RESOURCES, bus 1
PCI: 01:01.0 1c <- [0x00002000 - 0x00001fff] bus 2 io
PCI: 01:01.0 24 <- [0xf5200000 - 0xf51fffff] bus 2 prefmem
PCI: 01:01.0 20 <- [0xf5100000 - 0xf51fffff] bus 2 mem
ASSIGN RESOURCES, bus 2
PCI: 02:09.0 10 <- [0xf5100000 - 0xf510ffff] mem
PCI: 02:09.0 18 <- [0xf5110000 - 0xf511ffff] mem
PCI: 02:09.1 10 <- [0xf5120000 - 0xf512ffff] mem
PCI: 02:09.1 18 <- [0xf5130000 - 0xf513ffff] mem
ASSIGNED RESOURCES, bus 2
PCI: 01:01.1 10 <- [0xf5200000 - 0xf5200fff] mem
PCI: 01:02.0 1c <- [0x00002000 - 0x00001fff] bus 3 io
PCI: 01:02.0 24 <- [0xe0000000 - 0xffffffff] bus 3 prefmem
PCI: 01:02.0 20 <- [0xf5200000 - 0xf51fffff] bus 3 mem
ASSIGN RESOURCES, bus 3
PCI: 03:03.0 10 <- [0xe0000000 - 0xefffffff] prefmem
PCI: 03:03.0 18 <- [0xf0000000 - 0xf3ffffff] prefmem
ASSIGNED RESOURCES, bus 3
PCI: 01:02.1 10 <- [0xf5201000 - 0xf5201fff] mem
PCI: 04:01.0 missing read_resources
PCI: 01:03.0 1c <- [0x00001000 - 0x00001fff] bus 4 io
PCI: 04:01.0 missing read_resources
PCI: 01:03.0 24 <- [0xf5200000 - 0xf51fffff] bus 4 prefmem
PCI: 04:01.0 missing read_resources
PCI: 01:03.0 20 <- [0xf4000000 - 0xf5ffffff] bus 4 mem
ASSIGN RESOURCES, bus 4
PCI: 04:00.0 10 <- [0xf5020000 - 0xf5020fff] mem
PCI: 04:00.1 10 <- [0xf5021000 - 0xf5021fff] mem
PCI: 04:00.2 10 <- [0xf5025000 - 0xf50250ff] mem
PCI: 04:00.2 14 <- [0xf5026000 - 0xf502601f] mem
PCI: 04:01.0 missing set_resources
PCI: 04:05.0 10 <- [0x00001450 - 0x00001457] io
PCI: 04:05.0 14 <- [0x00001470 - 0x00001473] io
PCI: 04:05.0 18 <- [0x00001460 - 0x00001467] io
PCI: 04:05.0 1c <- [0x00001480 - 0x00001483] io
PCI: 04:05.0 20 <- [0x00001440 - 0x0000144f] io
PCI: 04:05.0 24 <- [0xf5024000 - 0xf50243ff] mem
PCI: 04:06.0 10 <- [0xf4000000 - 0xf4ffffff] mem
PCI: 04:06.0 14 <- [0x00001000 - 0x000010ff] io
PCI: 04:06.0 18 <- [0xf5022000 - 0xf5022fff] mem
PCI: 04:08.0 10 <- [0xf5023000 - 0xf5023fff] mem
PCI: 04:08.0 14 <- [0x00001400 - 0x0000143f] io
PCI: 04:08.0 18 <- [0xf5000000 - 0xf501ffff] mem
ASSIGNED RESOURCES, bus 4
PCI: 01:04.0 00 <- [0x00000000 - 0xffffffff] io
PCI: 01:04.0 00 <- [0x00000000 - 0xffffffff] mem
ASSIGN RESOURCES, bus 0
PNP: 002e.0 60 <- [0x000003f0 - 0x000003f7 io]
PNP: 002e.0 70 <- [0x00000006 - 0x00000006 irq]
PNP: 002e.0 74 <- [0x00000002 - 0x00000002 drq]
PNP: 002e.1 60 <- [0x00000378 - 0x0000037f io]
PNP: 002e.1 70 <- [0x00000007 - 0x00000007 irq]
PNP: 002e.1 74 <- [0x00000004 - 0x00000004 drq]
PNP: 002e.2 60 <- [0x000003f8 - 0x000003ff io]
PNP: 002e.2 70 <- [0x00000004 - 0x00000004 irq]
PNP: 002e.3 60 <- [0x000002f8 - 0x000002ff io]
PNP: 002e.3 70 <- [0x00000003 - 0x00000003 irq]
PNP: 002e.5 60 <- [0x00000060 - 0x00000060 io]
PNP: 002e.5 62 <- [0x00000064 - 0x00000064 io]
PNP: 002e.5 70 <- [0x00000001 - 0x00000001 irq]
PNP: 002e.5 72 <- [0x0000000c - 0x0000000c irq]
PNP: 002e.6 60 <- [0x00002030 - 0x00002037 io]
ERROR: PNP: 002e.6 70 not allocated
PNP: 002e.7 60 <- [0x00000201 - 0x00000201 io]
PNP: 002e.7 62 <- [0x00000330 - 0x00000330 io]
PNP: 002e.7 70 <- [0x00000009 - 0x00000009 irq]
ERROR: PNP: 002e.a 70 not allocated
PNP: 002e.b 60 <- [0x00002040 - 0x00002047 io]
ERROR: PNP: 002e.b 70 not allocated
ASSIGNED RESOURCES, bus 0
PCI: 01:04.1 20 <- [0x00002020 - 0x0000202f] io
PCI: 01:04.2 10 <- [0x00002000 - 0x0000201f] io
PCI: 01:04.5 missing set_resources
PCI: 01:04.6 missing set_resources
ASSIGNED RESOURCES, bus 1
ASSIGNED RESOURCES, bus 0
Allocating VGA resource done.
_______________________________________________ Linuxbios mailing list Linuxbios@clustermatic.org http://www.clustermatic.org/mailman/listinfo/linuxbios
On Fri, 27 Feb 2004, YhLu wrote:
Have you ever tested myrinet card on K8 MB under LinuxBIOS?
we have a 1408-node K8 machine with myrinet working under linuxbios, and a 256-node K8 machine working with linuxbios. Arima HDAMA mainboards.
I wonder what's going on here, I have not had time to look in detail.
ron
ron minnich rminnich@lanl.gov writes:
On Fri, 27 Feb 2004, YhLu wrote:
Have you ever tested myrinet card on K8 MB under LinuxBIOS?
we have a 1408-node K8 machine with myrinet working under linuxbios, and a 256-node K8 machine working with linuxbios. Arima HDAMA mainboards.
I wonder what's going on here, I have not had time to look in detail.
So the problem appears to be that the prefmem region on the upper bus is too large.
I wonder whether the reason this works on LANL clusters and not on others is code skew.
Lightning nodes with myrinet only seem to have one prefmem region.
Yet another reminder that I really need to finish syncing up the code bases.
Eric
On Mon, 2004-03-01 at 11:48, Eric W. Biederman wrote:
ron minnich rminnich@lanl.gov writes:
On Fri, 27 Feb 2004, YhLu wrote:
Have you ever tested myrinet card on K8 MB under LinuxBIOS?
we have a 1408-node K8 machine with myrinet working under linuxbios, and a 256-node K8 machine working with linuxbios. Arima HDAMA mainboards.
I wonder what's going on here, I have not had time to look in detail.
So the problem appears to be that the prefmem region on the upper bus is too large.
I wonder whether the reason this works on LANL clusters and not on others is code skew.
Lightning nodes with myrinet only seem to have one prefmem region.
Yet another reminder that I really need to finish syncing up the code bases.
I think there is some problem in the resource allocation code, so it cannot handle devices with two prefmem regions.
Is the code just picking the "largest" resource? I am really confused by the code in devices.c.
Ollie
Li-Ta Lo ollie@lanl.gov writes:
I think there is some problem in the resource allocation code, so it cannot handle devices with two prefmem regions.
Maybe. I know there was some kind of problem on the Opteron early on, but I forget what. The code for all resource types is the same, so if we can handle two resources of the same type on a bus, it should not be quite as simple as that.
Is the code just picking the "largest" resource ? I am really confused by the code in devices.c
No.
The code should be quite simple but it is recursive and highly abstracted which makes it hard to follow.
The high level overview is the code works in two passes. The first pass is to determine the size of the resource window needed. The second pass is to determine the actual resource assignments.
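The two passes described above can be pictured with a toy model. This is only a sketch, not the code in devices.c: a bus is modelled as a nested Python list, a device resource as its power-of-two size, and `measure` and `assign` are names invented for this illustration. It also assumes a bridge window needs the alignment of the largest thing behind it.

```python
KB, MB = 1 << 10, 1 << 20

def align_up(value, align):
    """Round value up to the next multiple of align (a power of two)."""
    return (value + align - 1) & ~(align - 1)

def measure(item):
    """Pass 1: return (size, align) needed by a resource or by a whole bus."""
    if isinstance(item, int):
        return item, item  # a power-of-two resource is aligned to its size
    cursor, align = 0, 1
    for size, a in sorted((measure(i) for i in item), reverse=True):
        cursor = align_up(cursor, a) + size
        align = max(align, a)
    return cursor, align

def assign(item, base, out, path="bus"):
    """Pass 2: hand out addresses from base, in the same order as pass 1."""
    if isinstance(item, int):
        out[path] = (base, base + item - 1)
        return
    order = sorted(range(len(item)), key=lambda i: measure(item[i]), reverse=True)
    cursor = base
    for i in order:
        size, a = measure(item[i])
        cursor = align_up(cursor, a)
        assign(item[i], cursor, out, f"{path}/{i}")
        cursor += size

# A made-up bus: a 4K resource, a bridge with 256M and 64M behind it, and 1M.
top = [4 * KB, [256 * MB, 64 * MB], 1 * MB]
needed = measure(top)
ranges = {}
assign(top, 0xe0000000, ranges)
for path, (lo, hi) in sorted(ranges.items()):
    print(f"{path}: [0x{lo:08x} - 0x{hi:08x}]")
```

The sizing pass has to run first because a bridge cannot be placed until the total size of everything behind it is known.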
There are many ways to assign resources on a bus. After some experience with tight memory situations I implemented a near-optimal solution. The solution is optimal if all of your resources are a power of 2 in size.
Basically the code is a loop. On each iteration the code finds the largest unassigned resource. Then the constraints of that resource are considered, and padding is inserted between the previous resource and the current one if necessary. Then we move on to the next iteration.
The reason this is optimal if all of your resources are a power of two in size is because if your previous resource is a larger or equal power of two no padding will be needed for the current resource.
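The loop described above can be sketched roughly as follows. This is an illustration under assumptions, not the LinuxBIOS implementation: resources are plain (name, size, align) tuples, "largest" is taken to mean largest size, and `allocate_largest_first` is a hypothetical name. The second example uses a 3M bridge-style window, which is not a power of two, to show where padding can appear.

```python
KB, MB = 1 << 10, 1 << 20

def align_up(addr, align):
    """Round addr up to the next multiple of align (a power of two)."""
    return (addr + align - 1) & ~(align - 1)

def allocate_largest_first(base, resources):
    """Place (name, size, align) tuples largest first.

    Returns (assignments, end_address, total_padding).
    """
    pending = list(resources)
    cursor = base
    padding = 0
    assigned = []
    while pending:
        # Find the largest resource that has not been placed yet.
        res = max(pending, key=lambda r: r[1])
        pending.remove(res)
        name, size, align = res
        # Insert padding only if the current position is misaligned.
        start = align_up(cursor, align)
        padding += start - cursor
        assigned.append((name, start, start + size - 1))
        cursor = start + size
    return assigned, cursor, padding

# All sizes are powers of two and each resource is aligned to its size.
pow2 = [("a", 64 * KB, 64 * KB), ("b", 1 * MB, 1 * MB),
        ("c", 4 * KB, 4 * KB), ("d", 16 * MB, 16 * MB)]
_, _, pow2_padding = allocate_largest_first(0xf4000000, pow2)

# A 3M window with 1M alignment followed by a 2M resource.
mixed = [("window", 3 * MB, 1 * MB), ("bar", 2 * MB, 2 * MB)]
_, _, mixed_padding = allocate_largest_first(0, mixed)

print("padding, power-of-two case:", pow2_padding)
print("padding, mixed case (MB):", mixed_padding // MB)
```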
The situation YhLu has is below, and it is weird. The resources are assigned properly, but when they are clumped together into a range register on the bus, that value is incorrect. Which is very weird.
YhLu has:
ASSIGN RESOURCES, bus 3
PCI: 03:03.0 10 <- [0xe0000000 - 0xefffffff] prefmem
PCI: 03:03.0 18 <- [0xf0000000 - 0xf3ffffff] prefmem
And then on the bridge on bus 1:
PCI: 01:02.0 24 <- [0xe0000000 - 0xffffffff] bus 3 prefmem
The bus 3 prefmem should be:
PCI: 01:02.0 24 <- [0xe0000000 - 0xf3ffffff] bus 3 prefmem
So it looks like stuck bits or something.
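One way to narrow this down is to look at which bits differ between the expected and the reported limit. The sketch below is a hedged illustration: the helper names are made up, the register encoding follows the usual PCI-to-PCI bridge layout for a 32-bit prefetchable limit (address bits 31:20 held in register bits 15:4), and it assumes the range printed in the log is what ends up in the register. It says which bits differ, not why.

```python
# Helper names are invented for this illustration.
def differing_bits(a, b):
    """Return the bit positions where a and b differ, lowest first."""
    diff = a ^ b
    return [bit for bit in range(32) if diff & (1 << bit)]

def limit_register(limit):
    """Encode a 32-bit limit address as a 16-bit bridge limit register.

    Address bits 31:20 land in register bits 15:4; the low nibble is zero.
    """
    return (limit >> 16) & 0xfff0

expected = 0xf3ffffff  # end of the second prefmem range on bus 3
observed = 0xffffffff  # limit printed for PCI: 01:02.0 register 0x24

print(f"xor = 0x{expected ^ observed:08x}")
print("differing bits:", differing_bits(expected, observed))
print(f"expected register 0x{limit_register(expected):04x}, "
      f"observed register 0x{limit_register(observed):04x}")
```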
Ollie can you get a boot log from Orange? Unless they are different types of infiniband adapters things should be fairly comparable.
Eric
On Mon, 2004-03-01 at 14:06, Eric W. Biederman wrote:
There are many ways to assign resources on a bus. After some experience with tight memory situations I implemented a near-optimal solution. The solution is optimal if all of your resources are a power of 2 in size.
Basically the code is a loop. On each iteration the code finds the largest unassigned resource. Then the constraints of that resource are considered, and padding is inserted between the previous resource and the current one if necessary. Then we move on to the next iteration.
I still don't understand this. Do you mean that the resource allocation is now not sequential (in the order the devices were enumerated) and not contiguous (there are gaps in the allocated addresses)?
Ollie
The reason this is optimal if all of your resources are a power of two in size is because if your previous resource is a larger or equal power of two no padding will be needed for the current resource.
Li-Ta Lo ollie@lanl.gov writes:
On Mon, 2004-03-01 at 14:06, Eric W. Biederman wrote:
There are many ways to assign resources on a bus. After some experience with tight memory situations I implemented a near-optimal solution. The solution is optimal if all of your resources are a power of 2 in size.
Basically the code is a loop. On each iteration the code finds the largest unassigned resource. Then the constraints of that resource are considered, and padding is inserted between the previous resource and the current one if necessary. Then we move on to the next iteration.
I still don't understand this. Do you mean that the resource allocation is now not sequential (in the order the devices were enumerated) and not contiguous (there are gaps in the allocated addresses)?
Correct.
The reason the allocation is not sequential (in the order the devices were enumerated) is to reduce the number of gaps in the allocated addresses.
So a thought problem to help put this in perspective.
First, any resource that is a power of 2 in size must be aligned to that same power of 2.
So if I have 3 resources A,B,C with the following sizes:
A: 8K
B: 256M
C: 4K
Allocating them sequentially (A, B, C) would use just over 512M of address space. After allocating A you need nearly 256M of padding to get B 256M-aligned; C then follows B directly.
Allocating them largest to smallest (B, A, C) uses just over 256M of address space, because the alignment necessary for A is present at the end of B, and the alignment necessary for C is present at the end of A.
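The thought problem above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the window starts at address 0 and that each resource is aligned to its own size; `span_used` is a name made up here, and the exact totals depend on those assumptions.

```python
KB, MB = 1 << 10, 1 << 20

def align_up(addr, align):
    """Round addr up to the next multiple of align (a power of two)."""
    return (addr + align - 1) & ~(align - 1)

def span_used(sizes, base=0):
    """Place resources in the given order, each aligned to its own size.

    Returns the number of bytes from base to the end of the last resource,
    padding included.
    """
    cursor = base
    for size in sizes:
        cursor = align_up(cursor, size) + size
    return cursor - base

A, B, C = 8 * KB, 256 * MB, 4 * KB

sequential = span_used([A, B, C])
largest_first = span_used(sorted([A, B, C], reverse=True))

print(f"sequential:    {sequential // KB} KB")
print(f"largest first: {largest_first // KB} KB")
print(f"saved:         {(sequential - largest_first) // KB} KB")
```

Sorting only changes the order of placement; the alignment rule for each resource is the same in both runs.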
Eric
The reason this is optimal if all of your resources are a power of two in size is because if your previous resource is a larger or equal power of two no padding will be needed for the current resource.
On Fri, 2004-02-27 at 12:25, YhLu wrote:
Ron,
Have you ever tested a Myrinet card on a K8 motherboard under LinuxBIOS?
I have tested the Mellanox IB card and ran into a problem too. It also uses prefmem.
Does the device have two prefmem regions?
Ollie
Regards
YH
-----Original Message----- From: YhLu Sent: February 26, 2004 20:38 To: ebiederman@lnxi.com; LinuxBIOS Subject: Prefmem of bus 3
Eric,
I'm trying a Quadrics card in an S2882. When the card is plugged in, the system cannot get through Etherboot. It gets stuck there.
I found that the prefmem calculation for the bus seems to have a problem.
PCI: 01:02.0 1c <- [0x00002000 - 0x00001fff] bus 3 io
PCI: 01:02.0 24 <- [0xe0000000 - 0xffffffff] bus 3 prefmem
PCI: 01:02.0 20 <- [0xf5200000 - 0xf51fffff] bus 3 mem
ASSIGN RESOURCES, bus 3
PCI: 03:03.0 10 <- [0xe0000000 - 0xefffffff] prefmem
PCI: 03:03.0 18 <- [0xf0000000 - 0xf3ffffff] prefmem
ASSIGNED RESOURCES, bus 3
The bus 3 prefmem should be:
PCI: 01:02.0 24 <- [0xe0000000 - 0xf3ffffff] bus 3 prefmem
Regards
YH