Promote heap sizing to first-class Kconfig citizenship.
Changing the heap size is something that those, like me, with large PCI device trees need to do. Therefore heap size should appear as a normal, user-answerable question within the Kconfig build system.
Also change the malloc debug message to more clearly indicate how much memory is left.
Signed-off-by: Joe Korty joe.korty@ccur.com
Index: trunk/src/Kconfig
===================================================================
--- trunk.orig/src/Kconfig	2010-05-14 10:24:35.000000000 -0400
+++ trunk/src/Kconfig	2010-05-14 10:25:00.000000000 -0400
@@ -80,6 +80,17 @@
 	  Enables the use of ccache for faster builds.
 	  Requires ccache in path.
 
+config HEAP_SIZE
+	hex "Heap size (in bytes)"
+	default 0x4000
+	help
+	  The primary coreboot heap user is the PCI
+	  bus walk. Therefore heap size may need to be
+	  increased on systems that have exceptionally
+	  large and/or deep PCI device trees.
+
+	  If unsure, use the default.
+
 endmenu
 
 source src/mainboard/Kconfig
@@ -124,10 +135,6 @@
 	bool
 	default n
 
-config HEAP_SIZE
-	hex
-	default 0x4000
-
 config DEBUG
 	bool
 	default n
Index: trunk/src/lib/malloc.c
===================================================================
--- trunk.orig/src/lib/malloc.c	2010-05-14 10:24:35.000000000 -0400
+++ trunk/src/lib/malloc.c	2010-05-14 10:25:00.000000000 -0400
@@ -14,7 +14,10 @@
 {
 	void *p;
 
-	MALLOCDBG("%s Enter, size %ld, free_mem_ptr %p\n", __func__, size, free_mem_ptr);
+	MALLOCDBG("%s Enter, size %ld, %d of %d bytes available.\n",
+		__func__, size,
+		(int)(free_mem_end_ptr - free_mem_ptr),
+		(int)(&_eheap - &_heap));
 
 	/* Checking arguments */
 	if (size < 0)
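For readers who want to see what the new debug message reports, here is a minimal, self-contained sketch of a bump allocator that prints "N of M bytes available" on entry. It is not the coreboot malloc: the names demo_heap, demo_malloc and demo_heap_available are hypothetical, a static array stands in for the linker-provided _heap/_eheap region, and the sketch returns NULL on exhaustion where the thread describes coreboot halting.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the heap region between _heap and _eheap. */
#define DEMO_HEAP_SIZE 0x4000
static unsigned char demo_heap[DEMO_HEAP_SIZE];
static unsigned char *demo_free_ptr = demo_heap;

/* Bytes not yet handed out (the patch computes free_mem_end_ptr - free_mem_ptr). */
static size_t demo_heap_available(void)
{
	return (size_t)(demo_heap + DEMO_HEAP_SIZE - demo_free_ptr);
}

/* Bump allocation: round the request up to 8 bytes and advance the pointer.
 * Assumption: this sketch returns NULL when the heap is exhausted. */
static void *demo_malloc(size_t size)
{
	size_t rounded;
	void *p;

	fprintf(stderr, "%s Enter, size %zu, %zu of %d bytes available.\n",
		__func__, size, demo_heap_available(), DEMO_HEAP_SIZE);

	if (size > (size_t)-1 - 7)
		return NULL;	/* rounding up would overflow */
	rounded = (size + 7) & ~(size_t)7;
	if (rounded > demo_heap_available())
		return NULL;

	p = demo_free_ptr;
	demo_free_ptr += rounded;
	return p;
}
```

Printing both the remaining and the total byte count makes it easy to read off from a boot log how close a given device tree comes to exhausting the heap.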
On Fri, May 14, 2010 at 1:11 PM, Joe Korty joe.korty@ccur.com wrote:
Promote heap sizing to first-class Kconfig citizenship.
Changing the heap size is something that those, like me, with large PCI device trees need to do. Therefore heap size should appear as a normal, user-answerable question within the Kconfig build system.
I think the difference here is that you're a developer (not a user) once you start touching the code. Users shouldn't have to worry about the heap size. It should be set larger in your mainboard Kconfig if the mainboard needs more heap space.
Thanks, Myles
On 5/14/10 9:19 PM, Myles Watson wrote:
On Fri, May 14, 2010 at 1:11 PM, Joe Korty joe.korty@ccur.com wrote:
Promote heap sizing to first-class Kconfig citizenship.
Changing the heap size is something that those, like me, with large PCI device trees need to do. Therefore heap size should appear as a normal, user-answerable question within the Kconfig build system.
I think the difference here is that you're a developer (not a user) once you start touching the code. Users shouldn't have to worry about the heap size. It should be set larger in your mainboard Kconfig if the mainboard needs more heap space.
I agree with Myles here. If the heap size is not good enough, coreboot is broken and needs to be fixed.
Stefan
On Fri, May 14, 2010 at 03:19:45PM -0400, Myles Watson wrote:
On Fri, May 14, 2010 at 1:11 PM, Joe Korty joe.korty@ccur.com wrote:
Promote heap sizing to first-class Kconfig citizenship.
Changing the heap size is something that those, like me, with large PCI device trees need to do. Therefore heap size should appear as a normal, user-answerable question within the Kconfig build system.
I think the difference here is that you're a developer (not a user) once you start touching the code. Users shouldn't have to worry about the heap size. It should be set larger in your mainboard Kconfig if the mainboard needs more heap space.
Some background: The reason I'm looking at coreboot is that standard BIOSes (apparently) run out of memory while doing the bus walk, when I plug a PCI-e expansion chassis into the motherboard and populate it. The BIOS will either lock up or the OS will boot but what the OS sees for a PCI Bus (via lspci -tv) is clearly corrupt.
So my job was/is to do an experiment to see if our problems are indeed due to out-of-memory issues in standard BIOSes, and if so, if coreboot could be a useful way around this issue.
And indeed, the first time I booted coreboot with a populated PCI-e chassis attached, I got an out-of-memory halt from coreboot. Increasing CONFIG_HEAP_SIZE to 0x10000 (ie, 4x) got the system to boot, and lspci -tv looks good also. I have yet to try intermediate values.
Unfortunately we have an even bigger PCI-e loaded expansion chassis (configuration #2), for which coreboot also hangs. It's not an out-of-memory hang; it happens (apparently) during the bus walk. I haven't looked into this hang in detail yet, so I don't have much to report. But I do fear it may be something more fundamental.
Regards, Joe
On Fri, May 14, 2010 at 1:49 PM, Joe Korty joe.korty@ccur.com wrote:
On Fri, May 14, 2010 at 03:19:45PM -0400, Myles Watson wrote:
On Fri, May 14, 2010 at 1:11 PM, Joe Korty joe.korty@ccur.com wrote:
Promote heap sizing to first-class Kconfig citizenship.
Changing the heap size is something that those, like me, with large PCI device trees need to do. Therefore heap size should appear as a normal, user-answerable question within the Kconfig build system.
I think the difference here is that you're a developer (not a user) once you start touching the code. Users shouldn't have to worry about the heap size. It should be set larger in your mainboard Kconfig if the mainboard needs more heap space.
Some background: The reason I'm looking at coreboot is that standard BIOSes (apparently) run out of memory while doing the bus walk, when I plug a PCI-e expansion chassis into the motherboard and populate it. The BIOS will either lock up or the OS will boot but what the OS sees for a PCI Bus (via lspci -tv) is clearly corrupt.
I wonder if that could be partly due to the ACPI implementation too.
So my job was/is to do an experiment to see if our problems are indeed due to out-of-memory issues in standard BIOSes, and if so, if coreboot could be a useful way around this issue.
And indeed, the first time I booted coreboot with a populated PCI-e chassis attached, I got an out-of-memory halt from coreboot. Increasing CONFIG_HEAP_SIZE to 0x10000 (ie, 4x) got the system to boot, and lspci -tv looks good also. I have yet to try intermediate values.
It seems like you have a pretty specific special case. Maybe we should create a CONFIG_EXTRA_HEAP that depends on CONFIG_EXPERT that lets you add heap.
Unfortunately we have an even bigger PCI-e loaded expansion chassis (configuration #2), for which coreboot also hangs. It's not an out-of-memory hang; it happens (apparently) during the bus walk. I haven't looked into this hang in detail yet, so I don't have much to report. But I do fear it may be something more fundamental.
Sounds like fun.
Thanks, Myles
On Fri, May 14, 2010 at 03:56:00PM -0400, Myles Watson wrote:
It seems like you have a pretty specific special case.
:) From my point of view, large systems are the standard case and normal desktops are the oddballs.....
Regards, Joe
Sounds like fun.
It's been educational and mind-stretching and worth it just for that.
Joe, we have visited this type of issue from time to time. The heap size, if it is related to a mainboard (and it is) belongs in the mainboard Kconfig and should not be user-visible. The reason is that if it is visible then that visibility implies that it can be safely changed, much as the baud rate can be safely changed. That is clearly wrong: many values of heap size will result in a locked up platform.
Thus, heap size can be set in mainboard kconfig, but should not be user visible.
As for your pci problem, I suspect it's not out of memory for the original bios but a bug in the bios or the hardware itself. We've had lots of chipset/pci card combinations over the years that confused the bioses, badly. It just happens.
thanks
ron
On Fri, May 14, 2010 at 05:38:04PM -0400, ron minnich wrote:
Joe, we have visited this type of issue from time to time. The heap size, if it is related to a mainboard (and it is) belongs in the mainboard Kconfig and should not be user-visible. The reason is that if it is visible then that visibility implies that it can be safely changed, much as the baud rate can be safely changed. That is clearly wrong: many values of heap size will result in a locked up platform.
Hi Ron, Thanks for the update. I haven't had any problems increasing heap size but that could just be my motherboard.
What failure modes become possible when the heap size is increased?
Joe
On Fri, May 14, 2010 at 7:18 PM, Joe Korty joe.korty@ccur.com wrote:
What failure modes become possible when the heap size is increased?
suppose someone for whatever reason sets it to a preposterous size. Not likely but we've seen that sort of thing happen.
it's not necessary to have it user visible, and that alone is a good reason not to put it there.
ron
On Fri, May 14, 2010 at 05:38:04PM -0400, ron minnich wrote:
Joe, we have visited this type of issue from time to time. The heap size, if it is related to a mainboard (and it is) belongs in the mainboard Kconfig and should not be user-visible. The reason is that if it is visible then that visibility implies that it can be safely changed, much as the baud rate can be safely changed. That is clearly wrong: many values of heap size will result in a locked up platform.
Hi Ron, Thanks for the update. I haven't had any problems increasing heap size but that could just be my motherboard.
It's easy to run out of RAM if you increase it too much, especially if the stack gets too large. For most boards, stack*processors + heap + code = 1M.
The bigger worry is that someone will decrease the RAM size, which is a boot-time failure. Build-time failures are easier to handle.
Thanks, Myles
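The budget Myles describes (stack*processors + heap + code within 1M) can be turned into a build-time check, which is the kind of failure he says is easier to handle. The sketch below is only an illustration: every size in it is a made-up example, the names are hypothetical, and reading the 1M figure as an upper bound is an assumption.

```c
#include <stdint.h>

#define RAM_BUDGET 0x100000u	/* the 1M figure quoted above */

/* All four values are made-up examples, not real board settings. */
#define DEMO_STACK_SIZE 0x8000u
#define DEMO_MAX_CPUS   4u
#define DEMO_HEAP_SIZE  0x10000u
#define DEMO_CODE_SIZE  0x40000u

/* RAM claimed by the per-CPU stacks, the heap, and the code image. */
static uint64_t ram_needed(uint32_t stack, uint32_t cpus,
			   uint32_t heap, uint32_t code)
{
	return (uint64_t)stack * cpus + heap + code;
}

/* Returns 1 if the configuration fits in the budget, 0 otherwise. */
static int ram_budget_fits(uint32_t stack, uint32_t cpus,
			   uint32_t heap, uint32_t code)
{
	return ram_needed(stack, cpus, heap, code) <= RAM_BUDGET;
}

/* Build-time version of the same check: an oversized heap breaks the
 * compile instead of locking up the board at boot. */
_Static_assert(DEMO_STACK_SIZE * DEMO_MAX_CPUS + DEMO_HEAP_SIZE +
	       DEMO_CODE_SIZE <= RAM_BUDGET,
	       "stacks + heap + code exceed the RAM budget");
```

With these example numbers the total is 0x70000, so there is headroom; raising the heap far enough would trip the static assertion at compile time.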
On Fri, May 14, 2010 at 1:49 PM, Joe Korty joe.korty@ccur.com wrote:
Some background: The reason I'm looking at coreboot is that standard BIOSes (apparently) run out of memory while doing the bus walk, when I plug a PCI-e expansion chassis into the motherboard and populate it. The BIOS will either lock up or the OS will boot but what the OS sees for a PCI Bus (via lspci -tv) is clearly corrupt.
So my job was/is to do an experiment to see if our problems are indeed due to out-of-memory issues in standard BIOSes, and if so, if coreboot could be a useful way around this issue.
And indeed, the first time I booted coreboot with a populated PCI-e chassis attached, I got an out-of-memory halt from coreboot. Increasing CONFIG_HEAP_SIZE to 0x10000 (ie, 4x) got the system to boot, and lspci -tv looks good also. I have yet to try intermediate values.
Could you try the latest? Devices now take ~ 1/4 the space that they used to take.
Unfortunately we have an even bigger PCI-e loaded expansion chassis (configuration #2), for which coreboot also hangs. It's not an out-of-memory hang; it happens (apparently) during the bus walk. I haven't looked into this hang in detail yet, so I don't have much to report. But I do fear it may be something more fundamental.
If you send the log to the list we might be able to help.
Thanks, Myles
On Fri, May 21, 2010 at 12:10:51PM -0400, Myles Watson wrote:
On Fri, May 14, 2010 at 1:49 PM, Joe Korty joe.korty@ccur.com wrote:
Unfortunately we have an even bigger PCI-e loaded expansion chassis (configuration #2), for which coreboot also hangs. It's not an out-of-memory hang; it happens (apparently) during the bus walk. I haven't looked into this hang in detail yet, so I don't have much to report. But I do fear it may be something more fundamental.
If you send the log to the list we might be able to help.
Hi Myles, I've solved this one, kind of. It is PCI IO Space overflow: we are going over 0xffff, which apparently is a hard limit. I imagine this is there so that inb, outw, etc. instructions can be used to reference these devices.
But if one doesn't use such instructions (instead using memory mapped PCI IO space), I see no reason why Linux and coreboot couldn't work with PCI IO Space addresses > 0xffff.
Regards, Joe
If you send the log to the list we might be able to help.
Hi Myles, I've solved this one, kind of. It is PCI IO Space overflow: we are going over 0xffff, which apparently is a hard limit. I imagine this is there so that inb, outw, etc. instructions can be used to reference these devices.
But if one doesn't use such instructions (instead using memory mapped PCI IO space), I see no reason why Linux and coreboot couldn't work with PCI IO Space addresses > 0xffff.
The resource allocator doesn't care. Just find the places where the I/O flag is checked and the limit is set to 0xffff and try setting it larger. I would look in src/devices/pci_device.c and src/northbridge/your_northbridge/northbridge.c first.
I'm not sure what will break, but we should be able to fix it pretty easily.
Thanks, Myles
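To make the 0xffff limit concrete, here is a small sketch of the check being discussed: I/O windows are packed one after another and the first one whose last port lies beyond 0xffff is reported. This is not the coreboot resource allocator; struct demo_resource, DEMO_RES_IO and both functions are hypothetical names invented for illustration.

```c
#include <stdint.h>

#define DEMO_RES_IO   0x1u	/* hypothetical flag: resource lives in port I/O space */
#define IO_PORT_LIMIT 0xffffu	/* highest port address discussed in the thread */

struct demo_resource {
	uint32_t flags;
	uint64_t base;
	uint64_t size;
};

/* Returns 1 if an I/O resource's last port lies beyond the limit. */
static int demo_io_overflows(const struct demo_resource *r)
{
	if (!(r->flags & DEMO_RES_IO) || r->size == 0)
		return 0;
	return r->base + r->size - 1 > IO_PORT_LIMIT;
}

/* Assigns bases back to back starting at 'start'. Returns the index of
 * the first resource that no longer fits, or n if all of them fit. */
static unsigned demo_assign_io(struct demo_resource *res, unsigned n,
			       uint64_t start)
{
	uint64_t next = start;
	unsigned i;

	for (i = 0; i < n; i++) {
		res[i].base = next;
		if (demo_io_overflows(&res[i]))
			return i;
		next += res[i].size;
	}
	return n;
}
```

Three 4 KiB windows starting at 0xd000 end exactly at 0xffff and fit; started at 0xe000, the third would begin at 0x10000 and is reported as the overflow. A heavily populated expansion chassis can hit the same wall once enough windows have been handed out.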
Hi Joe,
On 21.05.2010 22:07, Joe Korty wrote:
On Fri, May 21, 2010 at 12:10:51PM -0400, Myles Watson wrote:
On Fri, May 14, 2010 at 1:49 PM, Joe Korty joe.korty@ccur.com wrote:
Unfortunately we have an even bigger PCI-e loaded expansion chassis (configuration #2), for which coreboot also hangs. It's not an out-of-memory hang; it happens (apparently) during the bus walk. I haven't looked into this hang in detail yet, so I don't have much to report. But I do fear it may be something more fundamental.
If you send the log to the list we might be able to help.
I've solved this one, kind of. It is PCI IO Space overflow: we are going over 0xffff, which apparently is a hard limit. I imagine this is there so that inb, outw, etc. instructions can be used to reference these devices.
But if one doesn't use such instructions (instead using memory mapped PCI IO space), I see no reason why Linux and coreboot couldn't work with PCI IO Space addresses > 0xffff.
I'm interested in how you want to map port IO space to memory. Please explain.
AFAIK PCI register space is totally independent of port IO space which is totally independent of memory space. You can access PCI register space via CF8/CFC port IO and via MMCONFIG memory, but I'm unaware of any mechanisms to map IO ports to memory or the other way round.
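The two access paths to PCI register (configuration) space mentioned above differ only in how the bus/device/function/register tuple is encoded. The sketch below computes both encodings without touching hardware; the function names are hypothetical, and the MMCONFIG base address, which is platform specific, is deliberately left out.

```c
#include <stdint.h>

/* Value written to port 0xCF8 to select a configuration register;
 * the data is then read or written through port 0xCFC.
 * Bit 31 is the enable bit; the register offset is dword aligned. */
static uint32_t pci_cf8_address(uint8_t bus, uint8_t dev, uint8_t fn,
				uint8_t reg)
{
	return 0x80000000u |
	       ((uint32_t)bus << 16) |
	       ((uint32_t)(dev & 0x1f) << 11) |
	       ((uint32_t)(fn & 0x7) << 8) |
	       (reg & 0xfcu);
}

/* Byte offset of the same register from the MMCONFIG base: each
 * function gets a 4 KiB window, so registers up to 0xfff are reachable. */
static uint32_t pci_mmconfig_offset(uint8_t bus, uint8_t dev, uint8_t fn,
				    uint16_t reg)
{
	return ((uint32_t)bus << 20) |
	       ((uint32_t)(dev & 0x1f) << 15) |
	       ((uint32_t)(fn & 0x7) << 12) |
	       (reg & 0xfffu);
}
```

Both functions address configuration registers only. Neither one maps a device's port I/O BAR into memory, which is the distinction being drawn in this message.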
Thanks, Carl-Daniel
On Fri, May 21, 2010 at 05:04:26PM -0400, Carl-Daniel Hailfinger wrote:
On Fri, May 21, 2010 at 12:10:51PM -0400, Myles Watson wrote:
On Fri, May 14, 2010 at 1:49 PM, Joe Korty joe.korty@ccur.com wrote:
I've solved this one, kind of. It is PCI IO Space overflow: we are going over 0xffff, which apparently is a hard limit. I imagine this is there so that inb, outw, etc. instructions can be used to reference these devices.
But if one doesn't use such instructions (instead using memory mapped PCI IO space), I see no reason why Linux and coreboot couldn't work with PCI IO Space addresses > 0xffff.
I'm interested in how you want to map port IO space to memory. Please explain.
AFAIK PCI register space is totally independent of port IO space which is totally independent of memory space. You can access PCI register space via CF8/CFC port IO and via MMCONFIG memory, but I'm unaware of any mechanisms to map IO ports to memory or the other way round.
Well, all I know at this point is that the Linux kernel sources have code that maps inb etc either to the instructions or to a memory dereference, and the .config for that chooses memory dereference for x86.
It's gonna be fun seeing if high-IO-address space can be made to work..
Regards, Joe
I'm interested in how you want to map port IO space to memory. Please explain.
AFAIK PCI register space is totally independent of port IO space which is totally independent of memory space. You can access PCI register space via CF8/CFC port IO and via MMCONFIG memory, but I'm unaware of any mechanisms to map IO ports to memory or the other way round.
Well, all I know at this point is that the Linux kernel sources have code that maps inb etc either to the instructions or to a memory dereference, and the .config for that chooses memory dereference for x86.
It's gonna be fun seeing if high-IO-address space can be made to work..
We should be able to support any mapping that you can make work in Linux. It will be fun to see.
Thanks, Myles