Ben Frisch (bfrisch@xes-inc.com) just uploaded a new patch set to gerrit, which you can find at https://review.coreboot.org/12575
-gerrit
commit 577c9756918b0ef0f346c8bf4c60d49ceb9e39f1 Author: Benjamin Frisch bfrisch@xes-inc.com Date: Mon Nov 30 12:23:31 2015 -0600
device: Resolve hang when a PCI device requests 4GB or more of MMIO space
When a PCI device's BAR requests 4GB or more MMIO space coreboot assigns the device address 0. This results in the device overlapping DRAM and the boot hangs with no feedback explaining the issue. Resolve this hang by first detecting the exhaustion that available MMIO space under 4GB. Next, selectively ignore PCI device resource requests based on the algorithm that UEFI Firmware based on tianocore.org code does to accomplish the same.
This also meets requirements of the PCI Firmware Specification 3.1 and Firmware Allocation of PCI Device Resources in Windows document which states that firmware shall not allocate resources to devices that cannot fit in available 32-bit MMIO space. Modern 64-bit Linux and 64-bit Windows since Windows Vista will rebalance, allocate, and assign resources above the 4GB boundry to 64-bit PCI device MMIO BARs resources as necessary as long 64-bit MMIO space is defined in the ACPI _CRS method and the pci=realloc kernel argument is provided in Linux.
The determination of which large 64-bit PCI MMIO resource requests to ignore is based on how the exhaustion of available MMIO resources is handled by the tianocore.org UEFI Firmware implemenation. If a PCI MMIO resource request cannot be satisified, MMIO PCI resource requests of devices that are not on bus 0 are skipped in the following order to attempt to make room to satisfy the remaining PCI device MMIO space requests while keeping boot criticial devices enabled: 1) Resource requests of the current PCI device with the largest 64-bit MMIO resource request (without OPROMs) 2) Resource requests of the current PCI device with the largest 32-bit MMIO resource request (without OPROMs) 3) Resource requests of the current PCI device with the largest MMIO resource request (with OPROMs)
TEST=Tested on internal coreboot port based on older upstream with device presenting 128 GB BAR with latest Linux kernel, additional testing help and feedback is appreciated. Builds locally. Change-Id: I0217d627ca77431075df5e24d49d6406c35893b7 Signed-off-by: Ben Frisch bfrisch@xes-inc.com --- src/device/Kconfig | 46 ++++++++++++++++ src/device/device.c | 124 ++++++++++++++++++++++++++++++++------------ src/device/device_util.c | 109 ++++++++++++++++++++++++++++++++++++++ src/device/pci_device.c | 64 ++++++++++++++++------- src/include/device/device.h | 4 ++ 5 files changed, 295 insertions(+), 52 deletions(-)
diff --git a/src/device/Kconfig b/src/device/Kconfig index 0113545..b9a734d 100644 --- a/src/device/Kconfig +++ b/src/device/Kconfig @@ -389,6 +389,52 @@ config SOFTWARE_I2C I2C controller is not (yet) available. The platform code needs to provide bindings to manually toggle I2C lines.
+choice SET_MAX_MEM_BAR_SIZE + prompt "Set Maximum memory BAR size to try to allocate " + optional + help + Select the maximum size PCI device memory BAR that coreboot will try + to allocate. If a BAR is found with a values above this size, coreboot + will disable the device and will not try to allocate any BARs. + + If unsure, leave this disabled, and coreboot will find and disable any + devices that are too large to be allocated by itself. The + disadvantage is a slight boot time increase if a device has to be + disabled. + +config MAX_BAR_SIZE_256MB + bool "256 MB" + help + Set the maximum size PCI device memory BAR that coreboot will try to + allocate to 256 MB. If a BAR is found with a values above this size, + coreboot will disable the device and will not try to allocate any + BARs. + +config MAX_BAR_SIZE_512MB + bool "512 MB" + help + Set the maximum size PCI device memory BAR that coreboot will try to + allocate to 512 MB. If a BAR is found with a values above this size, + coreboot will disable the device and will not try to allocate any + BARs. + +config MAX_BAR_SIZE_1GB + bool "1 GB" + help + Set the maximum size PCI device memory BAR that coreboot will try to + allocate to 1 GB. If a BAR is found with a values above this size, + coreboot will disable the device and will not try to allocate any + BARs. + +endchoice + +config MAX_MEM_BAR_SIZE_TO_ALLOCATE + hex + default 0x10000000 if MAX_BAR_SIZE_256MB + default 0x20000000 if MAX_BAR_SIZE_512MB + default 0x40000000 if MAX_BAR_SIZE_1GB + default 0 + endmenu
menu "Display" diff --git a/src/device/device.c b/src/device/device.c index e23c9de..dc39dfc 100644 --- a/src/device/device.c +++ b/src/device/device.c @@ -690,7 +690,7 @@ static void constrain_resources(struct device *dev, struct constraints* limits) } }
-static void avoid_fixed_resources(struct device *dev) +static u8 avoid_fixed_resources(struct device *dev) { struct constraints limits; struct resource *res; @@ -743,9 +743,60 @@ static void avoid_fixed_resources(struct device *dev) if (res->flags & IORESOURCE_MEM) res->base = resource_max(res);
+ if (res->base < lim->base) { + /* This usually happens at the domain level, usually + * caused by a device having a BAR which is too big. + * Error out so we can retry. + */ + printk(BIOS_WARNING, + "ERROR: Resource Base was lower than its limit!\n\t"); + printk(BIOS_WARNING, + "%s@%02lx: base value 0x%08llx, lower limit 0x%08llx\n", + dev_path(dev), res->index, res->base, lim->base); + return 1; + } + printk(BIOS_SPEW, "%s:@%s %02lx base %08llx limit %08llx\n", __func__, dev_path(dev), res->index, res->base, res->limit); } + + return 0; +} + +static u8 avoid_all_fixed_resources(struct device *root) +{ + struct device *child; + + /* For all domains. */ + for (child = root->link_list->children; child; child = child->sibling) { + if (child->path.type == DEVICE_PATH_DOMAIN) { + if (avoid_fixed_resources(child)) { + printk( + BIOS_SPEW, + "Error trying to avoid fixed resources."); + + /* Now find the device which is the largest + * and disable it. + */ + struct device *largest = find_pci_dev_max_mem(); + + printk(BIOS_SPEW, "\tDisabling Device %s\n", + dev_path(largest)); + if (largest) + largest->enabled = 0; + + /* Clear out all devices on all domains, ready + * for re-scanning. + */ + printk(BIOS_SPEW, + "\tClearing old resource computations.\n"); + clear_computed_resources(); + return 1; + } + } + } + + return 0; }
device_t vga_pri = 0; @@ -1022,45 +1073,54 @@ void dev_configure(void)
root = &dev_root;
- /* - * Each domain should create resources which contain the entire address - * space for IO, MEM, and PREFMEM resources in the domain. The - * allocation of device resources will be done from this address space. - */ + do { + /* + * Each domain should create resources which contain the entire + * address space for IO, MEM, and PREFMEM resources in the + * domain. The allocation of device resources will be done from + * this address space. + */
- /* Read the resources for the entire tree. */ + /* Read the resources for the entire tree. */
- printk(BIOS_INFO, "Reading resources...\n"); - read_resources(root->link_list); - printk(BIOS_INFO, "Done reading resources.\n"); + printk(BIOS_INFO, "Reading resources...\n"); + read_resources(root->link_list); + printk(BIOS_INFO, "Done reading resources.\n");
- print_resource_tree(root, BIOS_SPEW, "After reading."); + print_resource_tree(root, BIOS_SPEW, "After reading.");
- /* Compute resources for all domains. */ - for (child = root->link_list->children; child; child = child->sibling) { - if (!(child->path.type == DEVICE_PATH_DOMAIN)) - continue; - post_log_path(child); - for (res = child->resource_list; res; res = res->next) { - if (res->flags & IORESOURCE_FIXED) - continue; - if (res->flags & IORESOURCE_MEM) { - compute_resources(child->link_list, - res, IORESOURCE_TYPE_MASK, IORESOURCE_MEM); - continue; - } - if (res->flags & IORESOURCE_IO) { - compute_resources(child->link_list, - res, IORESOURCE_TYPE_MASK, IORESOURCE_IO); + /* Compute resources for all domains. */ + for (child = root->link_list->children; + child; + child = child->sibling) { + if (!(child->path.type == DEVICE_PATH_DOMAIN)) continue; + post_log_path(child); + for (res = child->resource_list; res; res = res->next) { + if (res->flags & IORESOURCE_FIXED) + continue; + if (res->flags & IORESOURCE_MEM) { + compute_resources( + child->link_list, + res, + IORESOURCE_TYPE_MASK, + IORESOURCE_MEM); + continue; + } + if (res->flags & IORESOURCE_IO) { + compute_resources(child->link_list, + res, + IORESOURCE_TYPE_MASK, + IORESOURCE_IO); + continue; + } } } - }
- /* For all domains. */ - for (child = root->link_list->children; child; child=child->sibling) - if (child->path.type == DEVICE_PATH_DOMAIN) - avoid_fixed_resources(child); + /* Attempt to avoid all fixed resources, if this process + * succeeds then we can carry on, otherwise we should re-try. + */ + } while (avoid_all_fixed_resources(root));
/* Store the computed resource allocations into device registers ... */ printk(BIOS_INFO, "Setting resources...\n"); diff --git a/src/device/device_util.c b/src/device/device_util.c index ac18538..5d0ba50 100644 --- a/src/device/device_util.c +++ b/src/device/device_util.c @@ -921,3 +921,112 @@ int dev_count_cpu(void)
return count; } + +enum allocation_passes { + PASS_1_CHECK_64_BIT_BARS = 0, + PASS_2_CHECK_32_BIT_BARS = 1, + PASS_3_CHECK_DEVS_WITH_OPROMS = 2, + MAX_PASS = PASS_3_CHECK_DEVS_WITH_OPROMS +}; + +/** @brief finds the enabled, non root-bus, pci device using the most memory. + * + * @return pointer to the device + */ +struct device *find_pci_dev_max_mem(void) +{ + struct device *dev; + struct device *largest_dev = NULL; + struct resource *res; + uint64_t memory_used = 0; + uint64_t most_memory_used = 0; + uint8_t pass; + + for (pass = PASS_1_CHECK_64_BIT_BARS; pass <= MAX_PASS; pass++) { + + /* + * Loop through all pci devices that are not on bus 0, + * skipping them if they're disabled, checking to see + * which uses the most memory. + */ + for (dev = &dev_root; dev; dev = dev->next) { + + /* Skip disabled, root bus, and non-pci devices */ + if ((!(dev->path.type == DEVICE_PATH_PCI)) || + (dev->bus->secondary == 0) || + (dev->enabled == 0)) + continue; + + /* Total the memory used by the device */ + memory_used = 0; + for (res = dev->resource_list; res; res = res->next) { + + /* don't check 32-bit bars on the first pass */ + if (pass == PASS_1_CHECK_64_BIT_BARS && + ((res->flags & IORESOURCE_PCI64) == 0)) + continue; + + /* skip devices with option roms until last */ + if ((pass < PASS_3_CHECK_DEVS_WITH_OPROMS) + && (res->index == PCI_ROM_ADDRESS)) { + memory_used = 0; + break; + } + + if (res->flags & IORESOURCE_MEM) + memory_used += res->size; + } + + /* Save the largest device found so far */ + if (memory_used > most_memory_used) { + most_memory_used = memory_used; + largest_dev = dev; + } + } + + /* break out of the pass loop when we find a device */ + if (largest_dev != NULL) + break; + } + + return largest_dev; +} + +/** @brief clear all resources for a specified device. Leave index and next. + * + */ +void clear_device_resources(struct device *dev) +{ + struct resource *res; + + for (res = dev->resource_list; res; res = res->next) { + res->align = 0; + res->base = 0; + res->flags = 0; + res->gran = 0; + res->limit = 0; + res->size = 0; + } +} + +/** @brief clear all non-pnp resources. + * + */ +void clear_computed_resources(void) +{ + struct device *dev; + + /* + * Loop through all non-pnp devices, clearing everything except index + * and the pointer to the next resource. This keeps them from having + * to be re-allocated. + */ + for (dev = &dev_root; dev; dev = dev->next) { + + //skip pnp devices - their resources are set early + if (dev->path.type == DEVICE_PATH_PNP) + continue; + + clear_device_resources(dev); + } +} diff --git a/src/device/pci_device.c b/src/device/pci_device.c index 5123229..dfc290b 100644 --- a/src/device/pci_device.c +++ b/src/device/pci_device.c @@ -248,27 +248,49 @@ struct resource *pci_get_resource(struct device *dev, unsigned long index) resource->limit = 0xffff; } else { /* A Memory mapped base address. */ - attr &= PCI_BASE_ADDRESS_MEM_ATTR_MASK; - resource->flags |= IORESOURCE_MEM; - if (attr & PCI_BASE_ADDRESS_MEM_PREFETCH) - resource->flags |= IORESOURCE_PREFETCH; - attr &= PCI_BASE_ADDRESS_MEM_LIMIT_MASK; - if (attr == PCI_BASE_ADDRESS_MEM_LIMIT_32) { - /* 32bit limit. */ - resource->limit = 0xffffffffUL; - } else if (attr == PCI_BASE_ADDRESS_MEM_LIMIT_1M) { - /* 1MB limit. */ - resource->limit = 0x000fffffUL; - } else if (attr == PCI_BASE_ADDRESS_MEM_LIMIT_64) { - /* 64bit limit. */ - resource->limit = 0xffffffffffffffffULL; - resource->flags |= IORESOURCE_PCI64; + if (CONFIG_MAX_MEM_BAR_SIZE_TO_ALLOCATE && + (resource->size > CONFIG_MAX_MEM_BAR_SIZE_TO_ALLOCATE)) { + /* Disable devices that attempt to allocate a BAR beyond + the max configured size. */ + printk(BIOS_WARNING, + "Device at %s requested a 0x%llx byte BAR ", + dev_path(dev), + (unsigned long long)resource->size); + printk(BIOS_WARNING, + "for register 0x%02lx.\n", + index); + printk(BIOS_WARNING, + "The device exceeded largest allowed size of 0x%lx. ", + (unsigned long) + CONFIG_MAX_MEM_BAR_SIZE_TO_ALLOCATE); + printk(BIOS_WARNING, "Device disabled.\n"); + dev->enabled = 0; + clear_device_resources(dev); } else { - /* Invalid value. */ - printk(BIOS_ERR, "Broken BAR with value %lx\n", attr); - printk(BIOS_ERR, " on dev %s at index %02lx\n", - dev_path(dev), index); - resource->flags = 0; + attr &= PCI_BASE_ADDRESS_MEM_ATTR_MASK; + resource->flags |= IORESOURCE_MEM; + if (attr & PCI_BASE_ADDRESS_MEM_PREFETCH) + resource->flags |= IORESOURCE_PREFETCH; + attr &= PCI_BASE_ADDRESS_MEM_LIMIT_MASK; + if (attr == PCI_BASE_ADDRESS_MEM_LIMIT_32) { + /* 32bit limit. */ + resource->limit = 0xffffffffUL; + } else if (attr == PCI_BASE_ADDRESS_MEM_LIMIT_1M) { + /* 1MB limit. */ + resource->limit = 0x000fffffUL; + } else if (attr == PCI_BASE_ADDRESS_MEM_LIMIT_64) { + /* 64bit limit. */ + resource->limit = 0xffffffffffffffffULL; + resource->flags |= IORESOURCE_PCI64; + } else { + /* Invalid value. */ + printk(BIOS_ERR, + "Broken BAR with value %lx\n", + attr); + printk(BIOS_ERR, " on dev %s at index %02lx\n", + dev_path(dev), index); + resource->flags = 0; + } } }
@@ -344,6 +366,8 @@ static void pci_read_bases(struct device *dev, unsigned int howmany) (index < PCI_BASE_ADDRESS_0 + (howmany << 2));) { struct resource *resource; resource = pci_get_resource(dev, index); + if (dev->enabled == 0) + return; index += (resource->flags & IORESOURCE_PCI64) ? 8 : 4; }
diff --git a/src/include/device/device.h b/src/include/device/device.h index 62460ae..c857dd2 100644 --- a/src/include/device/device.h +++ b/src/include/device/device.h @@ -234,6 +234,10 @@ void fixed_mem_resource(device_t dev, unsigned long index, void scan_smbus(device_t bus); void scan_lpc_bus(device_t bus);
+struct device *find_pci_dev_max_mem(void); +void clear_computed_resources(void); +void clear_device_resources(struct device *dev); + /* It is the caller's responsibility to adjust regions such that ram_resource() * and mmio_resource() do not overlap. */