Right, I was hoping for some way to have this happen "automatically" so mistakes can't creep in down the road.
Let me summarize the summary: Linux evaluates _CRS for PNP0A08 devices (and presumably others) and then removes E820 reserved regions from anything that is advertised in _CRS. Linux has kept this workaround for years to deal with boot firmware whose _CRS includes both host bridge registers and windows into PCI memory space; this is the workaround that will be disabled for machines with a BIOS date of 2023 or newer.
On Mon, Aug 22, 2022 at 7:46 PM Lance Zhao lance.zhao@gmail.com wrote:
Will we need to define the "hidden" entries in the host bridge _CRS so that they match the E820 "reserved" entries? That may require some manual work; maybe we can change the code so that it happens automatically?
On Tue, Aug 23, 2022 at 04:27 Tim Wawrzynczak via coreboot coreboot@coreboot.org wrote:
Hello fellow coreboot folks,
I recently received this message from some associates in the Linux kernel community which describes a bug in coreboot firmware for many Intel boards. The gist of it relates to inconsistencies between the host bridge's _CRS method in ACPI and the E820 table (although technically generated by the payload, the E820 table is generally, AFAIK, converted straight from the LB_TAG_MEMORY entry in the coreboot table by most payloads).
I have some thoughts on how to address this, but I would like to hear the community's thoughts first.
Thanks,
-Tim

---------- Forwarded message ---------
From: Bjorn Helgaas helgaas@kernel.org
Date: Thu, Aug 11, 2022 at 12:30 PM
Subject: Issues with ACPI _CRS and E820 memory map
To: linux-pci@vger.kernel.org
Cc: Hans de Goede hdegoede@redhat.com, Myron Stowe mstowe@redhat.com, Kai-Heng Feng kai.heng.feng@canonical.com, Andy Shevchenko andriy.shevchenko@linux.intel.com, linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org
This is a heads-up about what I think is a firmware defect in the way some platforms build _CRS methods. We've had a Linux workaround for several years, but the workaround breaks some new machines, so the workaround will be disabled for 2023 and newer machines.
Machines that depend on the workaround include:
- Dell Precision T3500
- Lenovo ThinkPad X1 Gen 2
- Asus C523NA (Coral) Chromebook
- Likely any machine using coreboot firmware
The current versions of the machines above work fine, but 2023 versions with similar firmware are likely to break unless the firmware changes. Please forward this to any firmware folks who may be able to help with this issue.
Bjorn
SUMMARY
A Linux change will break future platforms that rely on the E820 memory map to exclude portions of the PCI host bridge windows reported by ACPI _CRS methods.
Linux discovers PCI host bridge MMIO windows by evaluating the _CRS method of the ACPI PNP0A03 device that describes the host bridge. It uses these windows to assign address space to PCI BARs.
In some cases these _CRS methods are incomplete or incorrect, and it's hard for an OS to work around this.
Below are examples of typical problems with _CRS methods.
PLATFORMS REPORT NON-WINDOW SPACE VIA _CRS
Sometimes _CRS includes host bridge register space or space assigned to hidden PCI devices that are not enumerable by the OS. When an OS assigns this space to PCI devices, it may cause conflicts or devices may not work. This appears to be a firmware defect.
Many platforms report this non-window space as "reserved" in the E820 memory map, and since 2010, Linux has worked around the _CRS defect by excluding these E820 "reserved" regions from the host bridge MMIO windows [4].
Example 1:
_CRS includes space that's not usable for PCI devices [1]:

  E820: [mem 0xdceff000-0xdfa0ffff] reserved
  PNP0A08 _CRS: [mem 0xdfa00000-0xfebfffff]

Note that [mem 0xdfa00000-0xdfa0ffff] is included in both the E820 entry and _CRS. If Linux assigns [mem 0xdfa00000-0xdfbfffff] to a PCI device, the system doesn't resume correctly from suspend. If Linux avoids the [mem 0xdfa00000-0xdfa0ffff] area and instead assigns [mem 0xdfb00000-0xdfcfffff], resume works correctly.
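To make the arithmetic in Example 1 concrete, here is a minimal Python sketch of excluding an E820 reserved range from a _CRS window. It only illustrates the idea and is not the kernel's code; the helper name `subtract` and the inclusive (start, end) tuples are assumptions made for this example.

```python
def subtract(window, reserved):
    """Return the parts of `window` that `reserved` does not cover.

    Both arguments are inclusive (start, end) address ranges.
    """
    w_start, w_end = window
    r_start, r_end = reserved
    if r_end < w_start or r_start > w_end:
        return [window]                         # no overlap, window unchanged
    pieces = []
    if r_start > w_start:
        pieces.append((w_start, r_start - 1))   # part below the reserved range
    if r_end < w_end:
        pieces.append((r_end + 1, w_end))       # part above the reserved range
    return pieces


# Ranges taken from Example 1.
e820_reserved = (0xdceff000, 0xdfa0ffff)
crs_window = (0xdfa00000, 0xfebfffff)

usable = subtract(crs_window, e820_reserved)
for start, end in usable:
    print(f"[mem {start:#010x}-{end:#010x}]")
# prints: [mem 0xdfa10000-0xfebfffff]
```

With the reserved range excluded, the window starts at 0xdfa10000, so nothing is placed in the [mem 0xdfa00000-0xdfa0ffff] overlap.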
Example 2:
_CRS includes space assigned to a "hidden" PCI device [2, 5]:

  PCI: 00:0d.0 10 base d0000000 limit d0ffffff mem (fixed)   # BIOS log
  E820: [mem 0xd0000000-0xd0ffffff]
  PNP0A08 _CRS: [mem 0x80000000-0xe0000000]

The 00:0d.0 device is assigned the [mem 0xd0000000-0xd0ffffff] space, but the device is hidden so the OS cannot enumerate it, so the OS doesn't know what space the device consumes.
PLATFORMS SUPPLY E820 ENTRIES COVERING ENTIRE _CRS WINDOWS
Some recent platforms supply E820 "reserved" regions that cover entire PCI host bridge windows. If Linux excludes these E820 regions from the windows, it cannot assign space to PCI BARs, which means hot-added devices don't work.
Example 3:
E820 has a "reserved" region that completely covers the 32-bit MMIO window from _CRS [3]:

  E820: [mem 0x4bc50000-0xcfffffff] reserved
  PNP0A08 _CRS: [mem 0x65400000-0xbfffffff]

Historically, Linux has avoided putting PCI devices in E820 reserved regions to avoid the problems in examples 1 and 2. Avoiding those regions in this case means Linux can't assign space for hot-added devices, so they don't work.
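The numbers in Example 3 can be checked with a short Python sketch. It is an illustration under simple assumptions (inclusive ranges, hypothetical helper names `size` and `overlap_size`), not anything taken from the kernel.

```python
def size(r):
    """Number of bytes in an inclusive (start, end) range."""
    return r[1] - r[0] + 1


def overlap_size(a, b):
    """Number of bytes shared by two inclusive ranges (0 if disjoint)."""
    lo = max(a[0], b[0])
    hi = min(a[1], b[1])
    return max(0, hi - lo + 1)


# Ranges taken from Example 3.
e820_reserved = (0x4bc50000, 0xcfffffff)
crs_window = (0x65400000, 0xbfffffff)

window_bytes = size(crs_window)
left_bytes = window_bytes - overlap_size(crs_window, e820_reserved)

print(f"window size: {window_bytes // (1 << 20)} MiB")
print(f"left after exclusion: {left_bytes} bytes")
# prints:
# window size: 1452 MiB
# left after exclusion: 0 bytes
```

Because the reserved region spans the whole window, excluding it leaves no address space at all for BARs.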
LINUX PLANS
As far as I know, the ACPI spec does not require an OS to exclude space from _CRS resources based on the E820 memory map, and these conflicting requirements make it impractical for Linux to do so.
The "avoid E820 regions" workaround worked for several years, but it no longer works because of platforms that advertise E820 regions that cover *entire* _CRS windows.
We plan to make Linux stop excluding E820 regions from _CRS resources for platforms with a BIOS date of 2023 or newer, so new platforms or new BIOS releases that rely on excluding E820 regions may break [6].
Linux is likely to be broken on future versions of these platforms unless the firmware updates _CRS methods.
If these platforms do not update _CRS methods to be complete and accurate, Linux may not boot. The user's options are to:
- Manually boot with a kernel command line option like "pci=use_e820".
- Wait for an updated kernel with a platform-specific workaround.
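The decision described in this section can be sketched roughly as follows. This is a hedged Python illustration, not the kernel's logic: the function names, the MM/DD/YYYY date format, and the choice to keep the old behavior when the date cannot be parsed are all assumptions made for the example. The `force_use_e820` flag stands in for booting with "pci=use_e820".

```python
def bios_year(bios_date):
    """Extract the year from a BIOS date string, assumed to be MM/DD/YYYY."""
    parts = bios_date.strip().split("/")
    if len(parts) == 3 and parts[2].isdigit():
        return int(parts[2])
    return None


def clip_with_e820(bios_date, force_use_e820=False, cutoff_year=2023):
    """Return True if E820 reserved regions should be excluded from _CRS."""
    if force_use_e820:
        return True          # user explicitly asked for the old behavior
    year = bios_year(bios_date)
    if year is None:
        return True          # assumption: unknown date keeps the workaround
    return year < cutoff_year


print(clip_with_e820("08/11/2022"))                       # True
print(clip_with_e820("01/15/2023"))                       # False
print(clip_with_e820("01/15/2023", force_use_e820=True))  # True
```

The point of the sketch is that a firmware update alone, by moving the BIOS date to 2023 or later, can switch a machine from the old behavior to the new one.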
WHY DOESN'T THIS AFFECT MICROSOFT WINDOWS?
Short answer: I suspect it *does*, but it's less likely to be a problem on Windows.
As far as I know, Windows does not exclude MMIO space from _CRS based on the E820 memory map. But Windows allocates PCI BARs from the top down, while Linux allocates from the bottom up. Most of the issues happen with space at the bottom of the _CRS MMIO windows, so Linux is more likely to trip over them than Windows is.
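The difference in allocation direction can be illustrated with a small Python sketch that reuses the ranges from Example 1. It is deliberately simplified: it places a single power-of-two sized, naturally aligned BAR in an otherwise empty window, and the 2 MiB size is an assumption chosen to match the ranges quoted in Example 1. Real allocators in either OS are more involved.

```python
def align_up(addr, align):
    return (addr + align - 1) & ~(align - 1)


def align_down(addr, align):
    return addr & ~(align - 1)


def alloc_bottom_up(window, size):
    """Lowest naturally aligned range of `size` bytes inside `window`."""
    start, end = window
    base = align_up(start, size)
    return (base, base + size - 1) if base + size - 1 <= end else None


def alloc_top_down(window, size):
    """Highest naturally aligned range of `size` bytes inside `window`."""
    start, end = window
    base = align_down(end + 1 - size, size)
    return (base, base + size - 1) if base >= start else None


def overlaps(a, b):
    return a[0] <= b[1] and b[0] <= a[1]


crs_window = (0xdfa00000, 0xfebfffff)   # _CRS window from Example 1
bad_range = (0xdfa00000, 0xdfa0ffff)    # part of _CRS that is also E820 reserved
bar_size = 2 * 1024 * 1024              # assumed 2 MiB BAR

bottom = alloc_bottom_up(crs_window, bar_size)
top = alloc_top_down(crs_window, bar_size)

print(f"bottom-up: [mem {bottom[0]:#010x}-{bottom[1]:#010x}] "
      f"conflict={overlaps(bottom, bad_range)}")
print(f"top-down:  [mem {top[0]:#010x}-{top[1]:#010x}] "
      f"conflict={overlaps(top, bad_range)}")
# prints:
# bottom-up: [mem 0xdfa00000-0xdfbfffff] conflict=True
# top-down:  [mem 0xfea00000-0xfebfffff] conflict=False
```

The bottom-up result is the same [mem 0xdfa00000-0xdfbfffff] assignment that Example 1 reports as breaking resume, while the top-down result lands far away from the problematic area.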
REFERENCES
[1] https://bugzilla.redhat.com/show_bug.cgi?id=2029207
[2] https://lore.kernel.org/linux-pci/4e9fca2f-0af1-3684-6c97-4c35befd5019@redha...
[3] https://bugzilla.kernel.org/show_bug.cgi?id=206459
[4] https://bugzilla.kernel.org/show_bug.cgi?id=16228
[5] https://review.coreboot.org/plugins/gitiles/coreboot/+/dbcf7b16219d%5E%21/
[6] https://git.kernel.org/linus/0ae084d5a674