Hello Paul,
A few observations:
[1] The Lenovo X60t is a 10.5-year-old laptop, and I am really surprised that the 4.x kernel series still supports it. I'd call it a miracle! http://www.tabletpcreview.com/tabletreview/lenovo-thinkpad-x60-tablet-pc-rev...
[2] It is obvious (at least to me) what is happening to your Lenovo X60t: your DRAM is exhausted, so no more page table entries can be added, because no physical memory is left in your system (this is the result of an unknown root cause, which I'll try to work out from the incomplete log you posted):
12.951: [ 350.898287] BUG: unable to handle kernel paging request at f8281008
12.951: [ 350.898346] IP: gen2_write32+0x62/0x130 [i915]
12.951: [ 350.898347] *pde = 34c88067
12.951: [ 350.898349] *pte = 00000000
[3] Here is a probable closer clue to the root cause of [2]:
12.951: [ 350.898429] CPU: 1 PID: 1113 Comm: kworker/u4:32 Tainted: G E 4.11.0-rc7 #23
12.951: [ 350.898431] Hardware name: LENOVO 636338U/636338U, BIOS CBET4000 4.5-1596-gccdb801 04/19/2017
12.951: [ 350.898436] Workqueue: events_unbound async_run_entry_fn
The 32 in kworker/u4:32 is the killer number: my best guess is that you have 32 k-processes making 32 system calls at once (maybe the same syscall)!?
[4] I suggest running ps -elf as root (and capturing its output), along with top and htop, and reading the following publicly available thread: https://askubuntu.com/questions/33640/kworker-what-is-it-and-why-is-it-hoggi... (especially: *Why does kworker hog your CPU? To find out why a kworker is wasting your CPU, you can create CPU backtraces: watch your processor load (with top or something) and in moments of high load through kworker, execute echo l > /proc/sysrq-trigger to create a backtrace. (On Ubuntu, this needs you to login with sudo -s.)*)
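To make the ps capture easier to read, here is a minimal Python sketch that counts kworker threads in `ps -elf` style text. The function name `summarize_kworkers` and the grouping key are my own choices for illustration, not part of any tool. It relies only on the thread naming seen in the log (kworker/u4:32), where a leading "u" marks a worker from an unbound pool and the number after the colon is that worker's ID.

```python
import re
from collections import Counter

# Matches kernel worker thread names such as "kworker/1:2", "kworker/0:1H"
# or "kworker/u4:32". A leading "u" marks a worker from an unbound pool;
# the number before the colon is a CPU (bound) or pool ID (unbound), and
# the number after the colon is the worker's ID within that pool.
KWORKER_RE = re.compile(r"kworker/(u?)(\d+):(\d+)(H?)")


def summarize_kworkers(ps_output):
    """Count kworker threads in `ps -elf` style text, grouped by pool.

    Returns a Counter keyed by strings like "u4" (unbound pool 4),
    "0" (CPU 0) or "0H" (CPU 0, high priority).
    """
    pools = Counter()
    for line in ps_output.splitlines():
        match = KWORKER_RE.search(line)
        if match is None:
            continue  # not a kworker line
        unbound, pool, _worker_id, high_prio = match.groups()
        pools[f"{unbound}{pool}{high_prio}"] += 1
    return pools
```

Feed it the text captured from ps -elf (for example, read back from the file you saved the trace to); the total of all counts shows how many kworker threads actually exist at that moment.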
Good luck!
Zoran
_______
On Thu, Apr 20, 2017 at 9:35 PM, Paul Menzel via coreboot <coreboot@coreboot.org> wrote:
Dear coreboot folks,
With Linux 4.11-rcX, the Lenovo X60t sometimes doesn’t resume correctly anymore. I still have to do some more tests, but I believe I am unable to reproduce this issue with Linux 4.10.8. But as I do not know how to reproduce it, other than by suspending and resuming repeatedly and seeing how it goes, I cannot know for sure. With Linux 4.11-rcX, I’d say the issue happens about once in every ten to 15 resumes.
12.951: [ 350.898287] BUG: unable to handle kernel paging request at f8281008
12.951: [ 350.898346] IP: gen2_write32+0x62/0x130 [i915]
12.951: [ 350.898347] *pde = 34c88067
12.951: [ 350.898349] *pte = 00000000
12.951: [ 350.898349]
12.951: [ 350.898352] Oops: 0002 [#1] SMP
12.951: [ 350.898353] Modules linked in: joydev(E) wacom_w8001(E) serport(E) cpufreq_powersave(E) cpufreq_conservative(E) cpufreq_userspace(E) iTCO_wdt(E) iTCO_vendor_support(E) acpi_cpufreq(E) coretemp(E) arc4(E) kvm_intel(E) lpc_ich(E) kvm(E) mfd_core(E) irqbypass(E) iwl3945(E) iwlegacy(E) i915(E) evdev(E) snd_hda_codec_analog(E) snd_hda_codec_generic(E) snd_pcsp(E) pcmcia(E) serio_raw(E) mac80211(E) rng_core(E) snd_hda_intel(E) yenta_socket(E) pcmcia_rsrc(E) drm_kms_helper(E) pcmcia_core(E) snd_hda_codec(E) snd_hda_core(E) snd_hwdep(E) thinkpad_acpi(E) snd_pcm(E) drm(E) cfg80211(E) battery(E) nvram(E) i2c_algo_bit(E) snd_timer(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E) snd(E) rfkill(E) sysimgblt(E) soundcore(E) video(E) ac(E) button(E) shpchp(E) tpm_tis(E) tpm_tis_core(E) tpm(E) fuse(E) parport_pc(E)
12.951: [ 350.898397] ppdev(E) lp(E) parport(E) autofs4(E) ext4(E) crc16(E) jbd2(E) fscrypto(E) mbcache(E) ecb(E) cbc(E) algif_skcipher(E) af_alg(E) dm_crypt(E) dm_mod(E) sg(E) sr_mod(E) sd_mod(E) cdrom(E) ata_generic(E) sdhci_pci(E) psmouse(E) sdhci(E) uhci_hcd(E) e1000e(E) ata_piix(E) ahci(E) ehci_pci(E) firewire_ohci(E) libahci(E) i2c_i801(E) libata(E) ehci_hcd(E) ptp(E) mmc_core(E) firewire_core(E) crc_itu_t(E) scsi_mod(E) usbcore(E) pps_core(E) thermal(E)
12.951: [ 350.898429] CPU: 1 PID: 1113 Comm: kworker/u4:32 Tainted: G E 4.11.0-rc7 #23
12.951: [ 350.898431] Hardware name: LENOVO 636338U/636338U, BIOS CBET4000 4.5-1596-gccdb801 04/19/2017
12.951: [ 350.898436] Workqueue: events_unbound async_run_entry_fn
12.951: [ 350.898438] task: f2588480 task.stack: f258c000
12.951: [ 350.898483] EIP: gen2_write32+0x62/0x130 [i915]
12.951: [ 350.898484] EFLAGS: 00210282 CPU: 1
12.951: [ 350.898486] EAX: f8281008 EBX: f8d80eb0 ECX: 00000001 EDX: f8180000
12.951: [ 350.898487] ESI: f658403c EDI: 00000001 EBP: f258de44 ESP: f258de14
12.951: [ 350.898488] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
12.951: [ 350.898490] CR0: 80050033 CR2: f8281008 CR3: 17589000 CR4: 000006d0
12.951: [ 350.898492] Call Trace:
12.951: [ 350.898537] ? i915_gem_restore_gtt_mappings+0x1ca/0x290 [i915]
12.951: [ 350.898581] ? gen9_decoupled_read64+0x270/0x270 [i915]
12.951: [ 350.898622] ? gen6_ggtt_invalidate+0x25/0x30 [i915]
12.951: [ 350.898665] ? i915_gem_resume+0x2d/0x80 [i915]
12.951: [ 350.898701] ? i915_drm_resume+0x38/0x180 [i915]
12.952: [ 350.898705] ? pci_pm_resume+0x4b/0xc0
12.952: [ 350.898709] ? dpm_run_callback+0x53/0x150
12.952: [ 350.898713] ? wait_for_completion+0x2a/0x140
12.952: [ 350.898716] ? pci_pm_thaw+0x90/0x90
12.952: [ 350.898718] ? device_resume+0x87/0x170
12.952: [ 350.898720] ? async_resume+0x1e/0x50
12.952: [ 350.898722] ? async_run_entry_fn+0x35/0x190
12.952: [ 350.898726] ? process_one_work+0x15f/0x3a0
12.952: [ 350.898729] ? worker_thread+0x39/0x470
12.952: [ 350.898732] ? kthread+0xdb/0x110
12.952: [ 350.898734] ? process_one_work+0x3a0/0x3a0
12.952: [ 350.898736] ? kthread_create_on_node+0x30/0x30
12.952: [ 350.898739] ? ret_from_fork+0x1c/0x28
12.952: [ 350.898741] Code: d4 a6 e1 f8 bb 02 00 00 00 89 4c 24 08 89 5c 24 04 c7 04 24 49 d1 e0 f8 e8 bc 92 c8 ff 8b 97 d4 03 00 00 8b 45 e8 8b 7d e4 01 d0 <89> 38 83 c4 24 5b 5e 5f 5d c3 8d 74 26 00 64 8b 15 00 31 57 d7
12.952: [ 350.898813] EIP: gen2_write32+0x62/0x130 [i915] SS:ESP: 0068:f258de14
12.952: [ 350.898814] CR2: 00000000f8281008
12.952: [ 350.898817] ---[ end trace 478b15034b0b3e6a ]---
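As a reading aid for the log above, the value after "Oops:" is the x86 page-fault error code, and its low three bits say what kind of access faulted. The following is a minimal Python sketch that pulls the fault address and error code out of such a log and decodes those bits. The function name `decode_oops` and the dictionary keys are hypothetical, chosen only for this illustration.

```python
import re

# Low three bits of the x86 page-fault error code printed after "Oops:".
PF_PROT = 0x1   # 0: page not present, 1: protection violation
PF_WRITE = 0x2  # 0: read access,      1: write access
PF_USER = 0x4   # 0: kernel mode,      1: user mode


def decode_oops(log_text):
    """Extract the fault address and decode the error code from an oops log."""
    addr = re.search(r"paging request at ([0-9a-f]+)", log_text)
    code = re.search(r"Oops: ([0-9a-f]{4})", log_text)
    if addr is None or code is None:
        raise ValueError("no page-fault oops found in log text")
    err = int(code.group(1), 16)
    return {
        "address": int(addr.group(1), 16),
        "page_present": bool(err & PF_PROT),
        "write": bool(err & PF_WRITE),
        "user_mode": bool(err & PF_USER),
    }
```

Read this way, "Oops: 0002" describes a kernel-mode write to an address with no present page mapping, which agrees with the "*pte = 00000000" line and with CR2 holding f8281008.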
Ticket #100739 in the Freedesktop.org Bugzilla [1] tracks this issue.
Chris Wilson replied already.
The stacktrace is mostly garbage. A register mmio goes wrong. One of the suggestions is that the ioremap of the PCI bar is invalid upon resume.
coreboot has the TPM patches applied, and is built with native graphics initialization.
Has anybody else experienced any issues on the Lenovo X60t?
In #coreboot@irc.freenode.net it was mentioned that there might be low-memory corruption on Intel 945 devices.
I’d welcome help also from “normal users” to reproduce this issue.
Kind regards,
Paul
[1] https://bugs.freedesktop.org/show_bug.cgi?id=100739
coreboot mailing list: coreboot@coreboot.org https://mail.coreboot.org/mailman/listinfo/coreboot