Youness Alaoui posted comments on this change.

View Change

Patch set 8:

Actually it was correct in the earlier patch versions. I just
forgot about
the rpci_* when restructuring it. Should be fixed now, haven't
retested yet.

Actually it wasn't correct in earlier patches. While I apparently commented on patchset 5, looking at my tree, I had actually tested patchset 1, so I already had the cleanup done in a shutdown handler.

Anyways, I tested the latest code and I still get the crash. I also moved the register for shutdown to the end of the function to make sure the handler gets registered *after* the rcpi_* handler gets registered, hoping it would mean the pci_access wouldn't get cleaned up until after the rcpi callback is called, but it didn't help.

Like I said before, the issue is that the accessor is set to -1 for some reason...

... and I just figured out why.. the machine I'm testing on is using pciutils 3.5.2, but the machine I'm building on is using pciutils 3.3.0. The structure changed (abi change but not api change?) and since the accessor is a 'used internally' its position in the structure changed.. when i print its value, it's actually printing the "numa_mode" field which is set to -1.

Anyways, the problem here is that all the libpci APIs take/return a pci_dev pointer, so whichever version of the library you use, the allocations and field accesses from within the library are safe, and the 'non-internal' fields are also safe to use. The issue here is that flashrom/pcidev.c does a copy of the structure as is, but not as a pointer, so the structure has the wrong size and when it's given to a libpci API function, it causes the segfault.

So basically, whatever you do, it will always crash because the pci_dev structure has the wrong size in pcidev.c. I'm going to assume that it would still crash regardless of the ich100 code. I think I didn't see it before because on BDW I was using flashrom from the apt repository instead of compiling it myself on a different computer/system.

This is the diff I had locally which fixed the crash for me : https://ghostbin.com/paste/6hd26 (this was against patchset 1)
Storing the pci_acc in dev->access is because trying to access it doesn't work (it's seen as -1) and without it, I get the error of can't find /sys/class/pci.../1f.5/...

I didn't want to show this patch before because it felt like an ugly hack, and that's why I said before "I don't know how it should be fixed", but I think it's correct, at least, partially, we need to store a pci_dev pointer instead of a structure in the rcpi callback data, if we use the pci_get_dev, then we lose the accessor, if we store the pointer as given to us, then we risk using a freed device when the callback is called.

My suggestion is, what if instead of creating a new pci_access and setting its method, what if in the case of ich100, we simply change the method of the global 'pacc' ? Will it affect anything? will pci devices get created or the accessor needed after we identify it as pch100 ? If not, then it would be safe to change the method of the global pci_access, and that would resolve the rcpi_ issue if we use pci_get_dev and simply use pacc.

To view, visit change 18925. To unsubscribe, visit settings.

Gerrit-Project: flashrom
Gerrit-Branch: staging
Gerrit-MessageType: comment
Gerrit-Change-Id: I000819aff25fbe9764f33df85f040093b82cd948
Gerrit-Change-Number: 18925
Gerrit-PatchSet: 8
Gerrit-Owner: Nico Huber <nico.h@gmx.de>
Gerrit-Reviewer: David Hendricks <david.hendricks@gmail.com>
Gerrit-Reviewer: Nico Huber <nico.h@gmx.de>
Gerrit-Reviewer: Paul Menzel <paulepanter@users.sourceforge.net>
Gerrit-Reviewer: Stefan Reinauer <stefan.reinauer@coreboot.org>
Gerrit-Reviewer: Stefan Tauner <stefan.tauner@gmx.at>
Gerrit-Reviewer: Youness Alaoui <snifikino@gmail.com>
Gerrit-Reviewer: build bot (Jenkins) <no-reply@coreboot.org>
Gerrit-Comment-Date: Thu, 13 Jul 2017 19:12:46 +0000
Gerrit-HasComments: No