While we're on the 'error handling' track, I also had an issue where 'ht_optimize_link()' kept indicating that a reset was required, causing an endless reset loop. The culprit was the code that performs an 8-bit read to get the *current* frequency of the link but doesn't mask off the error bits that share the register with the frequency bits (bits 0-3 are frequency, and bits 4-7 are error status). With an error bit set, the value read back never matches the computed frequency, so 'needs_reset' is set on every pass.
Once again, this was triggered by a transient error on the link (likely due to link retraining).
Here's the patch:
****START CUT****
Index: incoherent_ht.c
===================================================================
--- incoherent_ht.c (revision 2064)
+++ incoherent_ht.c (working copy)
@@ -198,12 +206,12 @@
 	freq = log2(freq_cap1 & freq_cap2);

 	/* See if I am changing the link freqency */
-	old_freq = pci_read_config8(dev1, pos1 + LINK_FREQ(offs1));
+	old_freq = (pci_read_config8(dev1, pos1 + LINK_FREQ(offs1)) & 0x0f); // Mask off error bits
 	needs_reset |= old_freq != freq;
-	old_freq = pci_read_config8(dev2, pos2 + LINK_FREQ(offs2));
+	old_freq = (pci_read_config8(dev2, pos2 + LINK_FREQ(offs2)) & 0x0f); // Mask off error bits
 	needs_reset |= old_freq != freq;

-	/* Set the Calulcated link frequency */
+	/* Set the Calculated link frequency */
 	pci_write_config8(dev1, pos1 + LINK_FREQ(offs1), freq);
 	pci_write_config8(dev2, pos2 + LINK_FREQ(offs2), freq);
***END CUT****
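
For anyone who wants to see the failure mode in isolation, here is a minimal standalone sketch of the comparison, not code from the tree. The helper names (link_freq_code, freq_change_needs_reset) and the LINK_FREQ_MASK macro are made up for illustration, and it assumes the register layout implied by the 0x0f mask in the patch: frequency code in the low nibble, error status in the high nibble.

```c
#include <stdint.h>

/* Assumed layout of the 8-bit link frequency/error register:
 * low nibble = frequency code, high nibble = error status bits. */
#define LINK_FREQ_MASK 0x0f

/* Hypothetical helper: keep only the frequency code from a raw read. */
static uint8_t link_freq_code(uint8_t reg)
{
	return reg & LINK_FREQ_MASK;
}

/* Hypothetical helper: mirrors the patched needs_reset comparison.
 * Returns 1 only when the programmed frequency really differs. */
static int freq_change_needs_reset(uint8_t reg, uint8_t freq)
{
	return link_freq_code(reg) != freq;
}
```

With a raw read of 0x15 (frequency code 0x05 plus one error bit set) and a target of 0x05, the unmasked comparison 0x15 != 0x05 is true and requests a reset every time, while freq_change_needs_reset(0x15, 0x05) returns 0. A genuine frequency change, such as freq_change_needs_reset(0x05, 0x06), still returns 1.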