[LinuxBIOS] Proposed change to hypertransport.c for better error handling

San Mehat san at google.com
Fri Oct 21 17:19:21 CEST 2005


Hey all,

It appears that on certain hardware, HT can come up with the LinkFail and
CrcError bits set on some devices, even though the bus isn't *currently* in
an error state. This causes 'hypertransport_scan_chain()' to stop traversing
the chain. The following patch clears the error bits and re-reads the
register to determine whether the error is transient. It also reports the
error instead of silently aborting the chain scan, which cost me about
6 hours of hunting to find:

*****BEGIN CUT*****
Index: hypertransport.c
===================================================================
--- hypertransport.c (revision 2064)
+++ hypertransport.c (working copy)
@@ -345,12 +345,25 @@
/* Wait until the link initialization is complete */
do {
ctrl = pci_read_config16(prev.dev, prev.pos + prev.ctrl_off);
- /* Is this the end of the hypertransport chain?
- * Has the link failed?
- * If so further scanning is pointless.
- */
- if (ctrl & ((1 << 6) | (1 << 4))) {
- goto end_of_chain;
+
+ if (ctrl & (1 << 6))
+ goto end_of_chain; // End of chain
+
+ if (ctrl & ((1 << 4) | (1 << 8))) {
+ /*
+ * Either the link has failed, or we have
+ * a CRC error.
+ * Sometimes this can happen due to link
+ * retrain, so let's knock it down and see
+ * if it's transient
+ */
+ ctrl |= ((1 << 4) | (1 << 8)); // Link fail + Crc
+ pci_write_config16(prev.dev, prev.pos + prev.ctrl_off, ctrl);
+ ctrl = pci_read_config16(prev.dev, prev.pos + prev.ctrl_off);
+ if (ctrl & ((1 << 4) | (1 << 8))) {
+ printk_alert("Detected error on Hypertransport Link\n");
+ goto end_of_chain;
+ }
}
} while((ctrl & (1 << 5)) == 0);

*****END CUT*****
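
For anyone who wants to try the clear-and-recheck idea outside the tree, here is a minimal standalone sketch. It is not the LinuxBIOS code: the `fake_link` struct, `fake_read()`/`fake_write()`, `check_link()` and the `HT_*` names are all made up for illustration. It assumes, as the patch does, that the LinkFail (bit 4) and CrcError (bit 8) bits are cleared by writing 1 to them, and it models a persistent fault as a bit the hardware re-asserts after the clear.

```c
#include <stdint.h>

/* Link Control bits referenced by the patch (names are illustrative). */
#define HT_LINK_FAIL  (1u << 4)  /* link failure, assumed write-1-to-clear */
#define HT_INIT_DONE  (1u << 5)  /* link initialization complete */
#define HT_END_CHAIN  (1u << 6)  /* end of chain */
#define HT_CRC_ERR    (1u << 8)  /* CRC error, assumed write-1-to-clear */
#define HT_ERR_MASK   (HT_LINK_FAIL | HT_CRC_ERR)

/* Hypothetical stand-in for one device's Link Control register. */
struct fake_link {
	uint16_t ctrl;   /* current register contents */
	uint16_t sticky; /* error bits the hardware re-asserts after a clear */
};

enum link_state { LINK_OK, LINK_END_OF_CHAIN, LINK_ERROR };

static uint16_t fake_read(const struct fake_link *l)
{
	return l->ctrl;
}

/* Model the write: error bits written as 1 are cleared, error bits written
 * as 0 are left alone, all other bits take the written value. */
static void fake_write(struct fake_link *l, uint16_t val)
{
	uint16_t kept_errs = (uint16_t)(l->ctrl & HT_ERR_MASK & ~val);

	l->ctrl = (uint16_t)((val & ~HT_ERR_MASK) | kept_errs | l->sticky);
}

/* Same decision order as the patch: end of chain first, then try to
 * clear any error bits and re-read to see whether they come back. */
static enum link_state check_link(struct fake_link *l)
{
	uint16_t ctrl = fake_read(l);

	if (ctrl & HT_END_CHAIN)
		return LINK_END_OF_CHAIN;

	if (ctrl & HT_ERR_MASK) {
		fake_write(l, (uint16_t)(ctrl | HT_ERR_MASK));
		ctrl = fake_read(l);
		if (ctrl & HT_ERR_MASK)
			return LINK_ERROR; /* error persisted after the clear */
	}
	return LINK_OK;
}
```

With this model, a link that starts with stale LinkFail and CrcError bits and no `sticky` bits reports `LINK_OK` after the clear, while a link whose `sticky` field contains `HT_LINK_FAIL` reports `LINK_ERROR`. Note that the value written back must set only the error bits being acknowledged; in this model a write that sets bit 6 would set the end-of-chain bit rather than clear anything.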