On Sun, Aug 05, 2012 at 12:21:13PM +0100, Matthew Millman wrote:
Hi
I'm seeing a rather interesting problem with UHCI on Intel US15W and wondered if anyone else had seen anything like this before. I noticed it when I plugged in a USB keyboard, which caused a crash due to something corrupting the stack. It turns out that the stack has been trashed by the UHCI controller via DMA?!
When trying to transmit the 8-byte address setup packet, the hardware doesn't quite seem to be doing as it's told. SeaBIOS sets up the UHCI TDs exactly as per the spec, so no problems there.
Once the QH element is set, instead of transmitting the 8 bytes as described in the TD, it transmits a full 1023 bytes (according to the returned TD). UHCI then goes ahead and overwrites another 35 bytes beyond the end of the buffer pointed to by the TD.
Here's the 8 bytes of the setup packet (I've set everything after it to 0xFF):
1fbc1f95: 00 05 01 00 00 00 00 00 ff ff ff
1fbc1fa0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
1fbc1fb0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
1fbc1fc0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
1fbc1fd0: ff ff ff ff ff
Here it is after the UHCI controller has been at it. The only code to execute between these two dumps is this:
pipe->qh.element = (u32)&tds[0]; (in uhci_control())
1fbc1f95: 00 05 01 00 00 00 00 00 ff ff ff
1fbc1fa0: bf 00 05 01 00 00 00 00 00 ff ff ff fd 03 00 00
1fbc1fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
1fbc1fc0: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
1fbc1fd0: ff ff ff ff ff
TD Chain before:
1fbc4870: 84 48 bc 1f 00 00 80 1c 2d 00 e0 00 95 1f bc 1f
1fbc4880: 01 00 00 00 00 00 80 04 69 00 e8 ff 00 00 00 00
TD Chain after:
1fbc4870: 84 48 bc 1f ff 07 80 1c 2d 00 e0 00 95 1f bc 1f
1fbc4880: 01 00 00 00 00 00 80 04 69 00 e8 ff 00 00 00 00
My read of the spec says an actlen=0x07ff means a null transfer (not 1023 bytes). However, given that the status is still active, I don't think it really matters what's in the TD.
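For anyone following along with the TD dumps, here is a minimal sketch in C of how the control/status and token words might be decoded. It assumes the field layout as I read it in the UHCI 1.1 design guide (ActLen in bits 10:0 of the status word, Active in bit 23, MaxLen in bits 31:21 of the token, both lengths stored as n-1). The helper names are made up for illustration and are not SeaBIOS functions.

```c
#include <stdint.h>

/* Assumed field positions, per my reading of the UHCI 1.1 TD layout. */
#define TD_LEN_MASK     0x7ffu        /* 11-bit length fields */
#define TD_CTRL_ACTIVE  (1u << 23)    /* status word: TD still active */
#define TD_TOKEN_MAXLEN_SHIFT 21      /* token word: MaxLen in bits 31:21 */

/* Assemble a little-endian 32-bit word from four dump bytes. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Lengths are stored as n-1, so 0x7ff wraps around to 0 (null transfer). */
static unsigned td_decode_len(uint32_t field)
{
    return ((field & TD_LEN_MASK) + 1) & TD_LEN_MASK;
}

/* Actual length reported by the controller in the status word. */
static unsigned td_actlen(uint32_t ctrl)
{
    return td_decode_len(ctrl);
}

/* Maximum length requested by software in the token word. */
static unsigned td_maxlen(uint32_t token)
{
    return td_decode_len(token >> TD_TOKEN_MAXLEN_SHIFT);
}

/* Packet ID from the low byte of the token (0x2d = SETUP, 0x69 = IN). */
static unsigned td_pid(uint32_t token)
{
    return token & 0xffu;
}

/* Non-zero while the controller has not retired the TD. */
static int td_is_active(uint32_t ctrl)
{
    return (ctrl & TD_CTRL_ACTIVE) != 0;
}
```

Applied to the first TD in the "after" dump, the status bytes ff 07 80 1c give the word 0x1c8007ff, which decodes to an actual length of 0 with the Active bit still set, and the token bytes 2d 00 e0 00 give a SETUP PID with a maximum length of 8.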
I'm wondering if I'm not the first person to have seen this. The problem (without detailed debugging) manifests itself exactly as described in this message:
I haven't seen this type of report before. A couple of things you could try: dump the USB controller registers as well (the controller may have shut down for a different reason), check to see if any other transfer attempted to use 0x1fbc1fa0 in the past (perhaps the controller has something stale cached), look for errata for the chipset, look through the Linux code for the chipset to see if it is working around something, try aligning the setup packet buffer to 16 bytes.
-Kevin
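As a rough illustration of the last suggestion, the sketch below checks whether a buffer address is 16-byte aligned and shows one way to force alignment on a setup packet buffer. It assumes a C11 compiler for `_Alignas`. The helper names, the struct, and the `aligned_req` variable are hypothetical and are not taken from SeaBIOS.

```c
#include <stdint.h>

/* Hypothetical helpers; 'align' must be a power of two. */
static int is_aligned(uint32_t addr, uint32_t align)
{
    return (addr & (align - 1)) == 0;
}

/* Round an address up to the next multiple of 'align'. */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* Illustrative 8-byte USB setup packet layout. */
struct setup_packet {
    uint8_t  bRequestType;
    uint8_t  bRequest;
    uint16_t wValue;
    uint16_t wIndex;
    uint16_t wLength;
};

/* Ask the compiler to place the buffer on a 16-byte boundary (C11). */
static _Alignas(16) struct setup_packet aligned_req;
```

With the address from the dumps, `is_aligned(0x1fbc1f95, 16)` is false and `align_up(0x1fbc1f95, 16)` gives 0x1fbc1fa0. Whether alignment has anything to do with the corruption is untested; this only shows how the experiment could be set up.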