Hi all,
I think I've noticed a general linux-BIOS problem: Some PCI devices are not being enabled for bus mastery! Let me more clearly explain the context:
1) According to my handy PCI book, all bus-master-capable PCI devices must implement bit 2 of the command register.
2) Most devices that implement this bit default to 1 at power-up. This is good, because LinuxBIOS has not been setting this bit to 1.
3) A user reported a problem with an 802.11b card, and upon investigation I discovered that the chip used on this card does not default to 1 for bus master enable. (See attached lspci output.)
After some testing, I realized that the same card on a non LinuxBIOS based machine works correctly - i.e. by the time the kernel is up I see that this bus master enable bit is 1. I know that the Award BIOS on this second machine must have set this bus master enable bit - because the kernel and drivers are identical. I verified this by dumping the appropriate config registers early on in kernel start-up.
My theory is that the Award BIOS is doing something that we are not: During the IO/Mem space assignment process it is setting the bus master bit for most devices. By "most devices" I mean any device that is not behind a bridge that inhibits bus mastery. My PCI Sys Arch book (by Shanley and Anderson) says that some crummy bridges may not support bus mastery (i.e. they have a bit 2 of the command register that always reads as zero). If a device is behind a bridge that does not support mastery, it must have its mastery bit cleared.
Does anyone have a copy of the $$$ PCI architecture spec so they can check this? My PCI System Architecture book is vague on the topic.
My proposed fix:
* Add pci_enable_bus_masters() that does the following:
    * Calls pci_enable_bus_master(0) to enable devices directly connected to the host bridge
* Add pci_enable_bus_master(devfn) that does the following:
    * Try to enable bus mastery for this device.
    * If the device is a bridge:
        * If the write of master enable fails, then write zero to all "bus master enable" bits for devices behind that bridge (including all subordinate buses)
        * If the write of master enable succeeds, then call pci_enable_bus_master for all devices on the secondary bus behind this bridge (i.e. not for all subordinate buses)
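For anyone who wants to experiment with the traversal described above, here is a rough user-space sketch in C. It is not the code that was added to LinuxBIOS: it walks a hypothetical in-memory tree (struct mock_dev) instead of taking a devfn and touching real config space, and every name in it is made up for illustration. A device whose bit 2 always reads as zero is modelled by master_writable = 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_COMMAND_MASTER 0x0004u  /* bit 2 of the command register */

/* Hypothetical stand-in for a PCI function; not a LinuxBIOS type. */
struct mock_dev {
    uint16_t command;           /* simulated command register */
    int master_writable;        /* 0: bit 2 is hard-wired to zero */
    struct mock_dev *children;  /* devices on the secondary bus, NULL if not a bridge */
    size_t nchildren;
};

/* Write bit 2, then read it back to see whether the device accepted it. */
int try_enable_master(struct mock_dev *dev)
{
    assert(dev != NULL);
    if (dev->master_writable)
        dev->command |= PCI_COMMAND_MASTER;
    return (dev->command & PCI_COMMAND_MASTER) != 0;
}

/* Clear bit 2 on everything behind a bridge, including all subordinate buses. */
void clear_masters_behind(struct mock_dev *bridge)
{
    for (size_t i = 0; i < bridge->nchildren; i++) {
        bridge->children[i].command &= (uint16_t)~PCI_COMMAND_MASTER;
        clear_masters_behind(&bridge->children[i]);
    }
}

/* Enable one device; for a bridge, either recurse one bus down or clear
 * everything below it, depending on whether the bridge took the write. */
void enable_bus_master(struct mock_dev *dev)
{
    int ok = try_enable_master(dev);

    if (dev->children == NULL)
        return;                     /* not a bridge: nothing below it */
    if (!ok) {
        clear_masters_behind(dev);  /* bridge cannot forward master cycles */
        return;
    }
    for (size_t i = 0; i < dev->nchildren; i++)
        enable_bus_master(&dev->children[i]);
}

/* Entry point: handle every device directly connected to the host bridge. */
void enable_bus_masters(struct mock_dev *bus0, size_t n)
{
    for (size_t i = 0; i < n; i++)
        enable_bus_master(&bus0[i]);
}
```

Recursing only onto the secondary bus is enough, because each bridge found there repeats the same decision for its own secondary bus.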
I've added these functions and called them after the standard LinuxBIOS PCI setup. The problem card now works as well as it did with the Award BIOS. What do y'all think? I'm particularly interested in quotes from the PCI architecture spec.
Kevin
---
On a linux BIOS machine:
00:14.0 Network controller: Harris Semiconductor: Unknown device 3873 (rev 01)
    Subsystem: D-Link System Inc: Unknown device 3501
    Flags: medium devsel, IRQ 5
    Memory at feb00000 (32-bit, prefetchable) [size=4K]
    Capabilities: [dc] Power Management version 2
00: 60 12 73 38 02 00 90 02 01 00 80 02 08 40 00 00
10: 08 00 b0 fe 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 86 11 01 35
30: 00 00 00 00 dc 00 00 00 00 00 00 00 05 01 00 00
40: 80 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 01 00 02 7e
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
On a non linux BIOS machine:
00:0a.0 Network controller: Harris Semiconductor: Unknown device 3873 (rev 01)
    Subsystem: D-Link System Inc: Unknown device 3501
    Flags: bus master, medium devsel, latency 32, IRQ 10
    Memory at ee000000 (32-bit, prefetchable) [size=4K]
    Capabilities: [dc] Power Management version 2
00: 60 12 73 38 07 00 90 02 01 00 80 02 08 20 00 00
10: 08 00 00 ee 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 86 11 01 35
30: 00 00 00 00 dc 00 00 00 00 00 00 00 0a 01 00 00
40: 80 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 01 00 02 7e
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
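The relevant difference between the two dumps is the 16-bit command register at offset 0x04: the bytes are 02 00 on the LinuxBIOS machine and 07 00 on the other one. A small C sketch of decoding that register from a raw dump follows. The macro names mirror the ones in the Linux kernel's pci_regs.h; the helper functions and the two arrays (the first eight bytes of each dump) are just illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define PCI_COMMAND         0x04     /* offset of the command register */
#define PCI_COMMAND_IO      0x0001u  /* bit 0: respond to I/O space */
#define PCI_COMMAND_MEMORY  0x0002u  /* bit 1: respond to memory space */
#define PCI_COMMAND_MASTER  0x0004u  /* bit 2: bus master enable */

/* First eight config bytes from each lspci dump. */
const uint8_t linuxbios_cfg[8] = {0x60, 0x12, 0x73, 0x38, 0x02, 0x00, 0x90, 0x02};
const uint8_t award_cfg[8]     = {0x60, 0x12, 0x73, 0x38, 0x07, 0x00, 0x90, 0x02};

/* Config space is little-endian: low byte at 0x04, high byte at 0x05. */
uint16_t pci_dump_command(const uint8_t *cfg)
{
    assert(cfg != NULL);
    return (uint16_t)(cfg[PCI_COMMAND] | (cfg[PCI_COMMAND + 1] << 8));
}

/* Nonzero if the dump shows bus mastering enabled. */
int pci_dump_is_master(const uint8_t *cfg)
{
    return (pci_dump_command(cfg) & PCI_COMMAND_MASTER) != 0;
}

/* Print the three low command bits for one dump. */
void pci_dump_print(const char *label, const uint8_t *cfg)
{
    uint16_t cmd = pci_dump_command(cfg);

    printf("%s: command=0x%04x io=%d mem=%d master=%d\n", label, cmd,
           (cmd & PCI_COMMAND_IO) != 0,
           (cmd & PCI_COMMAND_MEMORY) != 0,
           (cmd & PCI_COMMAND_MASTER) != 0);
}
```

Decoded this way, the LinuxBIOS dump has only memory decoding enabled (0x0002), while the other machine has I/O, memory and bus master all set (0x0007), which matches the "bus master" flag lspci prints for it.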
On Tue, 2003-01-14 at 09:32, Kevin Hester wrote:
Hi all,
I think I've noticed a general linux-BIOS problem: Some PCI devices are not being enabled for bus mastery! Let me more clearly explain the context:
1) According to my handy PCI book, all bus-master-capable PCI devices must implement bit 2 of the command register.
2) Most devices that implement this bit default to 1 at power-up. This is good, because LinuxBIOS has not been setting this bit to 1.
3) A user reported a problem with an 802.11b card, and upon investigation I discovered that the chip used on this card does not default to 1 for bus master enable. (See attached lspci output.)
According to PCI Spec Rev. 2.2:
    Bit 2: Controls a device's ability to act as a master on the PCI bus. A value of 0 disables the device from generating PCI accesses. A value of 1 allows the device to behave as a bus master. State after RST# is 0.
So, if bit 2 of a device has a power-up default value of 1, the device is implemented incorrectly.
After some testing, I realized that the same card on a non LinuxBIOS based machine works correctly - i.e. by the time the kernel is up I see that this bus master enable bit is 1. I know that the Award BIOS on this second machine must have set this bus master enable bit - because the kernel and drivers are identical. I verified this by dumping the appropriate config registers early on in kernel start-up.
It is a bug in the device driver. Take a look at other well-behaved drivers (like sis900.c). The driver has to call pci_set_master itself during its init phase.
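To make the driver-side fix concrete, here is a user-space sketch in C of what the driver is expected to do at init time. In a real Linux driver this would be a call to the kernel's pci_set_master() from the probe routine; the mock_ functions and struct mock_pci_dev below are hypothetical stand-ins so the logic can be compiled and run outside the kernel, and the memory-decoding check is only an assumption about what a probe might verify first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_COMMAND_MEMORY 0x0002u  /* bit 1: respond to memory space */
#define PCI_COMMAND_MASTER 0x0004u  /* bit 2: bus master enable */

/* Hypothetical stand-in for the kernel's struct pci_dev. */
struct mock_pci_dev {
    uint16_t command;  /* simulated command register */
};

/* Stand-in for pci_set_master(): read-modify-write of bit 2 only. */
void mock_pci_set_master(struct mock_pci_dev *dev)
{
    assert(dev != NULL);
    if (!(dev->command & PCI_COMMAND_MASTER))
        dev->command |= PCI_COMMAND_MASTER;
}

/* Hypothetical probe routine: the driver, not the firmware, turns on
 * bus mastering once it is ready to program the device for DMA. */
int mock_wlan_probe(struct mock_pci_dev *dev)
{
    assert(dev != NULL);
    if (!(dev->command & PCI_COMMAND_MEMORY))
        return -1;  /* memory decoding is off, the BAR is not usable */
    mock_pci_set_master(dev);
    return 0;
}
```

Starting from the command value left by LinuxBIOS (0x0002), the probe leaves the register at 0x0006, and calling it again changes nothing.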
My theory is that the Award BIOS is doing something that we are not: During the IO/Mem space assignment process it is setting the bus master bit for most devices. By "most devices" I mean any device that is not behind a bridge that inhibits bus mastery. My PCI Sys Arch book (by Shanley and Anderson) says that some crummy bridges may not support bus mastery (i.e. they have a bit 2 of the command register that always reads as zero). If a device is behind a bridge that does not support mastery, it must have its mastery bit cleared.
If the bridge cannot forward bus master cycles, there is virtually no way for the bus master devices behind it to work properly. Such devices generally do not offer both "pio" and "dma" modes.
Does anyone have a copy of the $$$ PCI architecture spec so they can check this? My PCI System Architecture book is vague on the topic.
The spec is vague on this too.
We don't turn on bus master as that could be very hazardous to your health -- imagine an uninitialized PCI device coming up with bus master enabled. It is at that point allowed to do DMA cycles to RAM without having been initialized by a driver. OUCH.
In my opinion if the driver is not turning on bus master it is a buggy driver. If the device comes up with bus master enabled it is a buggy device. Ollie has pointed this out too. There's a lot of buggy PCI hardware in existence.
I think the Award BIOS is buggy, possibly intentionally, to deal with buggy drivers (there are lots of BIOS patches that are in there, I am told, to fix buggy device drivers in Windows).
I would recommend fixing the 802.11 driver, rather than modifying LinuxBIOS. But let's see what other people say.
ron
"Ronald G. Minnich" rminnich@lanl.gov writes:
We don't turn on bus master as that could be very hazardous to your health -- imagine an uninitialized PCI device coming up with bus master enabled. It is at that point allowed to do DMA cycles to RAM without having been initialized by a driver. OUCH.
In my opinion if the driver is not turning on bus master it is a buggy driver. If the device comes up with bus master enabled it is a buggy device. Ollie has pointed this out too. There's a lot of buggy PCI hardware in existence.
I think the Award BIOS is buggy, possibly intentionally, to deal with buggy drivers (there are lots of BIOS patches that are in there, I am told, to fix buggy device drivers in Windows).
I am not totally certain that it is buggy. There are places this can be reasonable behavior. Experience with etherboot shows that normally boot devices at least have bus master set by the BIOS. Though not often enough to do something reasonable with it.
I would recommend fixing the 802.11 driver, rather than modifying LinuxBIOS. But let's see what other people say.
Agreed. Not all BIOSes enable bus mastering on all networking hardware. So changing the LinuxBIOS behavior will not necessarily help.
Eric
On 14 Jan 2003, Eric W. Biederman wrote:
I am not totally certain that it is buggy. There are places this can be reasonable behavior. Experience with etherboot shows that normally boot devices at least have bus master set by the BIOS.
I can see how they would do this, but I still find it worrisome. I would guess that if the BIOS knows about a particular device it sets bus master, but it is hard to say. I would still not want to take standard BIOS behavior as a rule, given how many things standard BIOSes seem to get wrong :-)
ron
Smart. I'll fix the driver rather than doing the Award workaround. Ollie, thanks for checking the spec.
On Tuesday 14 January 2003 06:54, Ronald G. Minnich wrote:
We don't turn on bus master as that could be very hazardous to your health -- imagine an uninitialized PCI device coming up with bus master enabled. It is at that point allowed to do DMA cycles to RAM without having been initialized by a driver. OUCH.
In my opinion if the driver is not turning on bus master it is a buggy driver. If the device comes up with bus master enabled it is a buggy device. Ollie has pointed this out too. There's a lot of buggy PCI hardware in existence.
I think the Award BIOS is buggy, possibly intentionally, to deal with buggy drivers (there are lots of BIOS patches that are in there, I am told, to fix buggy device drivers in Windows).
I would recommend fixing the 802.11 driver, rather than modifying LinuxBIOS. But let's see what other people say.
ron