We use a PLX chip in our design, and I will say they are incredibly sensitive to reset. Are you sure your voltages are holding up during startup? The fact that it works with a limited number of GPUs points to voltage sag caused by the higher inrush current. A slow rise time on the reset signal can keep a lot of devices from coming out of reset properly. The same goes for the reset line de-asserting before all the voltages are stable.
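As a rough illustration of that last check, here is a minimal Python sketch that compares the time each rail reaches regulation against the time reset de-asserts. The rail names and timestamps are made-up placeholders, not measurements from any real board, and the 100 ms margin is an assumption based on the minimum power-stable-to-PERST# delay in the PCIe card electromechanical spec. Substitute scope captures from your own board.

```python
# Hypothetical scope measurements, in milliseconds from power-on.
# All names and values are placeholders for illustration only.
RAIL_STABLE_MS = {"12V": 18.0, "3V3": 25.0, "1V0_core": 40.0}

# Assumed minimum delay between "all rails stable" and reset de-assert.
MIN_MARGIN_MS = 100.0


def reset_margin_ms(rail_stable_ms, reset_deassert_ms):
    """Return how long reset stayed asserted after the last rail was stable."""
    last_stable = max(rail_stable_ms.values())
    return reset_deassert_ms - last_stable


def reset_is_safe(rail_stable_ms, reset_deassert_ms, min_margin_ms=MIN_MARGIN_MS):
    """True if reset de-asserts at least min_margin_ms after the last rail."""
    return reset_margin_ms(rail_stable_ms, reset_deassert_ms) >= min_margin_ms


if __name__ == "__main__":
    # Try two hypothetical reset de-assert times against the same rails.
    for deassert in (60.0, 150.0):
        margin = reset_margin_ms(RAIL_STABLE_MS, deassert)
        verdict = "ok" if reset_is_safe(RAIL_STABLE_MS, deassert) else "too early"
        print(f"reset at {deassert} ms: margin {margin} ms -> {verdict}")
```

A negative or small margin with all GPUs attached, but a healthy one with only a few, would be consistent with the rails sagging under the extra inrush load.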
Brett
From: coreboot [mailto:coreboot-bounces@coreboot.org] On Behalf Of Adam Talbot
Sent: Thursday, January 04, 2018 12:39 PM
To: coreboot@coreboot.org
Subject: [coreboot] 16 GPUs on one board
-Coreboot
I am totally off the deep end and don't know where else to turn for help/advice. I am trying to get 16 GPUs on one motherboard. Whenever I attach more than 3~5 GPUs to a single motherboard, it fails to POST. To make matters worse, my POST code reader(s) don't seem to give me any good error codes, or at least nothing I can go on.
I am using PLX PEX8614 chips (12-lane PCIe switches) to take 4 lanes and fan them out to 8 GPUs, 1 lane per GPU. Bandwidth is not an issue, as all my code runs natively on the GPUs. Depending on the motherboard, I can get up to 5 GPUs to POST. After many hours of debugging, googling, and troubleshooting, I am out of ideas.
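The lane budget for this setup can be sanity-checked with a few lines of Python. This is only a sketch: the 12-lane figure is the PEX8614 lane count mentioned above, and the function name is made up for illustration.

```python
def lane_budget(switch_lanes, upstream_lanes, gpus, lanes_per_gpu=1):
    """Return (lanes_used, lanes_free, oversubscription) for one switch."""
    downstream = gpus * lanes_per_gpu
    used = upstream_lanes + downstream
    if used > switch_lanes:
        raise ValueError(f"need {used} lanes, switch has {switch_lanes}")
    # Oversubscription: downstream lanes sharing each upstream lane.
    return used, switch_lanes - used, downstream / upstream_lanes


if __name__ == "__main__":
    used, free, ratio = lane_budget(switch_lanes=12, upstream_lanes=4, gpus=8)
    print(f"lanes used: {used}, free: {free}, downstream:upstream = {ratio}:1")
```

With 4 lanes up and 8 x1 links down, all 12 lanes of one switch are used, so 16 GPUs would need two such switches and 8 host lanes in total.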
At this point I have no clue. I think there is both a hardware and a BIOS component. Can you help me understand the POST process and where the hang-up is occurring? Do you think coreboot will get around this hang-up and, if so, can you recommend a motherboard for me to test with?
It's been a long time since I last compiled LinuxBIOS. ;-)
Thanks -Adam