On Sat, May 31, 2014 at 12:18:32PM -0400, Kevin O'Connor wrote:
Change the multi-processor init code to trampoline into 32bit mode on each of the additional processors. Implement an atomic lock so that each processor performs its initialization serially.
Signed-off-by: Kevin O'Connor <firstname.lastname@example.org>
Changed since v2:
- Use "lock btsl" instead of "lock cmpxchgl" as suggested by Paolo.
- Enable CPU caching on the APs.
- Report the apic_id in debug messages for each AP.
FYI, I did try with the BSP loop in assembler (see below), but it made the assembler slightly more complex. In theory, the comparison in assembler would result in less lock contention, but in practice it doesn't seem to matter. (With a small number of APs it's fast regardless, and with a large number of APs the contention comes from all the APs so the addition of the BSP contention is small.)
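For anyone who wants to poke at the locking logic outside the firmware, here is a rough userspace sketch of the scheme in C. It is not SeaBIOS code, and it makes several assumptions: pthreads stand in for the APs, atomic_fetch_or() stands in for "lock btsl", and the names smp_lock, count_cpus, lock_was_held(), ap_entry() and run_smp_demo() are all hypothetical. The BSP loop checks the CPU count before it tries the lock, which mirrors the "cmp / jne" variant discussed above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_APS 64

/* Hypothetical stand-ins for SMPLock and CountCPUs. */
static atomic_uint smp_lock;    /* bit 0 set = lock held */
static atomic_uint count_cpus;  /* CPUs that have finished init */
static int in_init;             /* plain int; only touched under the lock */

/* Analogue of "lock btsl $0, lock": atomically set bit 0 and report
 * whether it was already set (the carry flag in the assembler). */
static int lock_was_held(void)
{
    return atomic_fetch_or(&smp_lock, 1u) & 1u;
}

static void release_lock(void)
{
    atomic_store(&smp_lock, 0u);    /* analogue of "movl $0, lock" */
}

/* Each simulated AP spins for the lock, initializes, then releases it. */
static void *ap_entry(void *unused)
{
    (void)unused;
    while (lock_was_held())
        ;                           /* "rep ; nop" in the real loop */
    assert(in_init == 0);           /* nobody else is mid-initialization */
    in_init = 1;
    atomic_fetch_add(&count_cpus, 1u);
    in_init = 0;
    release_lock();
    return NULL;
}

/* Simulated BSP: owns the lock, starts the APs, releases the lock, then
 * waits until every CPU is counted before taking the lock back.
 * Returns the final CPU count; the lock is held again on return. */
static unsigned run_smp_demo(unsigned nr_aps)
{
    pthread_t ap[MAX_APS];

    if (nr_aps > MAX_APS)
        abort();
    in_init = 0;
    atomic_store(&smp_lock, 1u);    /* BSP owns the lock at first */
    atomic_store(&count_cpus, 1u);  /* the BSP itself */
    for (unsigned i = 0; i < nr_aps; i++)
        if (pthread_create(&ap[i], NULL, ap_entry, NULL) != 0)
            abort();

    release_lock();                 /* let the APs run one at a time */
    for (;;) {
        if (atomic_load(&count_cpus) != nr_aps + 1)
            continue;               /* "cmp %3, %2 ; jne 1b" */
        if (!lock_was_held())
            break;                  /* "lock btsl $0, %0 ; jc 1b" */
    }

    for (unsigned i = 0; i < nr_aps; i++)
        pthread_join(ap[i], NULL);
    return atomic_load(&count_cpus);
}
```

Calling run_smp_demo(4) simulates one BSP plus four APs and returns the final CPU count. Because the BSP only attempts the lock once the count matches, it stays out of the APs' way while they are still contending, which is the theoretical benefit mentioned above.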
diff --git a/src/fw/smp.c b/src/fw/smp.c
index 51c0cae..bc5ebaf 100644
--- a/src/fw/smp.c
+++ b/src/fw/smp.c
@@ -123,17 +123,19 @@ smp_setup(void)
     // Wait for other CPUs to process the SIPI.
     u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
-    while (cmos_smp_count != CountCPUs)
         asm volatile(
             // Release lock and allow other processors to use the stack.
             "  movl %%esp, %1\n"
             "  movl $0, %0\n"
             // Reacquire lock and take back ownership of stack.
             "1:rep ; nop\n"
+            "  cmp %3, %2\n"
+            "  jne 1b\n"
             "  lock btsl $0, %0\n"
             "  jc 1b\n"
             : "+m" (SMPLock), "+m" (SMPStack)
-            : : "cc", "memory");
+            : "r" (cmos_smp_count), "m" (CountCPUs)
+            : "cc", "memory");
     yield();
 
     // Restore memory.