On Sat, May 31, 2014 at 12:18:32PM -0400, Kevin O'Connor wrote:
Change the multi-processor init code to trampoline into 32bit mode on
each of the additional processors. Implement an atomic lock so that
each processor performs its initialization serially.
Signed-off-by: Kevin O'Connor <kevin(a)koconnor.net>
Changed since v2:
* Use "lock btsl" instead of "lock cmpxchgl" as suggested by
* Enable CPU caching on the APs
* Report the apic_id in debug messages for each AP
FYI, I did try with the BSP loop in assembler (see below), but it made
the assembler slightly more complex. In theory, the comparison in
assembler would result in less lock contention, but in practice it
doesn't seem to matter. (With a small number of APs it's fast
regardless, and with a large number of APs the contention comes from
all the APs so the addition of the BSP contention is small.)
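
For readers who find the assembler hard to follow, the BSP loop described above can be approximated in portable C11. This is only an illustrative sketch and not the SeaBIOS code: the demo_* names are hypothetical, atomic_fetch_or() stands in for "lock btsl $0" (its return value plays the role of the carry flag), and the "rep ; nop" pause hint is omitted.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: atomically set bit 0 of the lock word and report
 * whether it was already set.  This mirrors the carry flag left behind
 * by "lock btsl $0, lock". */
static bool demo_test_and_set(atomic_uint *lock)
{
    return (atomic_fetch_or(lock, 1u) & 1u) != 0;
}

/* Hypothetical helper: release the lock (the "movl $0, lock" step). */
static void demo_release(atomic_uint *lock)
{
    atomic_store(lock, 0u);
}

/* Hypothetical BSP-side wait: only contend for the lock once the CPU
 * count matches the expected value, then spin until the lock is taken.
 * The order (compare first, then test-and-set) follows the assembler
 * variant: "cmp ; jne 1b ; lock btsl ; jc 1b". */
static void demo_bsp_reacquire(atomic_uint *lock, atomic_uint *count,
                               unsigned expected)
{
    for (;;) {
        if (atomic_load(count) != expected)
            continue;           /* not all CPUs have checked in yet */
        if (!demo_test_and_set(lock))
            return;             /* bit was clear: lock is now ours */
    }
}
```

Checking the count before touching the lock means the BSP issues no locked writes while APs are still checking in, which is the contention saving mentioned above.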
diff --git a/src/fw/smp.c b/src/fw/smp.c
index 51c0cae..bc5ebaf 100644
@@ -123,17 +123,19 @@ smp_setup(void)
// Wait for other CPUs to process the SIPI.
u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
- while (cmos_smp_count != CountCPUs)
// Release lock and allow other processors to use the stack.
" movl %%esp, %1\n"
" movl $0, %0\n"
// Reacquire lock and take back ownership of stack.
"1:rep ; nop\n"
+ " cmp %3, %2\n"
+ " jne 1b\n"
" lock btsl $0, %0\n"
" jc 1b\n"
: "+m" (SMPLock), "+m" (SMPStack)
- : : "cc", "memory");
+ : "r" (cmos_smp_count), "m" (CountCPUs)
+ : "cc", "memory");
// Restore memory.