sorry for this rather long post. it's an update on the generic cpu detection and initialization source. this update makes it a lot easier to understand what i'm trying to achieve (hello Stefan ;)). there are still some essential things missing, but that's merely due to unavailable information. if there are questions left, don't hesitate to ask!
microcode loading could be nicely inserted into the switch statement in configure_cpu() in config_intel.c. this is where model-specific things should go, too.
next thing i'm going to write is the L2 cache initialization. to understand the idea of my L2 cache initialization routine take a look at
Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide, Part 2
page B.155ff for reference. i'll be calling the registers by the names given in the reference. according to my understanding the following steps have to be taken for a generic P6 L2 cache algorithm:
1. read IA32_PLATFORM_ID to obtain L2 Cache Latency and Clock Frequency Ratio (we can save Platform Id, too. we'll need it for the microcode update!)
2. read BBL_CR_CTL3 and check L2 Hardware Disable. if set the L2 cache is hardware disabled. nothing we can do about that, so just exit the routine.
3. set L2 Physical Address Range, L2 Enabled, L2 Cache Latency; clear L2 Configured and write BBL_CR_CTL3 back
4. fill the L2 cache with 0 and mark the whole cache invalid by using BBL_CR_ADDR, BBL_CR_D[0:3], BBL_CR_CTL and the table pointer(s) from query_cache_size()
5. read BBL_CR_CTL3, set L2 Configured and write BBL_CR_CTL3 back
6. done :)
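The BBL_CR_CTL3 handling in steps 2, 3 and 5 can be sketched as plain bit manipulation on a copy of the register value, so it can be tried without touching hardware. This is only a sketch: the bit positions are assumptions taken from my reading of the P6 MSR table and should be checked against the manual, the helper names ctl3_prepare() and ctl3_finish() are made up for illustration, and step 4 (zero-filling the cache) is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed BBL_CR_CTL3 layout (verify against the P6 MSR table). */
#define CTL3_L2_CONFIGURED        (1u << 0)
#define CTL3_L2_LATENCY_SHIFT     1
#define CTL3_L2_LATENCY_MASK      (0xFu << CTL3_L2_LATENCY_SHIFT)
#define CTL3_L2_ENABLED           (1u << 8)
#define CTL3_L2_ADDR_RANGE_SHIFT  20
#define CTL3_L2_ADDR_RANGE_MASK   (0x7u << CTL3_L2_ADDR_RANGE_SHIFT)
#define CTL3_L2_HW_DISABLE        (1u << 23)

/*
 * Steps 2 and 3: bail out if the cache is hardware disabled, otherwise
 * program latency and address range, set L2 Enabled and clear
 * L2 Configured. Returns 0 on success, -1 if nothing can be done.
 * The caller would write *ctl3 back with wrmsr afterwards.
 */
static int ctl3_prepare(uint32_t *ctl3, unsigned latency, unsigned addr_range)
{
    assert(ctl3 != 0);

    if (*ctl3 & CTL3_L2_HW_DISABLE)
        return -1;

    *ctl3 &= ~(CTL3_L2_CONFIGURED | CTL3_L2_LATENCY_MASK |
               CTL3_L2_ADDR_RANGE_MASK);
    *ctl3 |= (latency << CTL3_L2_LATENCY_SHIFT) & CTL3_L2_LATENCY_MASK;
    *ctl3 |= (addr_range << CTL3_L2_ADDR_RANGE_SHIFT) & CTL3_L2_ADDR_RANGE_MASK;
    *ctl3 |= CTL3_L2_ENABLED;
    return 0;
}

/* Step 5: mark the cache as configured once it has been invalidated. */
static void ctl3_finish(uint32_t *ctl3)
{
    assert(ctl3 != 0);
    *ctl3 |= CTL3_L2_CONFIGURED;
}
```

In the real routine the value would come from rdmsr(BBL_CR_CTL3), the latency from IA32_PLATFORM_ID (step 1), and each helper would be followed by a wrmsr.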
unanswered questions:
- what happens if there's more than one L2 cache type found? shouldn't happen on the P6 architecture, but what about Core/Core2/Nehalem?
- the L2 cache pointer array in cpu.h has a fixed size of 8. on the P6 architecture we would only need one pointer because they only have one L2 cache array per core. not sure what we need for Core/Core2/Nehalem. the routines writing to the array (in query_cache_size()) don't check for any boundaries. this is potentially dangerous. is there a need to care about this?
- is there a default value for L2 Physical Address Range? 4GB?
- what about L2 ECC? i've no idea how it's intended to work.
- some intel cpus support verbose processor identification strings (like Intel(R) Pentium(R) 4 CPU 1500MHz), others use a brand index table from which a verbose identification string can be built. do we need this? we could also display family, model, stepping and some additional information (cache size, cache speed and cpu core speed). should we display anything at all?
- is there a need to know the cpu core multiplier and the system bus frequency? they are in EBL_CR_POWERON.
- what about L1 cache? do we need to know its size to display it? what about L3 cache? i've currently no information on configuring L3 cache.
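For the question about displaying family, model and stepping, here is a small sketch of how the values could be derived from the CPUID leaf 1 signature in EAX. It follows the usual Intel rule as i understand it: the extended family is added only when the base family is 0x0F, and the extended model is combined with the base model when the base family is 0x06 or 0x0F. The struct and function names are hypothetical and not part of the posted source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the decoded signature. */
struct cpu_signature {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* Decode the EAX value returned by CPUID leaf 1. */
static struct cpu_signature decode_signature(uint32_t eax)
{
    struct cpu_signature s;
    unsigned base_family = (eax >> 8) & 0x0F;
    unsigned base_model  = (eax >> 4) & 0x0F;
    unsigned ext_model   = (eax >> 16) & 0x0F;
    unsigned ext_family  = (eax >> 20) & 0xFF;

    s.stepping = eax & 0x0F;

    /* extended family only counts for base family 0x0F */
    s.family = base_family;
    if (base_family == 0x0F)
        s.family += ext_family;

    /* extended model forms the high nibble for families 0x06 and 0x0F */
    s.model = base_model;
    if (base_family == 0x06 || base_family == 0x0F)
        s.model |= ext_model << 4;

    return s;
}

/* Format the signature for display; returns what snprintf returns. */
static int format_signature(char *buf, size_t len, uint32_t eax)
{
    struct cpu_signature s = decode_signature(eax);

    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "family %02Xh, model %02Xh, stepping %u",
                    s.family, s.model, s.stepping);
}
```

Note that the model nibbles are combined with OR (or addition), which is also what the model switch for the Core-based cpus in configure_cpu() needs.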
what i currently need: values of the msrs IA32_PLATFORM_ID, EBL_CR_POWERON and BBL_CR_CTL3 when booted with coreboot (i.e. untouched). does inteltool support dumping msrs? Uwe, can you report back with those values?
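If no tool is at hand, the three values can also be read on a running Linux system through the msr character device. The following is a rough sketch, assuming the msr kernel module is loaded and the program runs as root; read_msr() and dump_p6_msrs() are made-up helper names. Reading an MSR the cpu does not implement simply fails, so this is only meaningful on a P6 family cpu.

```c
#define _XOPEN_SOURCE 700 /* for pread() */

#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* MSR numbers as defined in config_intel.h */
#define IA32_PLATFORM_ID 0x00000017
#define EBL_CR_POWERON   0x0000002A
#define BBL_CR_CTL3      0x0000011E

/*
 * Read one MSR through /dev/cpu/<cpu>/msr. The file offset selects
 * the MSR number and every read returns 8 bytes.
 * Returns 0 on success, -1 on any failure.
 */
static int read_msr(unsigned cpu, uint32_t index, uint64_t *value)
{
    char path[64];
    ssize_t n;
    int fd;

    assert(value != NULL);

    snprintf(path, sizeof(path), "/dev/cpu/%u/msr", cpu);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    n = pread(fd, value, sizeof(*value), (off_t)index);
    close(fd);

    return (n == (ssize_t)sizeof(*value)) ? 0 : -1;
}

/* Print the three registers asked for above, for one cpu. */
static void dump_p6_msrs(unsigned cpu)
{
    static const struct { const char *name; uint32_t index; } msrs[] = {
        { "IA32_PLATFORM_ID", IA32_PLATFORM_ID },
        { "EBL_CR_POWERON",   EBL_CR_POWERON },
        { "BBL_CR_CTL3",      BBL_CR_CTL3 },
    };
    uint64_t value;
    unsigned i;

    for (i = 0; i < sizeof(msrs) / sizeof(msrs[0]); i++) {
        if (read_msr(cpu, msrs[i].index, &value) == 0)
            printf("%-16s (0x%03X) = 0x%016llX\n", msrs[i].name,
                   (unsigned)msrs[i].index, (unsigned long long)value);
        else
            printf("%-16s (0x%03X) = <unreadable>\n", msrs[i].name,
                   (unsigned)msrs[i].index);
    }
}
```

Calling dump_p6_msrs(0) from a small main() prints the values for the first cpu.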
anyone with L2/L3 cache register information on anything newer than the intel P6 architecture? i guess there's a document on L2/L3 cache detection and init available from intel. at least if you are AWARD/Phoenix or AMI.
/*
 * (C) 2009 Holger Hesselbarth
 */

#include <config_intel.h>
#include <identify_cpu.h>
#include <msr.h>
#include <cpu.h>

/* descriptor, size (KB), associativity, line size (bytes), lines per sector */
static struct _cache_entry L2_cache_table[] = {
    { 0x21,  256,  8, 64, 1},
    { 0x41,  128,  4, 32, 1},
    { 0x42,  256,  4, 32, 1},
    { 0x43,  512,  4, 32, 1},
    { 0x44, 1024,  4, 32, 1},
    { 0x45, 2048,  4, 32, 1},
    { 0x48, 3072, 12, 64, 1},
    { 0x49, 4096, 16, 64, 1}, /* L3 cache for Intel Xeon MP (family 0Fh, model 06h) */
    { 0x4E, 6144, 24, 64, 1},
    { 0x78, 1024,  4, 64, 1},
    { 0x79,  128,  8, 64, 2},
    { 0x7A,  256,  8, 64, 2},
    { 0x7B,  512,  8, 64, 2},
    { 0x7C, 1024,  8, 64, 2},
    { 0x7D, 2048,  8, 64, 1},
    { 0x7F,  512,  2, 64, 1},
    { 0x80,  512,  8, 64, 1},
    { 0x82,  256,  8, 32, 1},
    { 0x83,  512,  8, 64, 1},
    { 0x84, 1024,  8, 64, 1},
    { 0x85, 2048,  8, 64, 1},
    { 0x86,  512,  4, 64, 1},
    { 0x87, 1024,  8, 64, 1},
    { 0, 0, 0, 0, 0 }
};
static void configure_cpu(struct cpu_data *cpu)
{
    switch(cpu->family) {
    /**
     * Pentium
     * Pentium MMX
     * Pentium Mobile
     * Pentium MMX Mobile (Tillamook)
     */
    case 0x05 :
        break;

    case 0x06 :
        /* get cache types and sizes */
        query_cache_size(cpu);

        /* everything before Core technology
         * NOTE: get_basic_cpu_information() reads efamily only for
         * family 0x0F, so for family 0x06 it keeps its initial value
         * and this test needs another criterion before the else
         * branch can be reached */
        if(cpu->efamily == 0) {
            /* do model specific initialization */
            switch(cpu->model) {
            /**
             * Pentium Pro
             */
            case 0x00 :
            case 0x01 :
                /* omitted break intended */

            /**
             * Pentium II (Klamath)
             */
            case 0x03 :
                /* omitted break intended */

            /**
             * Pentium II (Deschutes)
             * Pentium II Xeon (Drake)
             * Mobile Pentium II (Dixon)
             * Celeron (Covington)
             */
            case 0x05 :
                /* omitted break intended */

            /**
             * Mobile Pentium II
             * Celeron (Mendocino)
             * Mobile Celeron (Mendocino)
             */
            case 0x06 :
                /* omitted break intended */

            /**
             * Pentium III (Katmai)
             * Pentium III Xeon (Tanner)
             */
            case 0x07 :
                /* omitted break intended */

            /**
             * Pentium III (Coppermine)
             * Pentium III-M (Coppermine)
             * Pentium III Xeon (Coppermine)
             * Mobile Pentium III
             * Celeron (Coppermine)
             */
            case 0x08 :
                /* initialize L2 cache (L2[0] is only set if a descriptor matched) */
                if(cpu->L2[0] && cpu->L2[0]->descriptor != 0) {
                    initialize_P6_L2_cache(cpu);
                }
                break;

            /**
             * Pentium M (Banias)
             * Celeron M (Banias)
             */
            case 0x09 :
                /* initialize_M_L2_cache */
                break;

            /**
             * Pentium III Xeon (Cascades)
             */
            case 0x0A :
                /* omitted break intended */

            /**
             * Pentium III (Tualatin)
             * Pentium III-S (Tualatin)
             * Mobile Pentium III (Tualatin)
             * Celeron (Tualatin)
             */
            case 0x0B :
                /* initialize L2 cache (L2[0] is only set if a descriptor matched) */
                if(cpu->L2[0] && cpu->L2[0]->descriptor != 0) {
                    initialize_P6_L2_cache(cpu);
                }
                break;

            /**
             * Pentium M (Dothan)
             * Celeron M (Dothan)
             */
            case 0x0D :
                /* initialize_M_L2_cache */
                break;

            /**
             * Unknown model
             */
            default :
                break;
            }
        }

        /* everything based on Core technology and later */
        else {
            /* do model specific initialization
             * the extended model is the high nibble, so combine with OR */
            switch(((cpu->emodel << 4) | cpu->model)) {
            /**
             * Core Solo
             * Core Duo
             */
            case 0x0E :
                /* initialize_Core_L2_cache */
                break;

            /**
             * Xeon Processor 3000, 3200, 5100, 5300, 7300
             * Core 2 Quad
             * Core 2 Extreme
             * Core 2 Duo
             * Pentium dual-core
             */
            case 0x0F :
                /* initialize_Core2_L2_cache */
                break;

            /**
             * Xeon Processor 5200, 5400
             * Core 2 Quad Q9650
             */
            case 0x17 :
                /* initialize_Core_L2_cache */
                break;

            /**
             * Core i7
             */
            case 0x1A :
                /* initialize_Nehalem_L2_cache */
                break;

            /**
             * Atom
             */
            case 0x1C :
                /* initialize_Atom_L2_cache */
                break;

            /**
             * Xeon Processor MP 7400
             */
            case 0x1D :
                /* initialize_???_L2_cache */
                break;

            /**
             * Unknown model
             */
            default :
                break;
            }
        }
        break;

    /**
     * Itanium
     */
    case 0x07 :
        break;

    /**
     * all processors based on the Pentium 4 core
     */
    case 0x0F :
        break;

    /* Unknown cpu family */
    default :
        break;
    }

    return;
}
static void query_cache_size(struct cpu_data *cpu)
{
    unsigned char i, j, k, m;
    unsigned int reg[4];
    unsigned char iterations;
    unsigned char cache_id;

    /* get cache and tlb information */
    if(cpu->maxbi >= 2) {
        cpuid(CPU_CACHE_INF, &reg[0], &reg[1], &reg[2], &reg[3]);
        /* extract how often to call cpuid with CPU_CACHE_INF */
        iterations = (unsigned char)(reg[0] & 0xFF);
        m = 0;

        for(i=0; i < iterations; i++) {
            cpuid(CPU_CACHE_INF, &reg[0], &reg[1], &reg[2], &reg[3]);
            /* bit 31 set means invalid register (check all four) */
            for(j=0; j<4; j++) {
                if(reg[j] & 0x80000000) {
                    /* clear invalid register */
                    reg[j] = 0;
                }
            }

            /* iterate through all cache identifiers
             * (byte 0 of eax is the iteration count, so start at 1) */
            for(j=1; j<16; j++) {
                cache_id = (unsigned char)(reg[j / 4] >> (unsigned int)(8 * (j % 4)));
                if(cache_id != 0) {
                    k = 0;
                    /* L2 cache */
                    while(L2_cache_table[k].descriptor != 0) {
                        /* never write past the end of cpu->L2[] */
                        if((L2_cache_table[k].descriptor == cache_id) && (m < L2_CACHE_ENTRIES)) {
                            cpu->L2[m] = &L2_cache_table[k];
                            m++;
                        }
                        k++;
                    }
                }
            }
        }
    }
    return;
}

/* not implemented yet, see the steps listed at the top of this post */
static void initialize_P6_L2_cache(struct cpu_data *cpu)
{
    unsigned char L2_cache_latency;
}
/*
 * (C) 2001 Dave Jones.
 * 2009 adapted for coreboot by Holger Hesselbarth
 *
 * Licensed under the terms of the GNU GPL License version 2.
 */

static int flag_is_changeable_p(unsigned long flag)
{
    unsigned long f1, f2;

    __asm__ volatile("pushf\n\t"
                     "pushf\n\t"
                     "pop %0\n\t"
                     "mov %0,%1\n\t"
                     "xor %2,%0\n\t"
                     "push %0\n\t"
                     "popf\n\t"
                     "pushf\n\t"
                     "pop %0\n\t"
                     "popf\n\t"
                     : "=&r" (f1), "=&r" (f2)
                     : "ir" (flag));

    return ((f1^f2) & flag) != 0;
}

/* cpuid is available if the ID flag (bit 21 of EFLAGS) can be toggled */
int check_cpuid(void)
{
    return flag_is_changeable_p(0x200000);
}
/*
 * (C) 2009 Holger Hesselbarth
 */

#include <cpu.h>
#include <identify_cpu.h>

static void identify_cpu(struct cpu_data *cpu)
{
    /* check if cpuid is supported */
    if((check_cpuid()) == 0) {
        // TODO : check if this is a Cyrix cpu with cpuid disabled
        return;
    }

    /* gather basic cpu information */
    get_basic_cpu_information(cpu);

    /* gather extended cpu information */
    get_extended_cpu_information(cpu);

    return;
}

static void get_basic_cpu_information(struct cpu_data *cpu)
{
    unsigned int maxbi, vendor;
    unsigned int eax, ebx, ecx, edx;

    /* get basic cpu information */
    cpuid(CPU_BASIC_INF, &maxbi, &vendor, NULL, NULL);
    cpu->maxbi = (maxbi & 0x0000FFFF ); /* high-order word is non-zero on some Cyrix cpus */

    if(cpu->maxbi < 1) {
        return;
    }

    /* get family, model, stepping and feature flags */
    cpuid((CPU_BASIC_INF + 1), &eax, &ebx, &ecx, &edx);
    cpu->stepping = (eax & 0x0F);
    cpu->model = ((eax >> 4) & 0x0F);
    cpu->family = ((eax >> 8) & 0x0F);
    cpu->ecx_bi_flags = ecx;
    cpu->edx_bi_flags = edx;

    /* get more basic cpu information based on vendor */
    switch(vendor) {
    // TODO : could also be GenuineTMx86 for Transmeta CPUs
    case 0x756e6547 : /* uneG */
        cpu->vendor = X86_VENDOR_INTEL;
        cpu->type = ((eax >> 12) & 0x03);

        if(cpu->family == (unsigned int)0x0F) {
            cpu->emodel = ((eax >> 16) & 0x0F);
            cpu->efamily = ((eax >> 20) & 0xFF);
        }
        else if(cpu->family == (unsigned int)0x06) {
            cpu->emodel = ((eax >> 16) & 0x0F);
        }

        cpu->brand = (ebx & 0xFF);
        cpu->clflush = ((ebx >> 8) & 0xFF);
        cpu->apicid = ((ebx >> 24) & 0xFF);
        break;
    case 0x68747541 : /* thuA */
        cpu->vendor = X86_VENDOR_AMD;
        // TODO : check for extended family & model
        break;
    case 0x69727943 : /* iryC */
        cpu->vendor = X86_VENDOR_CYRIX;
        break;
    case 0x746e6543 : /* tneC */
        cpu->vendor = X86_VENDOR_CENTAUR;
        break;
    // TODO : add more cpu makers
    default :
        cpu->vendor = X86_VENDOR_UNKNOWN;
        break;
    }

    return;
}

static void get_extended_cpu_information(struct cpu_data *cpu)
{
    /* get extended cpu information
     * the IDT WinChip C6 needs a special treatment */
    if((cpu->vendor == X86_VENDOR_CENTAUR) && (cpu->model == 4)) {
        cpuid(CPU_C6_EXT_INF, &(cpu->maxei), NULL, NULL, NULL);
    }
    else {
        cpuid(CPU_EXT_INF, &(cpu->maxei), NULL, NULL, NULL);
    }

    return;
}
#ifndef CPU_CPU_H
#define CPU_CPU_H

//#include <arch/cpu.h>

#define CPU_NAME_LEN 80
#define L2_CACHE_ENTRIES 8

struct _cache_entry {
    unsigned char descriptor;
    unsigned int size;
    unsigned char associativity;
    unsigned int line_size;
    unsigned char lines_per_sector;
};

struct cpu_data {
    struct cpu_data *next;
    unsigned char number;
    unsigned char vendor;

    unsigned char family;
    unsigned char model;
    unsigned char type;
    unsigned char stepping;
    unsigned char efamily;
    unsigned char emodel;

    unsigned char platform_id;
    unsigned char maxbi;
    unsigned int maxei;

    unsigned int ecx_bi_flags;
    unsigned int edx_bi_flags;

    struct _cache_entry *L2[L2_CACHE_ENTRIES];
    unsigned int cachesize_L1_I;
    unsigned int cachesize_L1_D;
    unsigned int cachesize_L2;
    unsigned int cachesize_L3;
    unsigned int cachesize_trace;

    char name[CPU_NAME_LEN];

    unsigned int nr_cores;
    unsigned int nr_logical;

    /* Intel specific bits */
    unsigned char brand;
    unsigned char clflush;
    unsigned char apicid;

    /* Intel Pentium III serial number (PSN) */
    unsigned int serialno[2];
};

struct device;
struct bus;

void cpu_initialize(void);
void initialize_cpus(struct bus *cpu_bus);

#define __cpu_driver __attribute__ ((used,__section__(".rodata.cpu_driver")))
/** start of compile time generated pci driver array */
extern struct cpu_driver cpu_drivers[];
/** end of compile time generated pci driver array */
extern struct cpu_driver ecpu_drivers[];

#endif /* CPU_CPU_H */
#ifndef CPU_X86_CONFIG_INTEL_H
#define CPU_X86_CONFIG_INTEL_H

#include <cpu.h>

#define CPU_CACHE_INF    0x00000002
#define IA32_PLATFORM_ID 0x00000017
#define EBL_CR_POWERON   0x0000002A
#define BBL_CR_D0        0x00000088
#define BBL_CR_D1        0x00000089
#define BBL_CR_D2        0x0000008A
#define BBL_CR_D3        0x0000008B
#define BBL_CR_ADDR      0x00000116
#define BBL_CR_CTL       0x00000119
#define BBL_CR_TRIG      0x0000011A
#define BBL_CR_BUSY      0x0000011B
#define BBL_CR_CTL3      0x0000011E

static void configure_cpu(struct cpu_data *cpu);
static void initialize_P6_L2_cache(struct cpu_data *cpu);
static void query_cache_size(struct cpu_data *cpu);

#endif /* CPU_X86_CONFIG_INTEL_H */

#ifndef CPU_X86_IDENTIFY_CPU_H
#define CPU_X86_IDENTIFY_CPU_H

#include <cpu.h>

#define CPU_BASIC_INF  0x00000000
#define CPU_EXT_INF    0x80000000
#define CPU_C6_EXT_INF 0xC0000000

extern int check_cpuid(void);

static void identify_cpu(struct cpu_data *cpu);
static void get_basic_cpu_information(struct cpu_data *cpu);
static void get_extended_cpu_information(struct cpu_data *cpu);

#endif /* CPU_X86_IDENTIFY_CPU_H */
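The descriptor scan in query_cache_size() can be tried without real hardware if the register values are passed in as an array. The following is only a sketch of that idea: collect_descriptors() is a hypothetical helper, it takes one set of CPUID leaf 2 results (eax, ebx, ecx, edx) and copies the valid descriptor bytes into a caller-supplied buffer, never writing more than the given maximum. Matching the bytes against L2_cache_table would be a second step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Collect the descriptor bytes of one CPUID leaf 2 result.
 * regs holds eax, ebx, ecx, edx in that order.
 * Returns the number of descriptors stored in out (at most max).
 */
static size_t collect_descriptors(const uint32_t regs[4], uint8_t *out, size_t max)
{
    size_t n = 0;
    unsigned r, b;

    assert(regs != NULL && out != NULL);

    for (r = 0; r < 4; r++) {
        uint32_t v = regs[r];

        /* bit 31 set: this register carries no valid descriptors */
        if (v & 0x80000000u)
            continue;

        /* the low byte of eax is the iteration count, not a descriptor */
        for (b = (r == 0) ? 1 : 0; b < 4; b++) {
            uint8_t d = (uint8_t)(v >> (8 * b));

            /* 0 is the null descriptor; stop storing when out is full */
            if (d != 0 && n < max)
                out[n++] = d;
        }
    }
    return n;
}
```

Keeping the bounds check inside the helper means the fixed size of the pointer array in cpu.h can never be exceeded, no matter how many descriptors a cpu reports.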
Holger Hesselbarth wrote:
> does inteltool support dumping msrs?
If it doesn't already, allow me to promote msrtool which does nothing but decode MSRs. Please feel free to submit a patch with some register definitions. So far there are target drivers supporting Geode LX and CS5536 that you could use for reference.
Thanks!
//Peter
On 01.02.2009, at 02:32, Peter Stuge peter@stuge.se wrote:
> Holger Hesselbarth wrote:
>> does inteltool support dumping msrs?
> If it doesn't already, allow me to promote msrtool which does nothing but decode MSRs. Please feel free to submit a patch with some register definitions. So far there are target drivers supporting Geode LX and CS5536 that you could use for reference.
Inteltool can dump the msrs but it will not parse the bit values the fine way msrtool does.
Maybe we should remove the intel focus from inteltool and merge the two utilities, call it systemtool or chipsettool instead? Stuff like dumping pmbase might be interesting for non-intel systems too, while intel systems could gain from a more detailed msr dumper... Thoughts?
Stefan
Stefan Reinauer wrote:
> Inteltool can dump the msrs but it will not parse the bit values the fine way msrtool does.
Ok. Maybe you can send a patch to add those MSRs to msrtool Holger?
Have a look at http://stuge.se/mt.cs5536_pic_divil3.patch for one way to note references in the source files.
> Maybe we should remove the intel focus from inteltool and merge the two utilities, call it systemtool or chipsettool instead? Stuff like dumping pmbase might be interesting for non-intel systems too, while intel systems could gain from a more detailed msr dumper... Thoughts?
I've had thoughts in that direction too. TODO mentions handling PCI and port IO registers as well, ie. turn the tool into a more generic register decoder. I kind of like that idea. On the other hand then it will very clearly start competing with prettyprint[1] that the Google guys showed us. It looks like a somewhat fat implementation though, which I'm not thrilled about. But they have a nifty user interface, FUSE is a fun twist. I don't know.
We were discussing GeodeLink routing on IRC the other day, and how it would be nice to e.g. create a bus graph by piecing together values from various registers, but I'm not sure that kind of complexity should go into msrtool. Or maybe it should? But a generic user interface will be a bit tricky.. msrtool is already option intense.
//Peter
These ideas sound great!
now we need a cool name and a cool picture to go with it :-)
ron
> Ok. Maybe you can send a patch to add those MSRs to msrtool Holger?
> Have a look at http://stuge.se/mt.cs5536_pic_divil3.patch for one way to note references in the source files.
Hi Peter, no problem, i'll set up a file for the P6 cores as soon as possible. i saw your presentation at the 25C3 so i knew that there's something like a register dumping tool. is there an svn repository to get the tool from and create patches against?
at the moment i cannot test if any of my code is really working. i'm in vienna right now and i have to stay a month or two. i had to leave all my linux machines in germany and i don't want to install too much non-work related stuff on my employer's laptop. all i can check is if the code compiles to objects (i.e. is free of syntax errors). but still i won't stop supplying code and hoping that Uwe (or someone else!) will compile and test some of it before i can :)