Richard Smith wrote:
In both cases it's a somewhat off-color x86. I wonder if the emulator is making some assumption that isn't valid on these platforms.
I'm sorta convinced that it is because C0000 is not cached. Here is a simple test:
In this file: src/devices/pci_rom.c
see this code:

    if (PCI_CLASS_DISPLAY_VGA == rom_data->class_hi) {
    #if CONFIG_CONSOLE_VGA == 1
    #if CONFIG_CONSOLE_VGA_MULTI == 0
            if (dev != vga_pri)
                    return NULL; // only one VGA supported
    #endif
            printk_debug("copying VGA ROM Image from 0x%x to 0x%x, 0x%x bytes\n",
                         rom_header, PCI_VGA_RAM_IMAGE_START, rom_size);
            memcpy(PCI_VGA_RAM_IMAGE_START, rom_header, rom_size);
            vga_inited = 1;
            return (struct rom_header *) (PCI_VGA_RAM_IMAGE_START);
    #endif
    } else {
OK, let's fake it out. This is gross, but This IS A Test: add just 3 lines after the 'if':
    if (PCI_CLASS_DISPLAY_VGA == rom_data->class_hi) {
    /* THIS IS A TEST */
            static char image[64*1024];
    #undef PCI_VGA_RAM_IMAGE_START
    #define PCI_VGA_RAM_IMAGE_START image
    /* THIS IS THE END OF THE CHANGES */
    #if CONFIG_CONSOLE_VGA == 1
    #if CONFIG_CONSOLE_VGA_MULTI == 0
            if (dev != vga_pri)
                    return NULL; // only one VGA supported
    #endif
            printk_debug("copying VGA ROM Image from 0x%x to 0x%x, 0x%x bytes\n",
                         rom_header, PCI_VGA_RAM_IMAGE_START, rom_size);
            memcpy(PCI_VGA_RAM_IMAGE_START, rom_header, rom_size);
            vga_inited = 1;
            return (struct rom_header *) (PCI_VGA_RAM_IMAGE_START);
    #endif
    } else {
This is going to relocate the vga image to cached memory. This is OK! The emulator can use anything as memory -- it's an emulator.
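To make that last point concrete, here is a rough sketch of how an emulator can back the guest's C segment with any ordinary host buffer. All the names here (rom_image, load_rom, guest_to_host) are made up for illustration; this is not the in-tree emulator's actual interface, just the general idea that a guest address is only an index the emulator is free to redirect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants: the guest sees the VGA ROM at 0xC0000, 64 KiB long. */
#define VGA_ROM_GUEST_BASE 0xC0000u
#define VGA_ROM_SIZE       (64u * 1024u)

/* Ordinary static buffer in normal (cached) RAM, like 'image' in the test. */
static uint8_t rom_image[VGA_ROM_SIZE];

/* Copy a ROM image into the backing buffer (hypothetical helper). */
static void load_rom(const void *src, size_t len)
{
    assert(len <= VGA_ROM_SIZE);
    memcpy(rom_image, src, len);
}

/*
 * Translate a guest physical address into a host pointer.
 * Addresses inside the C segment are redirected to rom_image;
 * anything else returns NULL in this sketch (a real emulator
 * would fall through to its other memory regions).
 */
static uint8_t *guest_to_host(uint32_t guest_addr)
{
    if (guest_addr >= VGA_ROM_GUEST_BASE &&
        guest_addr - VGA_ROM_GUEST_BASE < VGA_ROM_SIZE)
        return &rom_image[guest_addr - VGA_ROM_GUEST_BASE];
    return NULL;
}
```

The emulated code still believes it is reading 0xC0000; only the emulator knows the bytes actually come from a buffer somewhere else in RAM.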
I think this ought to work. It's worth a try ...
Chris: I don't remember you reporting that the usermode emulator took that long. Will you please enable all the same debugging options and compare the run times?
yes, and the big difference: user-mode emulator just uses ram for the C segment ...
Another thing to consider is that the in-tree emulator may be running in a non-optimal environment, whereas the user-mode emulator has had Linux booted to fix up the chipset.
I think it's caching. A non-optimal chipset setup should not cause a 100x slowdown in this case.
Let's try this if that's ok.
ron