Stefan Reinauer stepan@openbios.org writes:
- jason schildt jschildt@lnxi.com [050903 00:03]:
+/* We can reduce the size of code generated by romcc by
+ * changing all of the fixed size types that live in registers
+ * into simple unsigned variables. (ie s/uint8_t/unsigned/g)
+ */
Why is this? I would consider specifying an 8-bit type to be more space-saving than using some generic untyped integer value. If not, this should be fixed in romcc.
This is a fundamental limit, especially on 32-bit x86 with its non-symmetric registers. romcc allocates registers, and registers are not 8 bits. Therefore it requires an extra operation after each operation to mask the register value down to 8 bits.
Theoretically it could help by allowing use of registers such as %ah, but the problem is that you cannot perform a register-to-register move between combinations like %esi and %ah, and it gets even worse when you include the mmx and sse registers. So %ah is essentially unusable.
Using smaller values does not increase the number of registers you can use, and using smaller registers requires an extra mask step, so using smaller values increases the code size.
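A minimal sketch of the extra mask step, in plain C. The function names are made up for illustration and the code is not taken from romcc or the patch; it only shows that an 8-bit add is a full-width add plus a mask, which is the extra operation described above.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit version: the result has to wrap at 256, so a compiler that
 * computes in 32-bit registers must truncate it after the add. */
static uint8_t add_u8(uint8_t a, uint8_t b)
{
	return (uint8_t)(a + b);
}

/* The same thing written out with full-width values: one add,
 * followed by the explicit mask the 8-bit type implies. */
static unsigned add_u8_by_hand(unsigned a, unsigned b)
{
	return (a + b) & 0xffu;
}

/* Plain unsigned version: just the add, nothing to clean up. */
static unsigned add_word(unsigned a, unsigned b)
{
	return a + b;
}

/* Check over all 8-bit inputs that the mask is exactly the
 * difference between the 8-bit and the word-sized version. */
static void check_mask_is_the_difference(void)
{
	unsigned a, b;

	for (a = 0; a < 256; a++) {
		for (b = 0; b < 256; b++) {
			assert(add_u8((uint8_t)a, (uint8_t)b) == add_u8_by_hand(a, b));
			assert((add_word(a, b) & 0xffu) == add_u8_by_hand(a, b));
		}
	}
}
```

Whether a given compiler can sometimes drop the mask depends on the compiler and the surrounding code; the sketch only shows where the extra step comes from.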
The comment was added to document this fact so we can revisit it later if it becomes important. Hopefully this begins dispelling the myth that sub-word-sized quantities are more efficient to use.
If you are really into register savings, bit-fields can help, especially when you have more than 2 values in a register. You have to pack and unpack the values, but if you don't have them all unpacked simultaneously it can help.
A sub-word type is only slightly better than a bit-field in that you can use the register directly. But it still requires maintenance work to keep anything more than a sign bit out of the register's high bits.
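A rough sketch of the packing idea, assuming a made-up layout of three small values in one word. It uses explicit shifts and masks rather than C bit-field syntax so that the pack and unpack steps are visible; all names and field widths here are hypothetical.

```c
#include <assert.h>

/* Hypothetical layout: three small values sharing one register.
 *   bits  0..3   a (0..15)
 *   bits  4..11  b (0..255)
 *   bits 12..15  c (0..15)
 */
#define A_SHIFT 0
#define A_MASK  0xfu
#define B_SHIFT 4
#define B_MASK  0xffu
#define C_SHIFT 12
#define C_MASK  0xfu

/* Pack all three values into a single word. */
static unsigned pack(unsigned a, unsigned b, unsigned c)
{
	assert(a <= A_MASK && b <= B_MASK && c <= C_MASK);
	return (a << A_SHIFT) | (b << B_SHIFT) | (c << C_SHIFT);
}

/* Unpack one field at a time, so only one extra register is live. */
static unsigned get_a(unsigned packed) { return (packed >> A_SHIFT) & A_MASK; }
static unsigned get_b(unsigned packed) { return (packed >> B_SHIFT) & B_MASK; }
static unsigned get_c(unsigned packed) { return (packed >> C_SHIFT) & C_MASK; }

/* Replace field b without disturbing a and c. */
static unsigned set_b(unsigned packed, unsigned b)
{
	assert(b <= B_MASK);
	return (packed & ~(B_MASK << B_SHIFT)) | (b << B_SHIFT);
}
```

Every access costs a shift and a mask, so this only pays off when registers are scarcer than instructions and the fields are not all needed unpacked at once.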
/* AMD K8 Unsupported 1Ghz? */
if (id == (PCI_VENDOR_ID_AMD | (0x1100 << 16))) {
if (is_cpu_pre_e0()) // CK804 support 1G?
device_t dev_2 = PCI_DEV(0,0x18,2);
if(pci_read_config32(dev_2,0x9c) < 0x20f00) {
The function call looks a lot more readable here. How much is the gain of manual inlining here?
100%: the inlined version actually works. cpuid requires 4 registers and we don't have that many to spare at this point in the code. What this bit does is read a cached copy of the cpu rev from a scratch register in pci configuration space. Probably the clearest thing to have would be a set of functions that perform this test, something like is_cached_cpu_pre_e0(). Almost as good would be a good comment.
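A sketch of what such a helper could look like, reusing the device, register offset and threshold from the patch fragment above. is_cached_cpu_pre_e0() is only the name suggested here, not an existing function. The device_t, PCI_DEV() and pci_read_config32() definitions below are placeholders so the sketch compiles on its own; they are not the real LinuxBIOS definitions.

```c
#include <stdint.h>

/* Placeholder definitions so this compiles standalone. In the real tree
 * these come from the LinuxBIOS headers; the encoding below is an
 * assumption made for the sketch. */
typedef uint32_t device_t;
#define PCI_DEV(bus, dev, fn) \
	((((uint32_t)(bus) & 0xff) << 16) | \
	 (((uint32_t)(dev) & 0x1f) << 11) | \
	 (((uint32_t)(fn) & 0x7) << 8))

/* Test double standing in for the scratch register at offset 0x9c. */
static uint32_t fake_scratch_9c;

/* Placeholder config read: always returns the fake scratch value. */
static uint32_t pci_read_config32(device_t dev, unsigned where)
{
	(void)dev;
	(void)where;
	return fake_scratch_9c;
}

/* Hypothetical helper: nonzero if the cached CPU revision is older
 * than E0. It reads the copy kept in the scratch register instead of
 * executing cpuid, so it does not need four free registers. */
static int is_cached_cpu_pre_e0(void)
{
	device_t dev_2 = PCI_DEV(0, 0x18, 2);

	return pci_read_config32(dev_2, 0x9c) < 0x20f00;
}
```

With a helper like this the call site would read `if (is_cached_cpu_pre_e0())`, keeping the readability of the function call without the cpuid register pressure, provided romcc inlines it without needing extra registers.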
Eric