Hello,
I've gotten AIX 6/7 to instantiate RTAS (patches upcoming) and would like to trace what it's trying to do. I probably need to implement the display-character token.
The RTAS code in arch/ppc/qemu/start.S currently looks like this:
    GLOBL(of_rtas_start):
        blr
    GLOBL(of_rtas_end):
...and I would like to branch to C code from there.
Is there a way to have code from, say, rtas.c go between the blr and of_rtas_end symbol? Or do I need to move the symbols to the ldscript and place the code in a special section? If yes, how?
Those symbols are being used for code size calculation and relocation in arch/ppc/qemu/methods.c.
Thanks, Andreas
On 08.10.2010, at 14:44, Andreas Färber wrote:
Hello,
I've gotten AIX 6/7 to instantiate RTAS (patches upcoming) and would like to trace what it's trying to do. I probably need to implement the display-character token.
As you're running in qemu, the gdbstub is very helpful at times.
The RTAS code in arch/ppc/qemu/start.S currently looks like this:
GLOBL(of_rtas_start): blr GLOBL(of_rtas_end):
...and I would like to branch to C code from there.
Is there a way to have code from, say, rtas.c go between the blr and of_rtas_end symbol? Or do I need to move the symbols to the ldscript and place the code in a special section? If yes, how?
Why do you want to have the code in between? You can just branch to the C code:
b c_rtas_function
The only thing you need to make sure is that you follow the ABI :). Input parameters go in r3-rsomething, output is in r3, stack pointer (r1) has to be valid.
Also by only doing the b instead of bl you jump to the C function directly, so a return from there actually returns from the rtas function. If the rtas function follows a different ABI, better set up a stack frame and bl into the C function.
Those symbols are being used for code size calculation and relocation in arch/ppc/qemu/methods.c.
Maybe I don't really understand the question though.
Alex
Am 08.10.2010 um 14:54 schrieb Alexander Graf:
On 08.10.2010, at 14:44, Andreas Färber wrote:
I've gotten AIX 6/7 to instantiate RTAS (patches upcoming) and would like to trace what it's trying to do. I probably need to implement the display-character token.
As you're running in qemu, the gdbstub is very helpful at times.
Not too familiar with gdb, can I use Apple's host gdb for that? I believe Blue once said something about gdb not working in *-elf configuration and requiring *-linux instead?
The RTAS code in arch/ppc/qemu/start.S currently looks like this:
GLOBL(of_rtas_start): blr GLOBL(of_rtas_end):
...and I would like to branch to C code from there.
Is there a way to have code from, say, rtas.c go between the blr and of_rtas_end symbol? Or do I need to move the symbols to the ldscript and place the code in a special section? If yes, how?
Why do you want to have the code in between? You can just branch to the C code:
b c_rtas_function
The only thing you need to make sure is that you follow the ABI :). Input parameters go in r3-rsomething, output is in r3, stack pointer (r1) has to be valid.
Also by only doing the b instead of blr you jump to the C function directly, so a return from there actually returns from the rtas function.
If the rtas function follows a different ABI, better set up a stack frame and blr into the C function.
I assumed the latter is what I should do, following the CIF example.
Those symbols are being used for code size calculation and relocation in arch/ppc/qemu/methods.c.
Maybe I don't really understand the question though.
Thanks for promptly trying!
The code in methods.c basically does a memcpy() of of_rtas_start..of_rtas_end to memory that AIX allocates for us. The code is never called at its original location so needs to be PIC. Thus, if the C functions are not copied along, any relative addressing wouldn't work and absolute addressing would break as soon as the functions get unmapped. You might think of it as a "separate" executable that OpenBIOS loads and sets up. Therefore I need a way to get one piece of memory containing the entry point, data storage and all helper functions called.
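A minimal C sketch of the copy described above, for illustration only. It does not reproduce the code in methods.c: the array rtas_image, the pointers rtas_start and rtas_end, and the functions rtas_size() and rtas_instantiate() are hypothetical stand-ins for the of_rtas_start..of_rtas_end region, and the entry point is assumed to be the first byte of that region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the region between of_rtas_start and
 * of_rtas_end; 0x4e800020 is the PowerPC encoding of "blr". */
static const uint8_t rtas_image[] = { 0x4e, 0x80, 0x00, 0x20 };
static const uint8_t *const rtas_start = rtas_image;
static const uint8_t *const rtas_end = rtas_image + sizeof(rtas_image);

/* Size the OS has to allocate: the distance between the two markers. */
static size_t rtas_size(void)
{
    return (size_t)(rtas_end - rtas_start);
}

/* Copy the whole region into OS-provided memory and return the relocated
 * entry point, or NULL if the destination is too small. Anything the copied
 * code calls must lie inside the region, otherwise it is left behind. */
static void *rtas_instantiate(void *dest, size_t dest_len)
{
    assert(dest != NULL);
    if (dest_len < rtas_size())
        return NULL;
    memcpy(dest, rtas_start, rtas_size());
    return dest; /* assumption: the entry point is the first byte */
}
```

The point of the sketch is that only the bytes between the two markers travel with the copy, which is why helper functions living elsewhere in OpenBIOS cannot be reached from the relocated code.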
Andreas
On 08.10.2010, at 15:23, Andreas Färber wrote:
Am 08.10.2010 um 14:54 schrieb Alexander Graf:
On 08.10.2010, at 14:44, Andreas Färber wrote:
I've gotten AIX 6/7 to instantiate RTAS (patches upcoming) and would like to trace what it's trying to do. I probably need to implement the display-character token.
As you're running in qemu, the gdbstub is very helpful at times.
Not too familiar with gdb, can I use Apple's host gdb for that? I believe Blue once said something about gdb not working in *-elf configuration and requiring *-linux instead?
You can use the apple gdb without an object file, so you don't get symbols. But if you have an instruction pointer, just
    $ qemu-system-ppc -s -S ...
    (gdb) target remote localhost:1234
    (gdb) b *0x1234     <- address of rtas_something
    (gdb) c
It should break on that IP and then you can evaluate the register contents at least. Either by
(gdb) info registers
or
(qemu) info registers
The RTAS code in arch/ppc/qemu/start.S currently looks like this:
GLOBL(of_rtas_start): blr GLOBL(of_rtas_end):
...and I would like to branch to C code from there.
Is there a way to have code from, say, rtas.c go between the blr and of_rtas_end symbol? Or do I need to move the symbols to the ldscript and place the code in a special section? If yes, how?
Why do you want to have the code in between? You can just branch to the C code:
b c_rtas_function
The only thing you need to make sure is that you follow the ABI :). Input parameters go in r3-rsomething, output is in r3, stack pointer (r1) has to be valid.
Also by only doing the b instead of blr you jump to the C function directly, so a return from there actually returns from the rtas function.
If the rtas function follows a different ABI, better set up a stack frame and blr into the C function.
I assumed the latter is what I should do, following the CIF example.
Those symbols are being used for code size calculation and relocation in arch/ppc/qemu/methods.c.
Maybe I don't really understand the question though.
Thanks for promptly trying!
The code in methods.c basically does a memcpy() of of_rtas_start..of_rtas_end to memory that AIX allocates for us. The code is never called at its original location so needs to be PIC. Thus, if the C functions are not copied along, any relative addressing wouldn't work and absolute addressing would break as soon as the functions get unmapped. You might think of it as a "separate" executable that OpenBIOS loads and sets up. Therefore I need a way to get one piece of memory containing the entry point, data storage and all helper functions called.
Oh. That's not exactly easy. So the rtas space stays intact while all the openbios functions it wants to call don't? Hrm.
I guess you'll have to copy them down too, which can become very hairy very quick. Why don't you just create a separate binary for RTAS which can then use relative branching inside of itself? You could then copy the whole blob somewhere and be sure that everything's there.
Alex
Am 08.10.2010 um 15:30 schrieb Alexander Graf:
On 08.10.2010, at 15:23, Andreas Färber wrote:
Am 08.10.2010 um 14:54 schrieb Alexander Graf:
On 08.10.2010, at 14:44, Andreas Färber wrote:
The RTAS code in arch/ppc/qemu/start.S currently looks like this:
GLOBL(of_rtas_start): blr GLOBL(of_rtas_end):
...and I would like to branch to C code from there.
Is there a way to have code from, say, rtas.c go between the blr and of_rtas_end symbol? Or do I need to move the symbols to the ldscript and place the code in a special section? If yes, how?
Why do you want to have the code in between? You can just branch to the C code:
b c_rtas_function
The only thing you need to make sure is that you follow the ABI :). Input parameters go in r3-rsomething, output is in r3, stack pointer (r1) has to be valid.
Also by only doing the b instead of blr you jump to the C function directly, so a return from there actually returns from the rtas function.
If the rtas function follows a different ABI, better set up a stack frame and blr into the C function.
I assumed the latter is what I should do, following the CIF example.
Those symbols are being used for code size calculation and relocation in arch/ppc/qemu/methods.c.
Maybe I don't really understand the question though.
Thanks for promptly trying!
The code in methods.c basically does a memcpy() of of_rtas_start..of_rtas_end to memory that AIX allocates for us. The code is never called at its original location so needs to be PIC. Thus, if the C functions are not copied along, any relative addressing wouldn't work and absolute addressing would break as soon as the functions get unmapped. You might think of it as a "separate" executable that OpenBIOS loads and sets up. Therefore I need a way to get one piece of memory containing the entry point, data storage and all helper functions called.
Oh. That's not exactly easy. So the rtas space stays intact while all the openbios functions it wants to call don't? Hrm.
At least that's what [1] and [2] suggested. CHRP System Architecture spec 1.0 chapter 7.1 (hrpa_103.ps) specifically says:
"The role of RTAS versus Open Firmware is very important to understand. Open Firmware and RTAS are both vendor-provided software, and both are tailored by the hardware vendor to manipulate the specific platform hardware. However, RTAS is intended to be present during the execution of the OS, and to be called by the OS to access platform hardware features on behalf of the OS, whereas Open Firmware need not be present when an OS is running. This frees Open Firmware’s memory to be used by applications. RTAS is small enough to painlessly coexist with OSs and applications."
I'd be happy to continue in some way that plays nicely with the relocation only for now though.
[1] http://www.scribd.com/doc/23367157/Concept-Design-and-Implementation-of-a-Sl... pp. 8 f. [2] http://www.slideshare.net/schihei/agnostic-device-drivers slides 16 f.
I guess you'll have to copy them down too, which can become very hairy very quick.
Why don't you just create a separate binary for RTAS which can then use relative branching inside of itself? You could then copy the whole blob somewhere and be sure that everything's there.
How? :) Assuming I managed to fork arch/ppc/qemu/{start.S,ldscript} for a separate ELF binary, what would we do with it? Would QEMU need to load it separately as a ROM with known location, like openbios-ppc? Or would you embed such a binary into OpenBIOS somehow?
In theory we need support for both bitnesses and both endiannesses, i.e. up to four executables...
And how would the interaction between OpenBIOS and RTAS work then? Would we have to duplicate all info into the RTAS private memory using an init function called during initialize-rtas? PCI for instance is supposed to be handled by RTAS (e.g., read-pci-config and write-pci-config), and the OS is supposed to call restart-rtas after reconfiguring PCI, which sort of implies probing capabilities inside RTAS...
Andreas
On 08.10.2010, at 19:11, Andreas Färber wrote:
Am 08.10.2010 um 15:30 schrieb Alexander Graf:
On 08.10.2010, at 15:23, Andreas Färber wrote:
Am 08.10.2010 um 14:54 schrieb Alexander Graf:
On 08.10.2010, at 14:44, Andreas Färber wrote:
The RTAS code in arch/ppc/qemu/start.S currently looks like this:
GLOBL(of_rtas_start): blr GLOBL(of_rtas_end):
...and I would like to branch to C code from there.
Is there a way to have code from, say, rtas.c go between the blr and of_rtas_end symbol? Or do I need to move the symbols to the ldscript and place the code in a special section? If yes, how?
Why do you want to have the code in between? You can just branch to the C code:
b c_rtas_function
The only thing you need to make sure is that you follow the ABI :). Input parameters go in r3-rsomething, output is in r3, stack pointer (r1) has to be valid.
Also by only doing the b instead of blr you jump to the C function directly, so a return from there actually returns from the rtas function.
If the rtas function follows a different ABI, better set up a stack frame and blr into the C function.
I assumed the latter is what I should do, following the CIF example.
Those symbols are being used for code size calculation and relocation in arch/ppc/qemu/methods.c.
Maybe I don't really understand the question though.
Thanks for promptly trying!
The code in methods.c basically does a memcpy() of of_rtas_start..of_rtas_end to memory that AIX allocates for us. The code is never called at its original location so needs to be PIC. Thus, if the C functions are not copied along, any relative addressing wouldn't work and absolute addressing would break as soon as the functions get unmapped. You might think of it as a "separate" executable that OpenBIOS loads and sets up. Therefore I need a way to get one piece of memory containing the entry point, data storage and all helper functions called.
Oh. That's not exactly easy. So the rtas space stays intact while all the openbios functions it wants to call don't? Hrm.
At least that's what [1] and [2] suggested. CHRP System Architecture spec 1.0 chapter 7.1 (hrpa_103.ps) specifically says:
"The role of RTAS versus Open Firmware is very important to understand. Open Firmware and RTAS are both vendor-provided software, and both are tailored by the hardware vendor to manipulate the specific platform hardware. However, RTAS is intended to be present during the execution of the OS, and to be called by the OS to access platform hardware features on behalf of the OS, whereas Open Firmware need not be present when an OS is running. This frees Open Firmware’s memory to be used by applications. RTAS is small enough to painlessly coexist with OSs and applications."
I'd be happy to continue in some way that plays nicely with the relocation only for now though.
[1] http://www.scribd.com/doc/23367157/Concept-Design-and-Implementation-of-a-Sl... pp. 8 f. [2] http://www.slideshare.net/schihei/agnostic-device-drivers slides 16 f.
I guess you'll have to copy them down too, which can become very hairy very quick.
Why don't you just create a separate binary for RTAS which can then use relative branching inside of itself? You could then copy the whole blob somewhere and be sure that everything's there.
How? :) Assuming I managed to fork arch/ppc/qemu/{start.S,ldscript} for a separate ELF binary, what would we do with it? Would QEMU need to load it separately as a ROM with known location, like openbios-ppc? Or would you embed such a binary into OpenBIOS somehow?
You'd embed it into OpenBIOS. Just create a .c file keeping the blob as variable from the .elf binary.
In theory we need support for both bitnesses and both endiannesses, i.e. up to four executables...
I don't see why. Just compile natively for the target arch. The RTAS blob has the same characteristics as openBIOS, no?
And how would the interaction between OpenBIOS and RTAS work then? Would we have to duplicate all info into the RTAS private memory using an init function called during initialize-rtas?
Some shared struct that gets initialized in openbios and then copied into rtas space maybe?
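One way to picture the shared-struct idea, as a hedged sketch only. Every name here (struct rtas_shared_info, its fields, RTAS_SHARED_VERSION, RTAS_SHARED_OFFSET, rtas_publish_info(), rtas_read_info()) is hypothetical and not taken from OpenBIOS; the fixed offset inside the RTAS region is likewise an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; none of these fields come from OpenBIOS. */
struct rtas_shared_info {
    uint32_t version;      /* lets the blob reject a mismatched layout */
    uint32_t uart_base;    /* where the firmware found the serial port */
    uint32_t pci_cfg_addr; /* PCI config address register */
    uint32_t pci_cfg_data; /* PCI config data register */
};

#define RTAS_SHARED_VERSION 1u
#define RTAS_SHARED_OFFSET  0x100u /* assumed fixed offset inside the blob */

/* Firmware side: copy the filled-in struct into the relocated RTAS region,
 * so the blob no longer depends on firmware memory afterwards. */
static void rtas_publish_info(uint8_t *rtas_base,
                              const struct rtas_shared_info *info)
{
    assert(rtas_base != NULL && info != NULL);
    memcpy(rtas_base + RTAS_SHARED_OFFSET, info, sizeof(*info));
}

/* Blob side: read the struct back relative to its own base address.
 * Returns 0 on success, -1 if the version does not match. */
static int rtas_read_info(const uint8_t *rtas_base,
                          struct rtas_shared_info *out)
{
    assert(rtas_base != NULL && out != NULL);
    memcpy(out, rtas_base + RTAS_SHARED_OFFSET, sizeof(*out));
    return out->version == RTAS_SHARED_VERSION ? 0 : -1;
}
```

Addressing the struct relative to the blob's base keeps it usable wherever the OS places the RTAS region.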
PCI for instance is supposed to be handled by RTAS (e.g., read-pci-config and write-pci-config), and the OS is supposed to call restart-rtas after reconfiguring PCI, which sort of implies probing capabilities inside RTAS...
Sounds like work :)
Alex
Am 08.10.2010 um 19:22 schrieb Alexander Graf:
On 08.10.2010, at 19:11, Andreas Färber wrote:
Am 08.10.2010 um 15:30 schrieb Alexander Graf:
On 08.10.2010, at 15:23, Andreas Färber wrote:
Am 08.10.2010 um 14:54 schrieb Alexander Graf:
On 08.10.2010, at 14:44, Andreas Färber wrote:
The RTAS code in arch/ppc/qemu/start.S currently looks like this:
GLOBL(of_rtas_start): blr GLOBL(of_rtas_end):
...and I would like to branch to C code from there.
Is there a way to have code from, say, rtas.c go between the blr and of_rtas_end symbol? Or do I need to move the symbols to the ldscript and place the code in a special section? If yes, how?
Why do you want to have the code in between? You can just branch to the C code:
b c_rtas_function
The only thing you need to make sure is that you follow the ABI :). Input parameters go in r3-rsomething, output is in r3, stack pointer (r1) has to be valid.
Also by only doing the b instead of blr you jump to the C function directly, so a return from there actually returns from the rtas function.
If the rtas function follows a different ABI, better set up a stack frame and blr into the C function.
I assumed the latter is what I should do, following the CIF example.
Those symbols are being used for code size calculation and relocation in arch/ppc/qemu/methods.c.
Maybe I don't really understand the question though.
Thanks for promptly trying!
The code in methods.c basically does a memcpy() of of_rtas_start..of_rtas_end to memory that AIX allocates for us. The code is never called at its original location so needs to be PIC. Thus, if the C functions are not copied along, any relative addressing wouldn't work and absolute addressing would break as soon as the functions get unmapped. You might think of it as a "separate" executable that OpenBIOS loads and sets up. Therefore I need a way to get one piece of memory containing the entry point, data storage and all helper functions called.
Oh. That's not exactly easy. So the rtas space stays intact while all the openbios functions it wants to call don't? Hrm.
At least that's what [1] and [2] suggested. CHRP System Architecture spec 1.0 chapter 7.1 (hrpa_103.ps) specifically says:
"The role of RTAS versus Open Firmware is very important to understand. Open Firmware and RTAS are both vendor-provided software, and both are tailored by the hardware vendor to manipulate the specific platform hardware. However, RTAS is intended to be present during the execution of the OS, and to be called by the OS to access platform hardware features on behalf of the OS, whereas Open Firmware need not be present when an OS is running. This frees Open Firmware’s memory to be used by applications. RTAS is small enough to painlessly coexist with OSs and applications."
I'd be happy to continue in some way that plays nicely with the relocation only for now though.
[1] http://www.scribd.com/doc/23367157/Concept-Design-and-Implementation-of-a-Sl... pp. 8 f. [2] http://www.slideshare.net/schihei/agnostic-device-drivers slides 16 f.
I guess you'll have to copy them down too, which can become very hairy very quick.
Why don't you just create a separate binary for RTAS which can then use relative branching inside of itself? You could then copy the whole blob somewhere and be sure that everything's there.
How? :) Assuming I managed to fork arch/ppc/qemu/{start.S,ldscript} for a separate ELF binary, what would we do with it? Would QEMU need to load it separately as a ROM with known location, like openbios-ppc? Or would you embed such a binary into OpenBIOS somehow?
You'd embed it into OpenBIOS. Just create a .c file keeping the blob as variable from the .elf binary.
In theory we need support for both bitnesses and both endiannesses, i.e. up to four executables...
I don't see why. Just compile natively for the target arch. The RTAS blob has the same characteristics as openBIOS, no?
Erm, OpenBIOS is lacking in that aspect. I don't see where it would respect the little-endian? option on ppc, just like the real-mode? option. And depending on what spec you read, there's both rtas-instantiate and rtas-instantiate-64, or the MSR(?) setting at RTAS initialization time decides in what mode it's supposed to run. So it can in fact be different from the compile-time setting of 32-bit big endian! :(
And how would the interaction between OpenBIOS and RTAS work then? Would we have to duplicate all info into the RTAS private memory using an init function called during initialize-rtas?
Some shared struct that gets initialized in openbios and then copied into rtas space maybe?
Maybe.
PCI for instance is supposed to be handled by RTAS (e.g., read-pci-config and write-pci-config), and the OS is supposed to call restart-rtas after reconfiguring PCI, which sort of implies probing capabilities inside RTAS...
Sounds like work :)
Calls for code sharing with OpenBIOS imo.
Andreas
On 08.10.2010, at 20:48, Andreas Färber wrote:
Am 08.10.2010 um 19:22 schrieb Alexander Graf:
On 08.10.2010, at 19:11, Andreas Färber wrote:
Am 08.10.2010 um 15:30 schrieb Alexander Graf:
On 08.10.2010, at 15:23, Andreas Färber wrote:
Am 08.10.2010 um 14:54 schrieb Alexander Graf:
On 08.10.2010, at 14:44, Andreas Färber wrote:
> The RTAS code in arch/ppc/qemu/start.S currently looks like this:
>
> GLOBL(of_rtas_start):
>     blr
> GLOBL(of_rtas_end):
>
> ...and I would like to branch to C code from there.
>
> Is there a way to have code from, say, rtas.c go between the blr and of_rtas_end symbol?
> Or do I need to move the symbols to the ldscript and place the code in a special section? If yes, how?
Why do you want to have the code in between? You can just branch to the C code:
b c_rtas_function
The only thing you need to make sure is that you follow the ABI :). Input parameters go in r3-rsomething, output is in r3, stack pointer (r1) has to be valid.
Also by only doing the b instead of blr you jump to the C function directly, so a return from there actually returns from the rtas function.
If the rtas function follows a different ABI, better set up a stack frame and blr into the C function.
I assumed the latter is what I should do, following the CIF example.
> Those symbols are being used for code size calculation and relocation in arch/ppc/qemu/methods.c.
Maybe I don't really understand the question though.
Thanks for promptly trying!
The code in methods.c basically does a memcpy() of of_rtas_start..of_rtas_end to memory that AIX allocates for us. The code is never called at its original location so needs to be PIC. Thus, if the C functions are not copied along, any relative addressing wouldn't work and absolute addressing would break as soon as the functions get unmapped. You might think of it as a "separate" executable that OpenBIOS loads and sets up. Therefore I need a way to get one piece of memory containing the entry point, data storage and all helper functions called.
Oh. That's not exactly easy. So the rtas space stays intact while all the openbios functions it wants to call don't? Hrm.
At least that's what [1] and [2] suggested. CHRP System Architecture spec 1.0 chapter 7.1 (hrpa_103.ps) specifically says:
"The role of RTAS versus Open Firmware is very important to understand. Open Firmware and RTAS are both vendor-provided software, and both are tailored by the hardware vendor to manipulate the specific platform hardware. However, RTAS is intended to be present during the execution of the OS, and to be called by the OS to access platform hardware features on behalf of the OS, whereas Open Firmware need not be present when an OS is running. This frees Open Firmware’s memory to be used by applications. RTAS is small enough to painlessly coexist with OSs and applications."
I'd be happy to continue in some way that plays nicely with the relocation only for now though.
[1] http://www.scribd.com/doc/23367157/Concept-Design-and-Implementation-of-a-Sl... pp. 8 f. [2] http://www.slideshare.net/schihei/agnostic-device-drivers slides 16 f.
I guess you'll have to copy them down too, which can become very hairy very quick.
Why don't you just create a separate binary for RTAS which can then use relative branching inside of itself? You could then copy the whole blob somewhere and be sure that everything's there.
How? :) Assuming I managed to fork arch/ppc/qemu/{start.S,ldscript} for a separate ELF binary, what would we do with it? Would QEMU need to load it separately as a ROM with known location, like openbios-ppc? Or would you embed such a binary into OpenBIOS somehow?
You'd embed it into OpenBIOS. Just create a .c file keeping the blob as variable from the .elf binary.
In theory we need support for both bitnesses and both endiannesses, i.e. up to four executables...
I don't see why. Just compile natively for the target arch. The RTAS blob has the same characteristics as openBIOS, no?
Erm, OpenBIOS is lacking in that aspect. I don't see where it would respect the little-endian? option on ppc, just like the real-mode? option. And depending on what spec you read, there's both rtas-instantiate and rtas-instantiate-64, or the MSR(?) setting at RTAS initialization time decides in what mode it's supposed to run. So it can in fact be different from the compile-time setting of 32-bit big endian! :(
Hrm. Maybe you can get away with the same trick I pulled for openBIOS and just do everything 32-bit. Maybe you don't. I don't quite know the details of RTAS, just that everyone who knows about it hates it :).
And how would the interaction between OpenBIOS and RTAS work then? Would we have to duplicate all info into the RTAS private memory using an init function called during initialize-rtas?
Some shared struct that gets initialized in openbios and then copied into rtas space maybe?
Maybe.
PCI for instance is supposed to be handled by RTAS (e.g., read-pci-config and write-pci-config), and the OS is supposed to call restart-rtas after reconfiguring PCI, which sort of implies probing capabilities inside RTAS...
Sounds like work :)
Calls for code sharing with OpenBIOS imo.
Code sharing, yes. Object sharing, no. You could still have the rtas blob be compiled from openBIOS .c files.
Alex
Am 08.10.2010 um 19:22 schrieb Alexander Graf:
On 08.10.2010, at 19:11, Andreas Färber wrote:
Am 08.10.2010 um 15:30 schrieb Alexander Graf:
Why don't you just create a separate binary for RTAS which can then use relative branching inside of itself? You could then copy the whole blob somewhere and be sure that everything's there.
How? :) Assuming I managed to fork arch/ppc/qemu/{start.S,ldscript} for a separate ELF binary, what would we do with it? Would QEMU need to load it separately as a ROM with known location, like openbios-ppc? Or would you embed such a binary into OpenBIOS somehow?
You'd embed it into OpenBIOS. Just create a .c file keeping the blob as variable from the .elf binary.
Sorry to bug... I have a very simple rtas.S now, linked into an rtas-qemu.elf. What now?
i) Is there an existing way to get my binary file into a .c file variable, or do I need to write my own generator in C? (or worse, Forth)
ii) To do anything with the blob I need to load it properly as an ELF image, and libopenbios/elf_load.c:elf_load() requires an ihandle. How would I get an ihandle from a variable in a .c file? It doesn't let me specify the (client-supplied) memory address either.
iii) Once I've loaded the ELF image, how do I get the function's entry point? OpenBIOS itself just uses the CPU's reset vector and branches to _entry from there - anything I need to add/change in the ldscript?
Thanks, Andreas
On Mon, Oct 11, 2010 at 7:18 PM, Andreas Färber andreas.faerber@web.de wrote:
Am 08.10.2010 um 19:22 schrieb Alexander Graf:
On 08.10.2010, at 19:11, Andreas Färber wrote:
Am 08.10.2010 um 15:30 schrieb Alexander Graf:
Why don't you just create a separate binary for RTAS which can then use relative branching inside of itself? You could then copy the whole blob somewhere and be sure that everything's there.
How? :) Assuming I managed to fork arch/ppc/qemu/{start.S,ldscript} for a separate ELF binary, what would we do with it? Would QEMU need to load it separately as a ROM with known location, like openbios-ppc? Or would you embed such a binary into OpenBIOS somehow?
You'd embed it into OpenBIOS. Just create a .c file keeping the blob as variable from the .elf binary.
Sorry to bug... I have a very simple rtas.S now, linked into an rtas-qemu.elf. What now?
i) Is there an existing way to get my binary file into a .c file variable, or do I need to write my own generator in C? (or worse, Forth)
The dictionary blobs are transformed to C with hexdump.
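For comparison with the hexdump approach, here is a hedged sketch of what a hand-written generator in C could look like. It is not the OpenBIOS build step; the function blob_to_c() and the array name rtas_blob used below are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write `len` bytes of `data` to `out` as a C array definition named `name`.
 * Returns 0 on success, -1 on a write error or an empty blob (an empty
 * initializer list would not be valid C before C23). */
static int blob_to_c(FILE *out, const char *name,
                     const unsigned char *data, size_t len)
{
    assert(out != NULL && name != NULL);
    if (data == NULL || len == 0)
        return -1;
    if (fprintf(out, "static const unsigned char %s[] = {", name) < 0)
        return -1;
    for (size_t i = 0; i < len; i++) {
        /* Start a new indented line every 8 bytes to keep it readable. */
        const char *sep = (i % 8 == 0) ? "\n    " : " ";
        if (fprintf(out, "%s0x%02x,", sep, data[i]) < 0)
            return -1;
    }
    if (fprintf(out, "\n};\n") < 0)
        return -1;
    return 0;
}
```

Called with the four bytes 4e 80 00 20 and the name rtas_blob, this writes:

    static const unsigned char rtas_blob[] = {
        0x4e, 0x80, 0x00, 0x20,
    };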
ii) To do anything with the blob I need to load it properly as an ELF image, and libopenbios/elf_load.c:elf_load() requires an ihandle. How would I get an ihandle from a variable in a .c file? It doesn't let me specify the (client-supplied) memory address either.
I'd not use ELF here but binary blobs, real ROMs don't have any ELF headers.
iii) Once I've loaded the ELF image, how do I get the function's entry point? OpenBIOS itself just uses the CPU's reset vector and branches to _entry from there - anything I need to add/change in the ldscript?
You could assume that the first location is the starting point. This can be arranged by the linker script (special section for the starting point).
Am 12.10.2010 um 20:50 schrieb Blue Swirl:
On Mon, Oct 11, 2010 at 7:18 PM, Andreas Färber <andreas.faerber@web.de> wrote:
Am 08.10.2010 um 19:22 schrieb Alexander Graf:
On 08.10.2010, at 19:11, Andreas Färber wrote:
Am 08.10.2010 um 15:30 schrieb Alexander Graf:
Why don't you just create a separate binary for RTAS which can then use relative branching inside of itself? You could then copy the whole blob somewhere and be sure that everything's there.
How? :) Assuming I managed to fork arch/ppc/qemu/{start.S,ldscript} for a separate ELF binary, what would we do with it? Would QEMU need to load it separately as a ROM with known location, like openbios-ppc? Or would you embed such a binary into OpenBIOS somehow?
You'd embed it into OpenBIOS. Just create a .c file keeping the blob as variable from the .elf binary.
Sorry to bug... I have a very simple rtas.S now, linked into an rtas-qemu.elf. What now?
i) Is there an existing way to get my binary file into a .c file variable, or do I need to write my own generator in C? (or worse, Forth)
The dictionary blobs are transformed to C with hexdump.
Thanks for the pointer, that works quite nicely.
ii) To do anything with the blob I need to load it properly as an ELF image, and libopenbios/elf_load.c:elf_load() requires an ihandle. How would I get an ihandle from a variable in a .c file? It doesn't let me specify the (client-supplied) memory address either.
I'd not use ELF here but binary blobs, real ROMs don't have any ELF headers.
Ah, I found OUTPUT_FORMAT(binary). Didn't know we could emit non-ELF formats with --target=*-elf.
iii) Once I've loaded the ELF image, how do I get the function's entry point? OpenBIOS itself just uses the CPU's reset vector and branches to _entry from there - anything I need to add/change in the ldscript?
You could assume that the first location is the starting point. This can be arranged by the linker script (special section for the starting point).
I was successful using a section ".rtasentry", but since I get r4 as a pointer to the start of memory, I'll rather use a known offset.
I got OpenBIOS to load my RTAS blob and AIX to enter my RTAS C function and emit something to escc uart from there. It looks like AIX is trying to emit "\nAIX " via display-character RTAS calls. My code's not very stable though, the same code snippet didn't work inside a [{static,inline}] serial_putchar() function but did directly in my main function called from assembly. Still poking at ldscript and assembler to find the cause or a workaround. Right now it does print okay but hangs on returning from the RTAS call. Will post an RFC once I've cleaned it up a little, maybe it's something obvious...
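To illustrate what handling a display-character call might involve, here is a hedged C sketch that runs on a host rather than inside the blob. It assumes the usual RTAS argument buffer layout (token, number of inputs, number of outputs, then the input cells followed by the output cells) with cells in native byte order, as on a 32-bit big-endian target. The token value, the status used for unknown tokens, the console buffer standing in for the escc uart, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RTAS_MAX_CELLS 8u

/* Argument buffer: inputs first, outputs directly after them. */
struct rtas_args {
    uint32_t token;
    uint32_t nargs;
    uint32_t nret;
    uint32_t cells[RTAS_MAX_CELLS];
};

#define TOKEN_DISPLAY_CHARACTER 0x0au /* hypothetical; firmware assigns it */
#define RTAS_STATUS_SUCCESS     0
#define RTAS_STATUS_FAILURE     (-1)  /* assumed generic failure status */

/* Stand-in for the serial port, so the sketch runs on a host. */
static char console[64];
static size_t console_len;

static void console_putchar(char c)
{
    if (console_len + 1 < sizeof(console)) {
        console[console_len++] = c;
        console[console_len] = '\0';
    }
}

/* Handle one RTAS call; the status goes into the first output cell. */
static void rtas_dispatch(struct rtas_args *a)
{
    int32_t status = RTAS_STATUS_FAILURE;

    assert(a != NULL);
    /* Reject buffers with no output cell or more cells than we hold. */
    if (a->nret == 0 || a->nargs > RTAS_MAX_CELLS ||
        a->nret > RTAS_MAX_CELLS - a->nargs)
        return;
    if (a->token == TOKEN_DISPLAY_CHARACTER && a->nargs == 1) {
        console_putchar((char)(a->cells[0] & 0x7fu));
        status = RTAS_STATUS_SUCCESS;
    }
    a->cells[a->nargs] = (uint32_t)status;
}
```

Logging the token of every call that falls through to the failure path would be one way to trace what the OS is asking for.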
Andreas
Andreas Färber wrote:
I was successful using a section ".rtasentry", but since I get r4 as a pointer to the start of memory, I'll rather use a known offset.
I got OpenBIOS to load my RTAS blob and AIX to enter my RTAS C function and emit something to escc uart from there. It looks like AIX is trying to emit "\nAIX " via display-character RTAS calls. My code's not very stable though, the same code snippet didn't work inside a [{static,inline}] serial_putchar() function but did directly in my main function called from assembly. Still poking at ldscript and assembler to find the cause or a workaround. Right now it does print okay but hangs on returning from the RTAS call. Will post an RFC once I've cleaned it up a little, maybe it's something obvious...
Andreas
I don't know if it helps with PPC, but on SPARC64 there is an issue whereby if you are compiling with -O0 (which is typical during development), you need to allow extra stack space when switching contexts. Does compiling with -Os solve the issue at all?
ATB,
Mark.
Am 14.10.2010 um 18:45 schrieb Mark Cave-Ayland:
Andreas Färber wrote:
I got OpenBIOS to load my RTAS blob and AIX to enter my RTAS C function and emit something to escc uart from there. It looks like AIX is trying to emit "\nAIX " via display-character RTAS calls. My code's not very stable though, the same code snippet didn't work inside a [{static,inline}] serial_putchar() function but did directly in my main function called from assembly. Still poking at ldscript and assembler to find the cause or a workaround. Right now it does print okay but hangs on returning from the RTAS call. Will post an RFC once I've cleaned it up a little, maybe it's something obvious... Andreas
I don't know if it helps with PPC, but on SPARC64 there is an issue whereby if you are compiling with -O0 (which is typical during development), you need to allow extra stack space when switching contexts. Does compiling with -Os solve the issue at all?
Seems like it was compiling with -Os by default.
Andreas
Am 16.10.2010 um 13:43 schrieb Andreas Färber:
Am 14.10.2010 um 18:45 schrieb Mark Cave-Ayland:
Andreas Färber wrote:
I got OpenBIOS to load my RTAS blob and AIX to enter my RTAS C function and emit something to escc uart from there. It looks like AIX is trying to emit "\nAIX " via display-character RTAS calls. My code's not very stable though, the same code snippet didn't work inside a [{static,inline}] serial_putchar() function but did directly in my main function called from assembly. Still poking at ldscript and assembler to find the cause or a workaround. Right now it does print okay but hangs on returning from the RTAS call. Will post an RFC once I've cleaned it up a little, maybe it's something obvious... Andreas
I don't know if it helps with PPC, but on SPARC64 there is an issue whereby if you are compiling with -O0 (which is typical during development), you need to allow extra stack space when switching contexts. Does compiling with -Os solve the issue at all?
Seems like it was compiling with -Os by default.
Update: I've been successful with the combination of -lgcc link and rtas-tokens.c compilation with -O2 -fpie (-O2 solved the _savegpr_31 issue). The _GLOBAL_OFFSET_TABLE_ issue remains though, I've been extending the rtas-ldscript without success so far.
Andreas
On Sat, Oct 16, 2010 at 5:18 PM, Andreas Färber andreas.faerber@web.de wrote:
Am 16.10.2010 um 13:43 schrieb Andreas Färber:
Am 14.10.2010 um 18:45 schrieb Mark Cave-Ayland:
Andreas Färber wrote:
I got OpenBIOS to load my RTAS blob and AIX to enter my RTAS C function and emit something to escc uart from there. It looks like AIX is trying to emit "\nAIX " via display-character RTAS calls. My code's not very stable though, the same code snippet didn't work inside a [{static,inline}] serial_putchar() function but did directly in my main function called from assembly. Still poking at ldscript and assembler to find the cause or a workaround. Right now it does print okay but hangs on returning from the RTAS call. Will post an RFC once I've cleaned it up a little, maybe it's something obvious... Andreas
I don't know if it helps with PPC, but on SPARC64 there is an issue whereby if you are compiling with -O0 (which is typical during development), you need to allow extra stack space when switching contexts. Does compiling with -Os solve the issue at all?
Seems like it was compiling with -Os by default.
Update: I've been successful with the combination of -lgcc link and rtas-tokens.c compilation with -O2 -fpie (-O2 solved the _savegpr_31 issue). The _GLOBAL_OFFSET_TABLE_ issue remains though, I've been extending the rtas-ldscript without success so far.
Perhaps the flag is missing somewhere, like linker flags?
CHRP spec 1.0 specifies it as 1. On my PowerMac G3 it's 0x41 though.
Signed-off-by: Andreas Färber andreas.faerber@web.de --- arch/ppc/qemu/init.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/arch/ppc/qemu/init.c b/arch/ppc/qemu/init.c index 6601b7c..2b0b891 100644 --- a/arch/ppc/qemu/init.c +++ b/arch/ppc/qemu/init.c @@ -748,6 +748,7 @@ arch_of_init( void ) while( size < (unsigned long)of_rtas_end - (unsigned long)of_rtas_start ) size *= 2; set_property( ph, "rtas-size", (char*)&size, sizeof(size) ); + set_int_property(ph, "rtas-version", 1); } #endif
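For anyone checking what value ends up in "rtas-size": the loop in the hunk above starts at one 4 KiB page and doubles until the RTAS code fits. A minimal hosted-C sketch of that rounding follows. rtas_size_for is a made-up helper name for illustration, not an OpenBIOS function, and the overflow guard is an addition of this sketch.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not part of OpenBIOS: the same doubling loop as in
 * arch_of_init(), starting from one 4 KiB page. */
static unsigned long rtas_size_for(unsigned long code_len)
{
    unsigned long size = 0x1000;

    /* Guard added for this sketch: beyond this the doubling would wrap. */
    assert(code_len <= ULONG_MAX / 2 + 1);

    while (size < code_len)
        size *= 2;
    return size;
}
```

So a blob of exactly 0x1000 bytes still reports one page, while one byte more reports 0x2000.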
From CHRP bindings 1.5 draft.
For now just use it as an alias to instantiate-rtas.
Signed-off-by: Andreas Färber andreas.faerber@web.de --- arch/ppc/qemu/methods.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/arch/ppc/qemu/methods.c b/arch/ppc/qemu/methods.c index 71a364f..f27d532 100644 --- a/arch/ppc/qemu/methods.c +++ b/arch/ppc/qemu/methods.c @@ -56,6 +56,7 @@ rtas_instantiate( void ) NODE_METHODS( rtas ) = { { "instantiate", rtas_instantiate }, { "instantiate-rtas", rtas_instantiate }, + { "instantiate-rtas-64", rtas_instantiate }, }; #endif
Move RTAS code into an external binary blob. Implement the display-character token, add some debug output.
The serial_putchar() calls are working now. Hangs on return of the first RTAS call though. --- arch/ppc/build.xml | 26 ++++++++++++- arch/ppc/qemu/init.c | 7 +++- arch/ppc/qemu/kernel.h | 1 - arch/ppc/qemu/methods.c | 11 +++-- arch/ppc/qemu/rtas-ldscript | 47 +++++++++++++++++++++++ arch/ppc/qemu/rtas-tokens.c | 63 ++++++++++++++++++++++++++++++ arch/ppc/qemu/rtas.S | 88 +++++++++++++++++++++++++++++++++++++++++++ arch/ppc/qemu/start.S | 7 --- 8 files changed, 236 insertions(+), 14 deletions(-) create mode 100644 arch/ppc/qemu/rtas-ldscript create mode 100644 arch/ppc/qemu/rtas-tokens.c create mode 100644 arch/ppc/qemu/rtas.S
diff --git a/arch/ppc/build.xml b/arch/ppc/build.xml index 9778a43..f893abc 100644 --- a/arch/ppc/build.xml +++ b/arch/ppc/build.xml @@ -89,6 +89,22 @@ $(call quiet-command,$(CC) $$EXTRACFLAGS $(CFLAGS) $(INCLUDES) -c -o $@ $(SRCDIR)/arch/ppc/mol/kernel.c, " CC $(TARGET_DIR)$@")]]></rule> </executable>
+ + <executable name="target/include/qemu-rtas.h" target="target" condition="QEMU"> + <rule><![CDATA[ + $(call quiet-command,true, " GEN $(TARGET_DIR)$@") + @echo "static const char rtas_binary[] = {" > $@ + @cat $< | hexdump -ve '1/0 "\t" 8/1 "0x%02x, " 1/0 "\n"' \ + | sed 's/0x ,//g' >> $@ + @echo "};" >> $@]]></rule> + <external-object source="rtas-qemu.bin"/> + </executable> + + <executable name="target/arch/ppc/qemu/methods.o" target="target" condition="QEMU"> + <rule><![CDATA[ $(SRCDIR)/arch/ppc/qemu/methods.c $(ODIR)/target/include/qemu-rtas.h + $(call quiet-command,$(CC) $$EXTRACFLAGS $(CFLAGS) $(INCLUDES) -I$(SRCDIR)/arch/ppc -c -o $@ $(SRCDIR)/arch/ppc/qemu/methods.c, " CC $(TARGET_DIR)$@")]]></rule> + </executable> + <!-- END OF HACK ALERT -->
<library name="briq" target="target" type="static" condition="BRIQ"> @@ -123,7 +139,7 @@ <object source="qemu/init.c" flags="-I$(SRCDIR)/arch/ppc"/> <external-object source="target/arch/ppc/qemu/kernel.o"/> <object source="qemu/main.c" flags="-I$(SRCDIR)/arch/ppc"/> - <object source="qemu/methods.c" flags="-I$(SRCDIR)/arch/ppc"/> + <external-object source="target/arch/ppc/qemu/methods.o"/> <object source="qemu/vfd.c" flags="-I$(SRCDIR)/arch/ppc"/> <object source="qemu/console.c" flags="-I$(SRCDIR)/arch/ppc"/> </library> @@ -193,6 +209,14 @@ <external-object source="libgcc.a"/> </executable>
+ <executable name="rtas-qemu.bin" target="target" condition="QEMU"> + <rule> + $(call quiet-command,$(LD) --warn-common -N -T $(SRCDIR)/arch/$(ARCH)/qemu/rtas-ldscript -o $@ --whole-archive --pic-executable $^," LINK $(TARGET_DIR)$@")</rule> + <object source="qemu/rtas.S"/> + <object source="qemu/rtas-tokens.c" flags="-std=c99 -fpic -DPIC"/> + <external-object source="libgcc.a"/> + </executable> + <executable name="openbios-mol.elf" target="target" condition="MOL"> <rule> $(call quiet-command,$(LD) -g -Ttext=0x01e01000 -Bstatic $^ $(shell $(CC) -print-libgcc-file-name) -o $@.nostrip --whole-archive $^," LINK $(TARGET_DIR)$@") diff --git a/arch/ppc/qemu/init.c b/arch/ppc/qemu/init.c index 2b0b891..6d72386 100644 --- a/arch/ppc/qemu/init.c +++ b/arch/ppc/qemu/init.c @@ -579,6 +579,10 @@ static void kvm_of_init(void) fword("finish-device"); }
+#ifdef CONFIG_RTAS +extern int rtas_size; +#endif + void arch_of_init( void ) { @@ -745,10 +749,11 @@ arch_of_init( void ) printk("Warning: No /rtas node\n"); else { unsigned long size = 0x1000; - while( size < (unsigned long)of_rtas_end - (unsigned long)of_rtas_start ) + while ( size < rtas_size ) size *= 2; set_property( ph, "rtas-size", (char*)&size, sizeof(size) ); set_int_property(ph, "rtas-version", 1); + set_int_property(ph, "display-character", 1); } #endif
diff --git a/arch/ppc/qemu/kernel.h b/arch/ppc/qemu/kernel.h index e8ae364..6ae928f 100644 --- a/arch/ppc/qemu/kernel.h +++ b/arch/ppc/qemu/kernel.h @@ -21,7 +21,6 @@ extern void exit( int status );
/* start.S */ extern void flush_icache_range( char *start, char *stop ); -extern char of_rtas_start[], of_rtas_end[]; extern void call_elf( unsigned long arg1, unsigned long arg2, unsigned long elf_entry );
/* methods.c */ diff --git a/arch/ppc/qemu/methods.c b/arch/ppc/qemu/methods.c index f27d532..00444a9 100644 --- a/arch/ppc/qemu/methods.c +++ b/arch/ppc/qemu/methods.c @@ -33,24 +33,27 @@ #ifdef CONFIG_RTAS DECLARE_NODE( rtas, INSTALL_OPEN, 0, "+/rtas" );
+#include "qemu-rtas.h" +#define RTAS_DATA_SIZE 0x1000 +const int rtas_size = RTAS_DATA_SIZE + sizeof(rtas_binary); + /* ( physbase -- rtas_callback ) */ static void rtas_instantiate( void ) { ucell physbase = POP(); - ucell s=0x1000, size = (ucell)of_rtas_end - (ucell)of_rtas_start; + ucell s=0x1000, size = RTAS_DATA_SIZE + sizeof(rtas_binary); unsigned long virt;
while( s < size ) s += 0x1000; virt = ofmem_claim_virt( 0, s, 0x1000 ); ofmem_map( physbase, virt, s, -1 ); - memcpy( (char*)virt, of_rtas_start, size ); + memcpy( (char*)virt + RTAS_DATA_SIZE, rtas_binary, rtas_size - RTAS_DATA_SIZE );
- printk("RTAS instantiated at %08x\n", physbase ); flush_icache_range( (char*)virt, (char*)virt + size );
- PUSH( physbase ); + PUSH( physbase + RTAS_DATA_SIZE ); }
NODE_METHODS( rtas ) = { diff --git a/arch/ppc/qemu/rtas-ldscript b/arch/ppc/qemu/rtas-ldscript new file mode 100644 index 0000000..f596eb7 --- /dev/null +++ b/arch/ppc/qemu/rtas-ldscript @@ -0,0 +1,47 @@ +OUTPUT_FORMAT(binary) +OUTPUT_ARCH(powerpc) + +SECTIONS +{ + _start = .; + + /*. = 0x1000;*/ + .rtasentry ALIGN(4096): { *(.rtasentry) } + + /* Normal sections */ + .text ALIGN(4096): { + *(.text) + *(.text.*) + } + + .rodata ALIGN(4096): { + _rodata = .; + *(.rodata) + *(.rodata.*) + *(.note.ELFBoot) + } + .data ALIGN(4096): { + _data = .; + *(.data) + *(.data.*) + _edata = .; + } + + .bss ALIGN(4096): { + _bss = .; + *(.sbss) + *(.sbss.*) + *(.bss) + *(.bss.*) + *(COMMON) + _ebss = .; + } + + . = ALIGN(4096); + _end = .; + + /* We discard .note sections other than .note.ELFBoot, + * because some versions of GCC generate useless ones. */ + + /DISCARD/ : { *(.comment*) *(.note.*) } +} diff --git a/arch/ppc/qemu/rtas-tokens.c b/arch/ppc/qemu/rtas-tokens.c new file mode 100644 index 0000000..f251716 --- /dev/null +++ b/arch/ppc/qemu/rtas-tokens.c @@ -0,0 +1,63 @@ +/* + * Copyright (c) 2010 Andreas Färber andreas.faerber@web.de + */ + +#define RTAS_MAX_ARGS 10 + +typedef struct rtas_args { + unsigned long token; + long nargs; + long nret; + unsigned long args[RTAS_MAX_ARGS]; +} rtas_args_t; + +void rtas_interface(rtas_args_t*, void*); + +/* drivers/escc.h */ +#define IO_ESCC_OFFSET 0x00013000 +/* drivers/escc.c */ +#define CTRL(addr) (*(volatile unsigned char *)(addr)) +#define DATA(addr) (*(volatile unsigned char *)(addr + 16)) +#define Tx_BUF_EMP 0x4 /* Tx Buffer empty */ + +/*static void uart_putchar(int port, unsigned char c) +{ + while (!(CTRL(port) & Tx_BUF_EMP)) + ; + DATA(port) = c; +}*/ + +/*void serial_putchar(char);*/ + +static void serial_putchar(char c) +{ + unsigned long addr = 0x80800000; + volatile unsigned char *serial_dev = (unsigned char *)addr + IO_ESCC_OFFSET + 0x20; + //uart_putchar((int)serial_dev, c); + volatile unsigned char * port = 
serial_dev; + while (!(CTRL(port) & Tx_BUF_EMP)) + ; + DATA(port) = c; +} + +enum { + DISPLAY_CHARACTER = 1, +}; + +void rtas_interface(rtas_args_t* params, void* privateData) +{ + switch (params->token) { + case DISPLAY_CHARACTER: { + serial_putchar((char)params->args[0]); + serial_putchar('x'); + params->args[params->nargs] = 0; + break; + } + default: + serial_putchar('.'); + params->args[params->nargs] = -1; + break; + } + serial_putchar('\r'); + serial_putchar('\n'); +} diff --git a/arch/ppc/qemu/rtas.S b/arch/ppc/qemu/rtas.S new file mode 100644 index 0000000..35037c4 --- /dev/null +++ b/arch/ppc/qemu/rtas.S @@ -0,0 +1,88 @@ +/* + * RTAS blob for QEMU + * Copyright (c) 2010 Andreas Färber andreas.faerber@web.de + */ + +#include "asm/asmdefs.h" + +/*.data +.space xxx, 0 */ + +.section .rtasentry,"ax" + /* real mode! */ +GLOBL(_entry): + /* + * r3 = arguments + * r4 = private memory + */ + stw r1, 0(r4) + stw r2, 4(r4) + + stwu r1, -12(r1) + stw r4, 8(r1) + mflr r0 + stw r0, 4(r1) + + stw r13, 8(r4) + stw r14, 12(r4) + stw r15, 16(r4) + stw r16, 20(r4) + stw r17, 24(r4) + stw r18, 28(r4) + stw r19, 32(r4) + stw r20, 36(r4) + stw r21, 40(r4) + stw r22, 44(r4) + stw r23, 48(r4) + stw r24, 52(r4) + stw r25, 56(r4) + stw r26, 60(r4) + stw r27, 64(r4) + stw r28, 68(r4) + stw r29, 72(r4) + stw r30, 76(r4) + stw r31, 80(r4) + + mfcr r2 + stw r2, 84(r4) + + bl rtas_interface + + lwz r0, 4(r1) + mtlr r0 + lwz r4, 8(r1) +// lwz r1, 0(r1) + + lwz r2, 84(r4) + mtcr r2 + + lwz r13, 8(r4) + lwz r14, 12(r4) + lwz r15, 16(r4) + lwz r16, 20(r4) + lwz r17, 24(r4) + lwz r18, 28(r4) + lwz r19, 32(r4) + lwz r20, 36(r4) + lwz r21, 40(r4) + lwz r22, 44(r4) + lwz r23, 48(r4) + lwz r24, 52(r4) + lwz r25, 56(r4) + lwz r26, 60(r4) + lwz r27, 64(r4) + lwz r28, 68(r4) + lwz r29, 72(r4) + lwz r30, 76(r4) + lwz r31, 80(r4) + + lwz r2, 4(r4) + lwz r1, 0(r4) + + blr + +/* libgcc.a */ +.globl __divide_error +__divide_error: +1: nop + b 1b diff --git a/arch/ppc/qemu/start.S 
b/arch/ppc/qemu/start.S index c995581..d9c61a7 100644 --- a/arch/ppc/qemu/start.S +++ b/arch/ppc/qemu/start.S @@ -522,13 +522,6 @@ GLOBL(of_client_callback):
blr
- /* rtas glue (must be reloctable) */ -GLOBL(of_rtas_start): - /* r3 = argument buffer, r4 = of_rtas_start */ - /* according to the CHRP standard, cr must be preserved (cr0/cr1 too?) */ - blr -GLOBL(of_rtas_end): -
#define CACHE_LINE_SIZE 32 #define LG_CACHE_LINE_SIZE 5
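The memory layout that the methods.c hunk sets up (a private data area of RTAS_DATA_SIZE bytes, then the blob, with the entry point reported as physbase + RTAS_DATA_SIZE) can be tried out in hosted C. This is only a sketch under assumptions: rtas_claim_size and rtas_place_blob are illustrative names, RTAS_PAGE_SIZE is a name made up for the 0x1000 step, and a plain byte buffer stands in for the claimed and mapped region.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RTAS_DATA_SIZE 0x1000u  /* private data area size, as in the patch */
#define RTAS_PAGE_SIZE 0x1000u  /* assumed name for the 0x1000 claim step */

/* Bytes to claim for data area plus blob, rounded up to whole pages. */
static size_t rtas_claim_size(size_t blob_len)
{
    size_t size = RTAS_DATA_SIZE + blob_len;
    size_t s = RTAS_PAGE_SIZE;

    while (s < size)
        s += RTAS_PAGE_SIZE;
    return s;
}

/* Copy the blob behind the data area and return the offset of its first
 * byte, i.e. the value added to physbase before it is pushed back. */
static size_t rtas_place_blob(unsigned char *region, size_t region_len,
                              const unsigned char *blob, size_t blob_len)
{
    assert(region_len >= RTAS_DATA_SIZE + blob_len);
    memcpy(region + RTAS_DATA_SIZE, blob, blob_len);
    return RTAS_DATA_SIZE;
}
```

With a four-byte blob (a lone blr, say) the claim comes out at two pages: one for the data area and one for the code.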
Am 15.10.2010 um 00:17 schrieb Andreas Färber andreas.faerber@web.de:
Move RTAS code into an external binary blob. Implement the display-character token, add some debug output.
Oh nice :)
The serial_putchar() calls are working now. Hangs on return of the first RTAS call though.
Aww :(
arch/ppc/build.xml | 26 ++++++++++++- arch/ppc/qemu/init.c | 7 +++- arch/ppc/qemu/kernel.h | 1 - arch/ppc/qemu/methods.c | 11 +++-- arch/ppc/qemu/rtas-ldscript | 47 +++++++++++++++++++++++ arch/ppc/qemu/rtas-tokens.c | 63 ++++++++++++++++++++++++++++++ arch/ppc/qemu/rtas.S | 88 +++++++++++++++++++++++++++++++++++++++++++ arch/ppc/qemu/start.S | 7 --- 8 files changed, 236 insertions(+), 14 deletions(-) create mode 100644 arch/ppc/qemu/rtas-ldscript create mode 100644 arch/ppc/qemu/rtas-tokens.c create mode 100644 arch/ppc/qemu/rtas.S
diff --git a/arch/ppc/build.xml b/arch/ppc/build.xml index 9778a43..f893abc 100644 --- a/arch/ppc/build.xml +++ b/arch/ppc/build.xml @@ -89,6 +89,22 @@ $(call quiet-command,$(CC) $$EXTRACFLAGS $(CFLAGS) $(INCLUDES) -c -o $@ $(SRCDIR)/arch/ppc/mol/kernel.c, " CC $(TARGET_DIR)$@")]]></rule>
</executable>
<executable name="target/include/qemu-rtas.h" target="target" condition="QEMU">
- <rule><![CDATA[
- $(call quiet-command,true, " GEN $(TARGET_DIR)$@")
- @echo "static const char rtas_binary[] = {" > $@
- @cat $< | hexdump -ve '1/0 "\t" 8/1 "0x%02x, " 1/0 "\n"' \
| sed 's/0x ,//g' >> $@
- @echo "};" >> $@]]></rule>
<external-object source="rtas-qemu.bin"/>
</executable>
<executable name="target/arch/ppc/qemu/methods.o" target="target" condition="QEMU">
- <rule><![CDATA[ $(SRCDIR)/arch/ppc/qemu/methods.c $(ODIR)/target/include/qemu-rtas.h
- $(call quiet-command,$(CC) $$EXTRACFLAGS $(CFLAGS) $(INCLUDES) -I$(SRCDIR)/arch/ppc -c -o $@ $(SRCDIR)/arch/ppc/qemu/methods.c, " CC $(TARGET_DIR)$@")]]></rule>
</executable>
<!-- END OF HACK ALERT -->
<library name="briq" target="target" type="static" condition="BRIQ"> @@ -123,7 +139,7 @@ <object source="qemu/init.c" flags="-I$(SRCDIR)/arch/ppc"/> <external-object source="target/arch/ppc/qemu/kernel.o"/> <object source="qemu/main.c" flags="-I$(SRCDIR)/arch/ppc"/> - <object source="qemu/methods.c" flags="-I$(SRCDIR)/arch/ppc"/> + <external-object source="target/arch/ppc/qemu/methods.o"/> <object source="qemu/vfd.c" flags="-I$(SRCDIR)/arch/ppc"/> <object source="qemu/console.c" flags="-I$(SRCDIR)/arch/ppc"/> </library> @@ -193,6 +209,14 @@ <external-object source="libgcc.a"/> </executable>
<executable name="rtas-qemu.bin" target="target" condition="QEMU">
<rule>
- $(call quiet-command,$(LD) --warn-common -N -T $(SRCDIR)/arch/$(ARCH)/qemu/rtas-ldscript -o $@ --whole-archive --pic-executable $^," LINK $(TARGET_DIR)$@")</rule>
<object source="qemu/rtas.S"/>
<object source="qemu/rtas-tokens.c" flags="-std=c99 -fpic -DPIC"/>
<external-object source="libgcc.a"/>
</executable>
<executable name="openbios-mol.elf" target="target" condition="MOL"> <rule> $(call quiet-command,$(LD) -g -Ttext=0x01e01000 -Bstatic $^ $(shell $(CC) -print-libgcc-file-name) -o $@.nostrip --whole-archive $^," LINK $(TARGET_DIR)$@") diff --git a/arch/ppc/qemu/init.c b/arch/ppc/qemu/init.c index 2b0b891..6d72386 100644 --- a/arch/ppc/qemu/init.c +++ b/arch/ppc/qemu/init.c @@ -579,6 +579,10 @@ static void kvm_of_init(void) fword("finish-device"); }
+#ifdef CONFIG_RTAS +extern int rtas_size; +#endif
void arch_of_init( void ) { @@ -745,10 +749,11 @@ arch_of_init( void ) printk("Warning: No /rtas node\n"); else { unsigned long size = 0x1000;
- while( size < (unsigned long)of_rtas_end - (unsigned long)of_rtas_start )
+ while ( size < rtas_size )
size *= 2; set_property( ph, "rtas-size", (char*)&size, sizeof(size) ); set_int_property(ph, "rtas-version", 1);
Didn't you just set this to 0x41?
+ set_int_property(ph, "display-character", 1); }
#endif
diff --git a/arch/ppc/qemu/kernel.h b/arch/ppc/qemu/kernel.h index e8ae364..6ae928f 100644 --- a/arch/ppc/qemu/kernel.h +++ b/arch/ppc/qemu/kernel.h @@ -21,7 +21,6 @@ extern void exit( int status );
/* start.S */ extern void flush_icache_range( char *start, char *stop ); -extern char of_rtas_start[], of_rtas_end[]; extern void call_elf( unsigned long arg1, unsigned long arg2, unsigned long elf_entry );
/* methods.c */ diff --git a/arch/ppc/qemu/methods.c b/arch/ppc/qemu/methods.c index f27d532..00444a9 100644 --- a/arch/ppc/qemu/methods.c +++ b/arch/ppc/qemu/methods.c @@ -33,24 +33,27 @@ #ifdef CONFIG_RTAS DECLARE_NODE( rtas, INSTALL_OPEN, 0, "+/rtas" );
+#include "qemu-rtas.h" +#define RTAS_DATA_SIZE 0x1000 +const int rtas_size = RTAS_DATA_SIZE + sizeof(rtas_binary);
/* ( physbase -- rtas_callback ) */ static void rtas_instantiate( void ) { ucell physbase = POP();
- ucell s=0x1000, size = (ucell)of_rtas_end - (ucell)of_rtas_start;
+ ucell s=0x1000, size = RTAS_DATA_SIZE + sizeof(rtas_binary);
unsigned long virt;
while( s < size ) s += 0x1000; virt = ofmem_claim_virt( 0, s, 0x1000 ); ofmem_map( physbase, virt, s, -1 );
- memcpy( (char*)virt, of_rtas_start, size );
+ memcpy( (char*)virt + RTAS_DATA_SIZE, rtas_binary, rtas_size - RTAS_DATA_SIZE );
- printk("RTAS instantiated at %08x\n", physbase );
flush_icache_range( (char*)virt, (char*)virt + size );
- PUSH( physbase );
+ PUSH( physbase + RTAS_DATA_SIZE );
}
NODE_METHODS( rtas ) = { diff --git a/arch/ppc/qemu/rtas-ldscript b/arch/ppc/qemu/rtas-ldscript new file mode 100644 index 0000000..f596eb7 --- /dev/null +++ b/arch/ppc/qemu/rtas-ldscript @@ -0,0 +1,47 @@ +OUTPUT_FORMAT(binary) +OUTPUT_ARCH(powerpc)
+SECTIONS +{
- _start = .;
- /*. = 0x1000;*/
- .rtasentry ALIGN(4096): { *(.rtasentry) }
- /* Normal sections */
- .text ALIGN(4096): {
*(.text)
*(.text.*)
- }
- .rodata ALIGN(4096): {
_rodata = .;
*(.rodata)
*(.rodata.*)
*(.note.ELFBoot)
- }
- .data ALIGN(4096): {
_data = .;
*(.data)
*(.data.*)
_edata = .;
- }
- .bss ALIGN(4096): {
_bss = .;
*(.sbss)
*(.sbss.*)
*(.bss)
*(.bss.*)
*(COMMON)
_ebss = .;
- }
- . = ALIGN(4096);
- _end = .;
- /* We discard .note sections other than .note.ELFBoot,
* because some versions of GCC generate useless ones. */
- /DISCARD/ : { *(.comment*) *(.note.*) }
+} diff --git a/arch/ppc/qemu/rtas-tokens.c b/arch/ppc/qemu/rtas-tokens.c new file mode 100644 index 0000000..f251716 --- /dev/null +++ b/arch/ppc/qemu/rtas-tokens.c @@ -0,0 +1,63 @@ +/*
- Copyright (c) 2010 Andreas Färber andreas.faerber@web.de
- */
This is missing a copyleft license.
+#define RTAS_MAX_ARGS 10
+typedef struct rtas_args {
- unsigned long token;
- long nargs;
- long nret;
- unsigned long args[RTAS_MAX_ARGS];
+} rtas_args_t;
+void rtas_interface(rtas_args_t*, void*);
+/* drivers/escc.h */ +#define IO_ESCC_OFFSET 0x00013000 +/* drivers/escc.c */ +#define CTRL(addr) (*(volatile unsigned char *)(addr)) +#define DATA(addr) (*(volatile unsigned char *)(addr + 16)) +#define Tx_BUF_EMP 0x4 /* Tx Buffer empty */
+/*static void uart_putchar(int port, unsigned char c) +{
- while (!(CTRL(port) & Tx_BUF_EMP))
;
- DATA(port) = c;
+}*/
+/*void serial_putchar(char);*/
+static void serial_putchar(char c) +{
- unsigned long addr = 0x80800000;
Phew - how is this done for the normal escc case? We should at least share the constants here.
- volatile unsigned char *serial_dev = (unsigned char *)addr + IO_ESCC_OFFSET + 0x20;
- //uart_putchar((int)serial_dev, c);
- volatile unsigned char * port = serial_dev;
- while (!(CTRL(port) & Tx_BUF_EMP))
;
- DATA(port) = c;
+}
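On the remark about sharing constants with the regular escc driver: one possible shape is to give the two magic numbers in serial_putchar() names and to pass the port in as a parameter. The sketch below is hosted C under assumptions. RTAS_IO_BASE, RTAS_ESCC_PORT, ESCC_DATA_REG, rtas_escc_port_addr and escc_putchar are names invented here; only IO_ESCC_OFFSET, Tx_BUF_EMP and the numeric values come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch (there taken from drivers/escc.h and escc.c). */
#define IO_ESCC_OFFSET 0x00013000u
#define Tx_BUF_EMP     0x4          /* Tx buffer empty */
#define ESCC_DATA_REG  16           /* assumed name for the offset in DATA() */

/* Assumed names for the two magic numbers in serial_putchar(). */
#define RTAS_IO_BASE   0x80800000u
#define RTAS_ESCC_PORT 0x20u

/* Address of the control register that the RTAS blob polls. */
static uint32_t rtas_escc_port_addr(void)
{
    return RTAS_IO_BASE + IO_ESCC_OFFSET + RTAS_ESCC_PORT;
}

/* Same polling loop as serial_putchar(), with the port passed in so that it
 * can be pointed at ordinary memory when run on a host. */
static void escc_putchar(volatile unsigned char *port, unsigned char c)
{
    assert(port != 0);

    while (!(port[0] & Tx_BUF_EMP))
        ;                           /* wait for the transmitter to drain */
    port[ESCC_DATA_REG] = c;
}
```

Passing the port as an argument is a design choice of this sketch: it keeps the hardware address in one place and lets the loop be exercised against a fake register block.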
+enum {
- DISPLAY_CHARACTER = 1,
+};
+void rtas_interface(rtas_args_t* params, void* privateData) +{
- switch (params->token) {
case DISPLAY_CHARACTER: {
serial_putchar((char)params->args[0]);
serial_putchar('x');
params->args[params->nargs] = 0;
break;
}
default:
serial_putchar('.');
params->args[params->nargs] = -1;
break;
- }
- serial_putchar('\r');
- serial_putchar('\n');
+} diff --git a/arch/ppc/qemu/rtas.S b/arch/ppc/qemu/rtas.S new file mode 100644 index 0000000..35037c4 --- /dev/null +++ b/arch/ppc/qemu/rtas.S @@ -0,0 +1,88 @@ +/*
- RTAS blob for QEMU
- Copyright (c) 2010 Andreas Färber andreas.faerber@web.de
- */
+#include "asm/asmdefs.h"
+/*.data +.space xxx, 0 */
+.section .rtasentry,"ax"
- /* real mode! */
+GLOBL(_entry):
- /*
* r3 = arguments
* r4 = private memory
*/
- stw r1, 0(r4)
- stw r2, 4(r4)
- stwu r1, -12(r1)
Is the os guaranteed to give you r4 and r1 for free scribbling over?
- stw r4, 8(r1)
- mflr r0
- stw r0, 4(r1)
/* saving non-volatile registers */
- stw r13, 8(r4)
I would recommend multiplying here:
stw r13, (3 * 4)(r4)
That makes it more readable.
- stw r14, 12(r4)
- stw r15, 16(r4)
- stw r16, 20(r4)
- stw r17, 24(r4)
- stw r18, 28(r4)
- stw r19, 32(r4)
- stw r20, 36(r4)
- stw r21, 40(r4)
- stw r22, 44(r4)
- stw r23, 48(r4)
- stw r24, 52(r4)
- stw r25, 56(r4)
- stw r26, 60(r4)
- stw r27, 64(r4)
- stw r28, 68(r4)
- stw r29, 72(r4)
- stw r30, 76(r4)
- stw r31, 80(r4)
- mfcr r2
- stw r2, 84(r4)
- bl rtas_interface
So rtas_interface gets the os given r3 as first parameter. Please make sure that struct is packed then.
- lwz r0, 4(r1)
- mtlr r0
- lwz r4, 8(r1)
+// lwz r1, 0(r1)
- lwz r2, 84(r4)
- mtcr r2
If the abi is similar to the normal C one, cr is volatile.
Alex
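Regarding the request above that the argument struct be packed: one way to pin the layout down without a packed attribute is to use fixed-width fields and check the offsets at compile time. The sketch below assumes a 32-bit caller whose cells are 4 bytes each. rtas_dispatch, capture_char and last_char are names invented for this example, and the output routine is passed in as a parameter so the code can run hosted; the token value and the status values 0 and -1 follow the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RTAS_MAX_ARGS 10

/* Fixed-width variant of rtas_args_t; assumes 4-byte cells. */
typedef struct rtas_args {
    uint32_t token;
    uint32_t nargs;
    uint32_t nret;
    uint32_t args[RTAS_MAX_ARGS];   /* inputs, followed by return values */
} rtas_args_t;

/* Compile-time layout checks: no padding between or after the fields. */
_Static_assert(offsetof(rtas_args_t, nargs) == 4, "nargs at offset 4");
_Static_assert(offsetof(rtas_args_t, nret) == 8, "nret at offset 8");
_Static_assert(offsetof(rtas_args_t, args) == 12, "args at offset 12");
_Static_assert(sizeof(rtas_args_t) == 12 + 4 * RTAS_MAX_ARGS, "no tail padding");

enum { DISPLAY_CHARACTER = 1 };     /* token value registered by the patch */

/* Test double that records the last character written. */
static char last_char;
static void capture_char(char c) { last_char = c; }

/* Dispatch in the style of rtas_interface(): the status goes into the
 * first cell after the input arguments. */
static void rtas_dispatch(rtas_args_t *params, void (*emit)(char))
{
    assert(params->nargs < RTAS_MAX_ARGS);

    switch (params->token) {
    case DISPLAY_CHARACTER:
        emit((char)params->args[0]);
        params->args[params->nargs] = 0;             /* success, as in the patch */
        break;
    default:
        params->args[params->nargs] = (uint32_t)-1;  /* failure, as in the patch */
        break;
    }
}
```

Calling rtas_dispatch(&a, capture_char) with token 1, nargs 1 and args[0] set to 'A' leaves 'A' in last_char and 0 in args[1].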
On 15.10.2010, at 00:38, Alexander Graf wrote:
Am 15.10.2010 um 00:17 schrieb Andreas Färber andreas.faerber@web.de:
[snip]
/* saving non-volatile registers */
- stw r13, 8(r4)
I would recommend multiplying here:
stw r13, (3 * 4)(r4)
That makes it more readable.
- stw r14, 12(r4)
- stw r15, 16(r4)
- stw r16, 20(r4)
- stw r17, 24(r4)
- stw r18, 28(r4)
- stw r19, 32(r4)
- stw r20, 36(r4)
- stw r21, 40(r4)
- stw r22, 44(r4)
- stw r23, 48(r4)
- stw r24, 52(r4)
- stw r25, 56(r4)
- stw r26, 60(r4)
- stw r27, 64(r4)
- stw r28, 68(r4)
- stw r29, 72(r4)
- stw r30, 76(r4)
- stw r31, 80(r4)
Actually thinking about this a bit more, r13-r31 are already defined non-volatile by the C ABI, so you can be sure that the C function you're calling doesn't clobber them. You don't need to manually save/restore them :).
Alex
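If the save area stays (it still holds r1, r2 and cr), the multiplication suggestion can be taken one step further by naming the slots. Going by the offsets in the patch, r13 sits at byte 8, which is slot 2 rather than 3. The sketch below is hosted C with invented names (RTAS_SLOT_SIZE, SLOT_*, SLOT_OFFSET, gpr_save_offset); the same constants could be shared with the assembly through a common header, though that is an assumption about how the build would be arranged.

```c
#include <assert.h>

#define RTAS_SLOT_SIZE 4   /* one 32-bit word per saved register */

/* Slot numbers matching the layout in the patch: r1, r2, r13..r31, cr. */
enum {
    SLOT_R1  = 0,
    SLOT_R2  = 1,
    SLOT_R13 = 2,
    SLOT_R31 = SLOT_R13 + (31 - 13),
    SLOT_CR,
    SLOT_COUNT
};

#define SLOT_OFFSET(slot) ((slot) * RTAS_SLOT_SIZE)

/* Byte offset of the save slot for GPR n, valid for r13..r31 only. */
static int gpr_save_offset(int n)
{
    assert(n >= 13 && n <= 31);
    return SLOT_OFFSET(SLOT_R13 + (n - 13));
}
```

SLOT_OFFSET(SLOT_COUNT) then gives the size of the whole save area, which makes it easy to see how much of the private data region is used.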
Am 15.10.2010 um 12:26 schrieb Alexander Graf:
On 15.10.2010, at 00:38, Alexander Graf wrote:
Am 15.10.2010 um 00:17 schrieb Andreas Färber andreas.faerber@web.de:
[snip]
/* saving non-volatile registers */
- stw r13, 8(r4)
I would recommend multiplying here:
stw r13, (3 * 4)(r4)
That makes it more readable.
- stw r14, 12(r4)
- stw r15, 16(r4)
- stw r16, 20(r4)
- stw r17, 24(r4)
- stw r18, 28(r4)
- stw r19, 32(r4)
- stw r20, 36(r4)
- stw r21, 40(r4)
- stw r22, 44(r4)
- stw r23, 48(r4)
- stw r24, 52(r4)
- stw r25, 56(r4)
- stw r26, 60(r4)
- stw r27, 64(r4)
- stw r28, 68(r4)
- stw r29, 72(r4)
- stw r30, 76(r4)
- stw r31, 80(r4)
Actually thinking about this a bit more, r13-r31 are already defined non-volatile by the C ABI, so you can be sure that the C function you're calling doesn't clobber them. You don't need to manually save/restore them :).
Alex
-- OpenBIOS http://openbios.org/ Mailinglist: http://lists.openbios.org/mailman/listinfo Free your System - May the Forth be with you
[2nd try]
Am 15.10.2010 um 12:26 schrieb Alexander Graf:
On 15.10.2010, at 00:38, Alexander Graf wrote:
/* saving non-volatile registers */
- stw r13, 8(r4)
I would recommend multiplying here:
stw r13, (3 * 4)(r4)
That makes it more readable.
- stw r14, 12(r4)
- stw r15, 16(r4)
- stw r16, 20(r4)
- stw r17, 24(r4)
- stw r18, 28(r4)
- stw r19, 32(r4)
- stw r20, 36(r4)
- stw r21, 40(r4)
- stw r22, 44(r4)
- stw r23, 48(r4)
- stw r24, 52(r4)
- stw r25, 56(r4)
- stw r26, 60(r4)
- stw r27, 64(r4)
- stw r28, 68(r4)
- stw r29, 72(r4)
- stw r30, 76(r4)
- stw r31, 80(r4)
Actually thinking about this a bit more, r13-r31 are already defined non-volatile by the C ABI, so you can be sure that the C function you're calling doesn't clobber them. You don't need to manually save/restore them :).
The OF client interface saves r5-r31 as well as ctr, cr and xer. It shouldn't hurt to save more registers than necessary to assert the requirements of the CHRP spec [1]:
<<< 7.2.2 Register Usage
Requirements:
7–10. Except as required by a specific function, RTAS must not modify the following operating environment registers: TB, DEC, SPRG0-SPRG3, EAR, DABR, SDR1, ASR, SR0-SR15, FPSCR, FPR0-FPR3, and any processor specific registers.
7-11. RTAS must preserve the following user mode registers: R1-R2, R13-R31, and CR.
7–12. RTAS must preserve the following operating environment registers: MSR, DAR, DSISR, IBAT0-IBAT3, and DBAT0-DBAT3.
Andreas
[1] ftp://ftp.software.ibm.com/rs6000/technology/spec/chrp/ hrpa_103.ps.Z (p.94)
On 15.10.2010, at 22:22, Andreas Färber wrote:
[2nd try]
Am 15.10.2010 um 12:26 schrieb Alexander Graf:
On 15.10.2010, at 00:38, Alexander Graf wrote:
/* saving non-volatile registers */
- stw r13, 8(r4)
I would recommend multiplying here:
stw r13, (3 * 4)(r4)
That makes it more readable.
- stw r14, 12(r4)
- stw r15, 16(r4)
- stw r16, 20(r4)
- stw r17, 24(r4)
- stw r18, 28(r4)
- stw r19, 32(r4)
- stw r20, 36(r4)
- stw r21, 40(r4)
- stw r22, 44(r4)
- stw r23, 48(r4)
- stw r24, 52(r4)
- stw r25, 56(r4)
- stw r26, 60(r4)
- stw r27, 64(r4)
- stw r28, 68(r4)
- stw r29, 72(r4)
- stw r30, 76(r4)
- stw r31, 80(r4)
Actually thinking about this a bit more, r13-r31 are already defined non-volatile by the C ABI, so you can be sure that the C function you're calling doesn't clobber them. You don't need to manually save/restore them :).
The OF client interface saves r5-r31 as well as ctr, cr and xer. It shouldn't hurt to save more registers than necessary to assert the requirements of the CHRP spec [1]:
<<< 7.2.2 Register Usage
Requirements:
7–10. Except as required by a specific function, RTAS must not modify the following operating environment registers: TB, DEC, SPRG0-SPRG3, EAR, DABR, SDR1, ASR, SR0-SR15, FPSCR, FPR0-FPR3, and any processor specific registers.
Read: no timers, no paging, no modification of page fault handler registers
7-11. RTAS must preserve the following user mode registers: R1-R2, R13-R31, and CR.
Except for CR, C is the same. So you really only need to save/restore cr :).
7–12. RTAS must preserve the following operating environment registers: MSR, DAR, DSISR, IBAT0-IBAT3, and DBAT0-DBAT3.
Read: You're allowed to change direct mappings, but need to restore them when exiting again. Interesting :).
Alex
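The comparison made above (requirement 7-11 versus what a compiled C function leaves alone) can be written down as two register sets and their difference. This is a sketch under a loudly stated assumption: the callee side is modelled on the 32-bit SysV PowerPC ABI, where r1, r2 and r13-r31 survive a call and only CR fields 2 to 4 are callee-saved. A differently configured toolchain may not match, and all function names here are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Bit n set means GPR n (or CR field n) is in the set. */
static uint32_t reg_range(int lo, int hi)
{
    uint32_t mask = 0;

    assert(lo >= 0 && hi <= 31 && lo <= hi);
    for (int r = lo; r <= hi; r++)
        mask |= UINT32_C(1) << r;
    return mask;
}

/* CHRP requirement 7-11: R1-R2, R13-R31 and CR. */
static uint32_t rtas_preserved_gprs(void) { return reg_range(1, 2) | reg_range(13, 31); }
static uint32_t rtas_preserved_cr(void)   { return reg_range(0, 7); }

/* Assumption: 32-bit SysV PowerPC ABI callee guarantees. */
static uint32_t abi_preserved_gprs(void)  { return reg_range(1, 2) | reg_range(13, 31); }
static uint32_t abi_preserved_cr(void)    { return reg_range(2, 4); }

/* What the assembly glue would still have to save and restore itself. */
static uint32_t gprs_to_save_by_hand(void)
{
    return rtas_preserved_gprs() & ~abi_preserved_gprs();
}

static uint32_t cr_fields_to_save_by_hand(void)
{
    return rtas_preserved_cr() & ~abi_preserved_cr();
}
```

Under that assumption the GPR difference is empty and the CR difference is fields 0, 1 and 5 to 7, which in practice means saving and restoring the whole CR word. r3 and r4 are in neither set, so the glue has to keep its own copy of r4 if it needs the private data pointer after the call.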
Am 15.10.2010 um 22:28 schrieb Alexander Graf:
On 15.10.2010, at 22:22, Andreas Färber wrote:
Am 15.10.2010 um 12:26 schrieb Alexander Graf:
On 15.10.2010, at 00:38, Alexander Graf wrote:
/* saving non-volatile registers */
- stw r13, 8(r4)
I would recommend multiplying here:
stw r13, (3 * 4)(r4)
That makes it more readable.
- stw r14, 12(r4)
- stw r15, 16(r4)
- stw r16, 20(r4)
- stw r17, 24(r4)
- stw r18, 28(r4)
- stw r19, 32(r4)
- stw r20, 36(r4)
- stw r21, 40(r4)
- stw r22, 44(r4)
- stw r23, 48(r4)
- stw r24, 52(r4)
- stw r25, 56(r4)
- stw r26, 60(r4)
- stw r27, 64(r4)
- stw r28, 68(r4)
- stw r29, 72(r4)
- stw r30, 76(r4)
- stw r31, 80(r4)
Actually thinking about this a bit more, r13-r31 are already defined non-volatile by the C ABI, so you can be sure that the C function you're calling doesn't clobber them. You don't need to manually save/restore them :).
The OF client interface saves r5-r31 as well as ctr, cr and xer. It shouldn't hurt to save more registers than necessary to assert the requirements of the CHRP spec [1]:
<<< 7.2.2 Register Usage
Requirements:
7–10. Except as required by a specific function, RTAS must not modify the following operating environment registers: TB, DEC, SPRG0-SPRG3, EAR, DABR, SDR1, ASR, SR0-SR15, FPSCR, FPR0-FPR3, and any processor specific registers.
Read: no timers, no paging, no modification of page fault handler registers
7-11. RTAS must preserve the following user mode registers: R1-R2, R13-R31, and CR.
Except for CR, C is the same. So you really only need to save/restore cr :).
Huh? Doesn't that depend on the ABI used rather than on C? If someone uses a differently configured GCC (or clang or ...), such assumptions might not hold.
What about r3-r4? The SysV ABI ppc supplement has them listed as volatile, so if I want to continue to access the private data area, I still need to save and restore r4, no?
Andreas
7–12. RTAS must preserve the following operating environment registers: MSR, DAR, DSISR, IBAT0-IBAT3, and DBAT0-DBAT3.
Read: You're allowed to change direct mappings, but need to restore them when exiting again. Interesting :).
Alex
On 15.10.2010, at 22:56, Andreas Färber wrote:
Am 15.10.2010 um 22:28 schrieb Alexander Graf:
On 15.10.2010, at 22:22, Andreas Färber wrote:
Am 15.10.2010 um 12:26 schrieb Alexander Graf:
On 15.10.2010, at 00:38, Alexander Graf wrote:
/* saving non-volatile registers */
- stw r13, 8(r4)
I would recommend multiplying here:
stw r13, (3 * 4)(r4)
That makes it more readable.
- stw r14, 12(r4)
- stw r15, 16(r4)
- stw r16, 20(r4)
- stw r17, 24(r4)
- stw r18, 28(r4)
- stw r19, 32(r4)
- stw r20, 36(r4)
- stw r21, 40(r4)
- stw r22, 44(r4)
- stw r23, 48(r4)
- stw r24, 52(r4)
- stw r25, 56(r4)
- stw r26, 60(r4)
- stw r27, 64(r4)
- stw r28, 68(r4)
- stw r29, 72(r4)
- stw r30, 76(r4)
- stw r31, 80(r4)
Actually thinking about this a bit more, r13-r31 are already defined non-volatile by the C ABI, so you can be sure that the C function you're calling doesn't clobber them. You don't need to manually save/restore them :).
The OF client interface saves r5-r31 as well as ctr, cr and xer. It shouldn't hurt to save more registers than necessary to assert the requirements of the CHRP spec [1]:
<<< 7.2.2 Register Usage
Requirements:
7–10. Except as required by a specific function, RTAS must not modify the following operating environment registers: TB, DEC, SPRG0-SPRG3, EAR, DABR, SDR1, ASR, SR0-SR15, FPSCR, FPR0-FPR3, and any processor specific registers.
Read: no timers, no paging, no modification of page fault handler registers
7-11. RTAS must preserve the following user mode registers: R1-R2, R13-R31, and CR.
Except for CR, C is the same. So you really only need to save/restore cr :).
Huh? Doesn't that depend on the ABI used rather than on C? If someone uses a differently configured GCC (or clang or ...), such assumptions might not hold.
I thought we're using the Linux ABI internally?
What about r3-r4? The SysV ABI ppc supplement has them listed as volatile, so if I want to continue to access the private data area, I still need to save and restore r4, no?
r3 and r4 are volatile, yes. But those are not listed in the paragraphs here either :).
Alex
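For reference, the save-area layout implied by the offsets in the quoted patch can be written down as a small C helper. This is only a sketch: the names (rtas_gpr_save_offset, rtas_cr_save_offset, the RTAS_SLOT_* constants) are hypothetical and not part of OpenBIOS. It assumes 32-bit slots, with r1 and r2 in the first two, r13-r31 following, and CR last, as in the stw/lwz offsets shown in the patch.

```c
/* Hypothetical description of the private save area addressed by r4.
 * Slot numbers mirror the offsets used in the quoted patch:
 * r1 at 0, r2 at 4, r13..r31 at 8..80, CR at 84. */
enum { RTAS_WORD = 4 };  /* bytes per saved 32-bit register */

enum {
    RTAS_SLOT_R1  = 0,
    RTAS_SLOT_R2  = 1,
    RTAS_SLOT_R13 = 2,
    RTAS_SLOT_CR  = 21
};

/* Byte offset for GPR n (13..31), or -1 if n is not saved in this area. */
static int rtas_gpr_save_offset(int n)
{
    if (n < 13 || n > 31)
        return -1;
    return (RTAS_SLOT_R13 + (n - 13)) * RTAS_WORD;
}

/* Byte offset of the saved condition register. */
static int rtas_cr_save_offset(void)
{
    return RTAS_SLOT_CR * RTAS_WORD;
}
```

Writing the offsets as slot * 4 is the same idea as the "(3 * 4)(r4)" suggestion: the slot number stays visible instead of a bare byte count.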
Am 16.10.2010 um 10:50 schrieb Alexander Graf:
On 15.10.2010, at 22:56, Andreas Färber wrote:
Am 15.10.2010 um 22:28 schrieb Alexander Graf:
On 15.10.2010, at 22:22, Andreas Färber wrote:
7-11. RTAS must preserve the following user mode registers: R1-R2, R13-R31, and CR.
Except for CR, C is the same. So you really only need to save/restore cr :).
Huh? Doesn't that depend on the ABI used rather than on C? If someone uses a differently configured GCC (or clang or ...), such assumptions might not hold.
I thought we're using the Linux ABI internally?
All I can say for sure is that I'm using --target=powerpc-elf-, which together with powerpc-linux-gnu- and powerpc-eabi- is one of the cross-compilers switch-arch allows for qemu-ppc.
--target=powerpc[64]-linux[-gnu] GCCs 4.4-4.6 don't build for Blue and me, cf. thread "starting". Any hints welcome!
Andreas
What about r3-r4? The SysV ABI ppc supplement has them listed as volatile, so if I want to continue to access the private data area, I still need to save and restore r4, no?
r3 and r4 are volatile, yes. But those are not listed in the paragraphs here either :).
Alex
On Sat, Oct 16, 2010 at 9:26 AM, Andreas Färber andreas.faerber@web.de wrote:
Am 16.10.2010 um 10:50 schrieb Alexander Graf:
On 15.10.2010, at 22:56, Andreas Färber wrote:
Am 15.10.2010 um 22:28 schrieb Alexander Graf:
On 15.10.2010, at 22:22, Andreas Färber wrote:
7-11. RTAS must preserve the following user mode registers: R1-R2, R13-R31, and CR.
Except for CR, C is the same. So you really only need to save/restore cr :).
Huh? Doesn't that depend on the ABI used rather than on C? If someone uses a differently configured GCC (or clang or ...), such assumptions might not hold.
I thought we're using the Linux ABI internally?
All I can say for sure is that I'm using --target=powerpc-elf-, which together with powerpc-linux-gnu- and powerpc-eabi- is one of the cross-compilers switch-arch allows for qemu-ppc.
--target=powerpc[64]-linux[-gnu] GCCs 4.4-4.6 don't build for Blue and me, cf. thread "starting". Any hints welcome!
I didn't try powerpc-linux (since powerpc-elf-gcc works) but powerpc64-linux (which was the one that didn't build along with powerpc64-elf).
$ powerpc-elf-gcc -v
Using built-in specs.
COLLECT_GCC=powerpc-elf-gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/powerpc-elf/4.6.0/lto-wrapper
Target: powerpc-elf
Configured with: ../configure --target=powerpc-elf --enable-targets=powerpc-elf --disable-nls --disable-threads --enable-languages=c --disable-shared --disable-libssp --disable-multilib
Thread model: single
gcc version 4.6.0 20100925 (experimental) (GCC)
Am 15.10.2010 um 00:38 schrieb Alexander Graf:
Am 15.10.2010 um 00:17 schrieb Andreas Färber andreas.faerber@web.de:
diff --git a/arch/ppc/qemu/init.c b/arch/ppc/qemu/init.c index 2b0b891..6d72386 100644 --- a/arch/ppc/qemu/init.c +++ b/arch/ppc/qemu/init.c
@@ -745,10 +749,11 @@ arch_of_init( void ) printk("Warning: No /rtas node\n"); else { unsigned long size = 0x1000;
while( size < (unsigned long)of_rtas_end - (unsigned
long)of_rtas_start )
while ( size < rtas_size ) size *= 2; set_property( ph, "rtas-size", (char*)&size, sizeof(size) ); set_int_property(ph, "rtas-version", 1);
Didn't you just set this to 0x41?
No. My Mac has it as 0x41 (or 41?). The JS20 tree and the spec has it as 1.
Locally I'm using ARCH_CHRP_U3 for differentiation, setting up the /rtas node for chrp only, for instance. The machine enum needs to be synced with QEMU though. Would a patch adding just the enum member (i.e., reserving the value 4) have any chance of being applied without an accompanying fully-complete-and-reviewed machine? My chrp machine is just a stripped-down version of mac99 for now; some memory locations would need to change once we have OpenBIOS/ppc64 compiling, and some devices like mac-io would need to be exchanged. Blue?
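The rtas-size loop in the init.c hunk quoted above starts at 0x1000 and doubles until the RTAS blob fits. A minimal standalone sketch of that rounding follows; the function name rtas_region_size is made up for illustration, and only the 0x1000 start value and the doubling come from the patch.

```c
/* Hypothetical helper mirroring the loop in arch_of_init():
 * returns the smallest value of the form 0x1000 * 2^k that is
 * greater than or equal to code_size. */
static unsigned long rtas_region_size(unsigned long code_size)
{
    unsigned long size = 0x1000;

    while (size < code_size)
        size *= 2;
    return size;
}
```

With this, a blob of 0x1001 bytes is reported as 0x2000, and anything up to 0x1000 bytes as 0x1000.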
diff --git a/arch/ppc/qemu/rtas-tokens.c b/arch/ppc/qemu/rtas-tokens.c
new file mode 100644
index 0000000..f251716
--- /dev/null
+++ b/arch/ppc/qemu/rtas-tokens.c
@@ -0,0 +1,63 @@
+/*
- Copyright (c) 2010 Andreas Färber andreas.faerber@web.de
- */
This is missing a copyleft license.
Yeah, and an SoB. Not yet ready for committing.
+#define RTAS_MAX_ARGS 10
+typedef struct rtas_args {
- unsigned long token;
- long nargs;
- long nret;
- unsigned long args[RTAS_MAX_ARGS];
+} rtas_args_t;
+void rtas_interface(rtas_args_t*, void*);
+/* drivers/escc.h */ +#define IO_ESCC_OFFSET 0x00013000 +/* drivers/escc.c */ +#define CTRL(addr) (*(volatile unsigned char *)(addr)) +#define DATA(addr) (*(volatile unsigned char *)(addr + 16)) +#define Tx_BUF_EMP 0x4 /* Tx Buffer empty */
+/*static void uart_putchar(int port, unsigned char c) +{
- while (!(CTRL(port) & Tx_BUF_EMP))
;
- DATA(port) = c;
+}*/
+/*void serial_putchar(char);*/
+static void serial_putchar(char c) +{
- unsigned long addr = 0x80800000;
Phew - how is this done for the normal escc case?
In OpenBIOS, the address is passed in from PCI code.
In /rtas, there's a "display-device" property that specifies the phandle to be used for display-character. We would have to read this and pass any addresses to RTAS as part of instantiate-rtas.
We should at least share the constants here.
As documented above, most #defines were in the source file and would need to be moved to a header file for sharing.
Here, I just wanted a quick way to trace whether my code was being called. OpenBIOS' escc code is very OF-specific, and the JS20 doesn't have a mac-io/cuda/escc device anyway. We might be able to split some core I/O functions like these off.
+GLOBL(_entry):
- /*
* r3 = arguments
* r4 = private memory
*/
- stw r1, 0(r4)
- stw r2, 4(r4)
- stwu r1, -12(r1)
Is the os guaranteed to give you r4 and r1 for free scribbling over?
r4 is a pointer to the memory it claimed for us. My reservation of 0x1000 initial bytes in rtas_initialize() leaves some space for saving state and passing in the shared struct you suggested.
- stw r4, 8(r1)
- mflr r0
- stw r0, 4(r1)
/* saving non-volatile registers */
- stw r13, 8(r4)
I would recommend multiplying here:
stw r13, (3 * 4)(r4)
That makes it more readable.
I actually adapted this from start.S for the client interface.
stwu x, 4(y) on a volatile register would be even better imo. It was a nightly proof of concept! ;)
Question: The CIF does irritating things with r1 and r4 and saved_stack - why? Would it be better to save everything on the stack here?
- bl rtas_interface
So rtas_interface gets the os given r3 as first parameter. Please make sure that struct is packed then.
Good catch.
- lwz r0, 4(r1)
- mtlr r0
- lwz r4, 8(r1)
+// lwz r1, 0(r1)
- lwz r2, 84(r4)
- mtcr r2
If the abi is similar to the normal C one, cr is volatile.
It definitely isn't. I believe the CIF preserves it, too.
Andreas
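On the "make sure that struct is packed" point: one way to pin the layout down is to use fixed-width fields and let the compiler verify the offsets. The sketch below assumes that the RTAS argument buffer consists of 32-bit cells and that a GCC-compatible compiler is used (for __attribute__((packed)) and C11 _Static_assert); the patch itself uses long and unsigned long, so treat the field types here as an assumption rather than as the posted code.

```c
#include <stddef.h>
#include <stdint.h>

#define RTAS_MAX_ARGS 10

/* Assumed layout: three 32-bit header cells followed by the
 * argument/return cells, with no padding anywhere. */
typedef struct rtas_args {
    uint32_t token;
    uint32_t nargs;
    uint32_t nret;
    uint32_t args[RTAS_MAX_ARGS];
} __attribute__((packed)) rtas_args_t;

/* Fail the build if the compiler lays the struct out differently. */
_Static_assert(offsetof(rtas_args_t, args) == 12,
               "args must start right after the three header cells");
_Static_assert(sizeof(rtas_args_t) == 12 + 4 * RTAS_MAX_ARGS,
               "rtas_args_t must not contain padding");
```

The static assertions cost nothing at run time and would also catch a 64-bit build where unsigned long silently doubles in size.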
Am 15.10.2010 um 00:17 schrieb Andreas Färber:
diff --git a/arch/ppc/qemu/rtas-tokens.c b/arch/ppc/qemu/rtas-tokens.c
new file mode 100644
index 0000000..f251716
--- /dev/null
+++ b/arch/ppc/qemu/rtas-tokens.c
@@ -0,0 +1,63 @@
+/*
- Copyright (c) 2010 Andreas Färber andreas.faerber@web.de
- */
+#define RTAS_MAX_ARGS 10
+typedef struct rtas_args {
- unsigned long token;
- long nargs;
- long nret;
- unsigned long args[RTAS_MAX_ARGS];
+} rtas_args_t;
+void rtas_interface(rtas_args_t*, void*);
+/* drivers/escc.h */ +#define IO_ESCC_OFFSET 0x00013000 +/* drivers/escc.c */ +#define CTRL(addr) (*(volatile unsigned char *)(addr)) +#define DATA(addr) (*(volatile unsigned char *)(addr + 16)) +#define Tx_BUF_EMP 0x4 /* Tx Buffer empty */
+/*static void uart_putchar(int port, unsigned char c) +{
- while (!(CTRL(port) & Tx_BUF_EMP))
;
- DATA(port) = c;
+}*/
+/*void serial_putchar(char);*/
+static void serial_putchar(char c) +{
- unsigned long addr = 0x80800000;
- volatile unsigned char *serial_dev = (unsigned char *)addr +
IO_ESCC_OFFSET + 0x20;
- //uart_putchar((int)serial_dev, c);
- volatile unsigned char * port = serial_dev;
- while (!(CTRL(port) & Tx_BUF_EMP))
;
- DATA(port) = c;
+}
If I add a new method void dprintk(const char* s) here, I get this:
  CC    target/arch/ppc/qemu/rtas-tokens.o
  LINK  rtas-qemu.bin
target/arch/ppc/qemu/rtas-tokens.o: In function `rtas_interface':
/Users/andreas/QEMU/OpenBIOS/openbios/obj-ppc/../arch/ppc/qemu/rtas-tokens.c:51: undefined reference to `_GLOBAL_OFFSET_TABLE_'
make: *** [rtas-qemu.bin] Error 1
Any ideas? Isn't the GOT some ELF concept?
$ powerpc-elf-gcc -v
Using built-in specs.
COLLECT_GCC=powerpc-elf-gcc
COLLECT_LTO_WRAPPER=/Users/andreas/QEMU/OpenBIOS/bin/libexec/gcc/powerpc-elf/4.5.1/lto-wrapper
Target: powerpc-elf
Configured with: ../gcc-4.5.1/configure --prefix=/Users/andreas/QEMU/OpenBIOS/bin --target=powerpc-elf --disable-nls --disable-threads --enable-languages=c --disable-shared --disable-libssp --with-gmp=/Users/andreas/QEMU/OpenBIOS/bin --with-mpfr=/Users/andreas/QEMU/OpenBIOS/bin
Thread model: single
gcc version 4.5.1 (GCC)
Andreas
+enum {
- DISPLAY_CHARACTER = 1,
+};
+void rtas_interface(rtas_args_t* params, void* privateData) +{
- switch (params->token) {
case DISPLAY_CHARACTER: {
serial_putchar((char)params->args[0]);
serial_putchar('x');
params->args[params->nargs] = 0;
break;
}
default:
serial_putchar('.');
params->args[params->nargs] = -1;
break;
- }
- serial_putchar('\r');
- serial_putchar('\n');
+}
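The dispatch pattern in rtas_interface() above (switch on the token, then write a status into the cell right after the input arguments) can be tried on a host with a stand-in for the serial output. Everything prefixed demo_ below is hypothetical: demo_capture replaces serial_putchar so the sketch runs without hardware, the 32-bit cell type is an assumption, and the bounds checks are an addition that the posted patch does not have.

```c
#include <stdint.h>

#define RTAS_MAX_ARGS 10

typedef struct {
    uint32_t token;
    uint32_t nargs;
    uint32_t nret;
    uint32_t args[RTAS_MAX_ARGS];
} demo_rtas_args_t;

enum { DEMO_DISPLAY_CHARACTER = 1 };

typedef void (*demo_putc_fn)(char c);

/* Host-side stand-in for serial_putchar(): remembers the last character. */
static char demo_last_char;
static void demo_capture(char c)
{
    demo_last_char = c;
}

/* Handle one call; the status goes into args[nargs], as in the patch:
 * 0 on success, -1 for an unknown token or a missing argument. */
static void demo_rtas_dispatch(demo_rtas_args_t *p, demo_putc_fn out)
{
    uint32_t status = (uint32_t)-1;

    if (p->nargs >= RTAS_MAX_ARGS)
        return; /* no room for the status cell; added safety check */

    switch (p->token) {
    case DEMO_DISPLAY_CHARACTER:
        if (p->nargs >= 1) {
            out((char)p->args[0]);
            status = 0;
        }
        break;
    default:
        break;
    }
    p->args[p->nargs] = status;
}
```

Passing the output routine as a function pointer is only a convenience for testing; in firmware the call would go straight to whatever device the "display-device" property selects.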
Am 08.10.2010 um 15:30 schrieb Alexander Graf:
You can use the apple gdb without an object file, so you don't get symbols. But if you have an instruction pointer, just
$ qemu-system-ppc -s -S ...
(gdb) target remote localhost:1234
(gdb) b *0x1234 <- address of rtas_something
(gdb) c
It should break on that IP and then you can evaluate the register contents at least. Either by
(gdb) info registers
or
(qemu) info registers
I'm trying to find out how far we get with the ppc64 OpenBIOS, so I've tried the following:
$ .../ppc64-softmmu/qemu-system-ppc64 ... -nographic -prom-env 'auto-boot?=false' -s -S
$ gdb --arch=ppc64
GNU gdb 6.3.50-20050815 (Apple version gdb-967) (Tue Jul 14 02:15:14 UTC 2009)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "powerpc-apple-darwin".
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
[New thread 1]
0x0000000000000000 in ?? ()
(gdb) b *0xfffffffc
Breakpoint 1 at 0xfffffffc
(gdb) c
Continuing.
It doesn't break though and executes to the OpenBIOS prompt. 0xfffffffc is supposed to be the hard reset vector, i.e. the very first instruction it must execute to branch to _entry.
Any suggestions?
Andreas
On 31.10.2010, at 16:33, Andreas Färber wrote:
Am 08.10.2010 um 15:30 schrieb Alexander Graf:
You can use the apple gdb without an object file, so you don't get symbols. But if you have an instruction pointer, just
$ qemu-system-ppc -s -S ... (gdb) target remote localhost:1234 (gdb) b *0x1234 <- address of rtas_something (gdb) c
It should break on that IP and then you can evaluate the register contents at least. Either by
(gdb) info registers
or
(qemu) info registers
I'm trying to find out how far we get with the ppc64 OpenBIOS, so I've tried the following:
$ .../ppc64-softmmu/qemu-system-ppc64 ... -nographic -prom-env 'auto-boot?=false' -s -S
$ gdb --arch=ppc64 GNU gdb 6.3.50-20050815 (Apple version gdb-967) (Tue Jul 14 02:15:14 UTC 2009) Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "powerpc-apple-darwin". (gdb) target remote localhost:1234 Remote debugging using localhost:1234 [New thread 1] 0x0000000000000000 in ?? () (gdb) b *0xfffffffc Breakpoint 1 at 0xfffffffc (gdb) c Continuing.
It doesn't break though and executes to the OpenBIOS prompt. 0xfffffffc is supposed to be the hard reset vector, i.e. the very first instruction it must execute to branch to _entry.
Uuuh IIRC there's a register that's set on RESET which defines an offset to take when in real mode code or so. Please check the cpu init code for 970, it should tell you :)
Alex
Am 01.11.2010 um 00:53 schrieb Alexander Graf:
On 31.10.2010, at 16:33, Andreas Färber wrote:
Am 08.10.2010 um 15:30 schrieb Alexander Graf:
You can use the apple gdb without an object file, so you don't get symbols. But if you have an instruction pointer, just
$ qemu-system-ppc -s -S ... (gdb) target remote localhost:1234 (gdb) b *0x1234 <- address of rtas_something (gdb) c
It should break on that IP and then you can evaluate the register contents at least. Either by
(gdb) info registers
or
(qemu) info registers
I'm trying to find out how far we get with the ppc64 OpenBIOS, so I've tried the following:
$ .../ppc64-softmmu/qemu-system-ppc64 ... -nographic -prom-env 'auto-boot?=false' -s -S
$ gdb --arch=ppc64 GNU gdb 6.3.50-20050815 (Apple version gdb-967) (Tue Jul 14 02:15:14 UTC 2009) Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "powerpc-apple-darwin". (gdb) target remote localhost:1234 Remote debugging using localhost:1234 [New thread 1] 0x0000000000000000 in ?? () (gdb) b *0xfffffffc Breakpoint 1 at 0xfffffffc (gdb) c Continuing.
It doesn't break though and executes to the OpenBIOS prompt. 0xfffffffc is supposed to be the hard reset vector, i.e. the very first instruction it must execute to branch to _entry.
Uuuh IIRC there's a register that's set on RESET which defines an offset to take when in real mode code or so. Please check the cpu init code for 970, it should tell you :)
Hm, not sure what init code that would be...
Found that the following works slightly better:
(gdb) x/i 0xfffffffc
0xfffffffc: bl 0xfff02378
(gdb) b *0xfff02378
Breakpoint 1 at 0xfff02378
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x0000000000000000 in ?? ()
(gdb)
So it seems it's getting there. :) All registers including pc are zero though according to info registers, x/10i $pc just shows .long 0x0, and stepi, step, next all don't seem to work. (What worked was disable 1 followed by cont.) Is this something with the host gdb or is something wrong in QEMU, or is the problem in front of the keyboard?
Andreas
On 31.10.2010, at 17:19, Andreas Färber wrote:
Am 01.11.2010 um 00:53 schrieb Alexander Graf:
On 31.10.2010, at 16:33, Andreas Färber wrote:
Am 08.10.2010 um 15:30 schrieb Alexander Graf:
You can use the apple gdb without an object file, so you don't get symbols. But if you have an instruction pointer, just
$ qemu-system-ppc -s -S ... (gdb) target remote localhost:1234 (gdb) b *0x1234 <- address of rtas_something (gdb) c
It should break on that IP and then you can evaluate the register contents at least. Either by
(gdb) info registers
or
(qemu) info registers
I'm trying to find out how far we get with the ppc64 OpenBIOS, so I've tried the following:
$ .../ppc64-softmmu/qemu-system-ppc64 ... -nographic -prom-env 'auto-boot?=false' -s -S
$ gdb --arch=ppc64 GNU gdb 6.3.50-20050815 (Apple version gdb-967) (Tue Jul 14 02:15:14 UTC 2009) Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "powerpc-apple-darwin". (gdb) target remote localhost:1234 Remote debugging using localhost:1234 [New thread 1] 0x0000000000000000 in ?? () (gdb) b *0xfffffffc Breakpoint 1 at 0xfffffffc (gdb) c Continuing.
It doesn't break though and executes to the OpenBIOS prompt. 0xfffffffc is supposed to be the hard reset vector, i.e. the very first instruction it must execute to branch to _entry.
Uuuh IIRC there's a register that's set on RESET which defines an offset to take when in real mode code or so. Please check the cpu init code for 970, it should tell you :)
Hm, not sure what init code that would be...
That's the one I meant:
env->hreset_excp_prefix = 0x00000000FFF00000ULL;
/* Hardware reset vector */
env->hreset_vector = 0x0000000000000100ULL;
So it starts at 0x100, not -4 :). Not sure if it actually starts at 0xfff00100, I'd have to double-check all the code paths.
Found that the following works slightly better:
(gdb) x/i 0xfffffffc 0xfffffffc: bl 0xfff02378 (gdb) b *0xfff02378 Breakpoint 1 at 0xfff02378 (gdb) c Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap. 0x0000000000000000 in ?? () (gdb)
So it seems it's getting there. :) All registers including pc are zero though according to info registers, x/10i $pc just shows .long 0x0, and stepi, step, next all don't seem to work. (What worked was disable 1 followed by cont.) Is this something with the host gdb or is something wrong in QEMU, or is the problem in front of the keyboard?
What does info registers in the qemu monitor say?
Alex
Am 01.11.2010 um 01:25 schrieb Alexander Graf:
On 31.10.2010, at 17:19, Andreas Färber wrote:
Am 01.11.2010 um 00:53 schrieb Alexander Graf:
On 31.10.2010, at 16:33, Andreas Färber wrote:
Am 08.10.2010 um 15:30 schrieb Alexander Graf:
You can use the apple gdb without an object file, so you don't get symbols. But if you have an instruction pointer, just
$ qemu-system-ppc -s -S ... (gdb) target remote localhost:1234 (gdb) b *0x1234 <- address of rtas_something (gdb) c
It should break on that IP and then you can evaluate the register contents at least. Either by
(gdb) info registers
or
(qemu) info registers
I'm trying to find out how far we get with the ppc64 OpenBIOS, so I've tried the following:
$ .../ppc64-softmmu/qemu-system-ppc64 ... -nographic -prom-env 'auto-boot?=false' -s -S
$ gdb --arch=ppc64 GNU gdb 6.3.50-20050815 (Apple version gdb-967) (Tue Jul 14 02:15:14 UTC 2009) Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "powerpc-apple-darwin". (gdb) target remote localhost:1234 Remote debugging using localhost:1234 [New thread 1] 0x0000000000000000 in ?? () (gdb) b *0xfffffffc Breakpoint 1 at 0xfffffffc (gdb) c Continuing.
It doesn't break though and executes to the OpenBIOS prompt. 0xfffffffc is supposed to be the hard reset vector, i.e. the very first instruction it must execute to branch to _entry.
Uuuh IIRC there's a register that's set on RESET which defines an offset to take when in real mode code or so. Please check the cpu init code for 970, it should tell you :)
Hm, not sure what init code that would be...
That's the one I meant:
env->hreset_excp_prefix = 0x00000000FFF00000ULL; /* Hardware reset vector */ env->hreset_vector = 0x0000000000000100ULL;
So it starts at 0x100, not -4 :). Not sure if it's start at 0xfff00100,
Bingo. b *0xfff00100 makes it stop at the breakpoint.
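The arithmetic behind that breakpoint address, as a tiny sketch: it assumes the effective reset address is simply the exception prefix combined with the reset vector offset. That matches the 0xfff00100 observed here, but it is an assumption about how QEMU uses the two fields, not something checked against every code path. The function name reset_entry is made up.

```c
#include <stdint.h>

/* Hypothetical: combine the exception prefix with the vector offset.
 * The two values do not overlap in their set bits, so OR and addition
 * give the same result here. */
static uint64_t reset_entry(uint64_t excp_prefix, uint64_t vector)
{
    return excp_prefix | vector;
}
```

With the values quoted from the 970 init code, reset_entry(0x00000000FFF00000ULL, 0x100ULL) gives 0xfff00100, the address that gdb finally stopped at.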
All registers including pc are zero though according to info registers, x/10i $pc just shows .long 0x0, and stepi, step, next all don't seem to work. (What worked was disable 1 followed by cont.) Is this something with the host gdb or is something wrong in QEMU, or is the problem in front of the keyboard?
What does info registers in the qemu monitor say?
Looking better:
(qemu) info registers NIP 00000000fff00100 LR 0000000000000000 CTR 0000000000000000 XER 0000000000000000 MSR 8000000000000000 HID0 0000000060000000 HF 8000000000000000 idx 1 TB 00000000 00970400 DECR 4293996894 GPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000 CR 00000000 [ - - - - - - - - ] RES ffffffffffffffff FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPSCR 00000000 SRR0 0000000000000000 SRR1 0000000000000000 SDR1 0000000000000000 (qemu)
But:
(gdb) stepi
0x0000000000000000 in ?? ()
(gdb)
(qemu) info registers NIP 00000000fff00100 LR 0000000000000000 CTR 0000000000000000 XER 0000000000000000 MSR 8000000000000000 HID0 0000000060000000 HF 8000000000000000 idx 1 TB 00000000 00876800 DECR 4294090494 GPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000 CR 00000000 [ - - - - - - - - ] RES ffffffffffffffff FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000 FPSCR 00000000 SRR0 0000000000000000 SRR1 0000000000000000 SDR1 0000000000000000 (qemu)
TB and DECR changed but NIP is still at 0xfff00100 and not at 0xfff02378 (0xfff00100: b 0xfff02378).
Same for powerpc64-linux-gnu-gdb 7.2, except that file .../obj-ppc/openbios-qemu.elf.nostrip works there.
It seems it's the combination of ppc64-softmmu and 32-bit obj-ppc/openbios-qemu.elf that doesn't work. For obj-ppc64/openbios-qemu.elf stepi and cont work just fine!
=> We arrive in compute_ramsize; my hunch is that the bl setup_mmu needs to be changed to dereference the function descriptor on ppc64.
Andreas
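For readers unfamiliar with the function descriptor mentioned above: in the 64-bit PowerPC ELF ABI (version 1), a function symbol refers to a three-doubleword descriptor rather than to the code itself, so a plain branch to the symbol lands on data. A C view of that descriptor is sketched below. The type and function names are hypothetical, and whether the toolchain used for the ppc64 build actually emits descriptors for setup_mmu is exactly the open hunch, not an established fact.

```c
#include <stdint.h>

/* Function descriptor as laid out by the 64-bit PowerPC ELF ABI (v1). */
typedef struct {
    uint64_t entry; /* address of the first instruction */
    uint64_t toc;   /* TOC base to load into r2 before the call */
    uint64_t env;   /* environment pointer, unused by C */
} ppc64_func_desc_t;

/* "Dereferencing the descriptor" means branching to desc->entry
 * (after loading desc->toc into r2), not to the descriptor address. */
static uint64_t ppc64_entry_point(const ppc64_func_desc_t *desc)
{
    return desc->entry;
}
```

In assembly this corresponds to loading the first doubleword of the descriptor into ctr (and the second into r2) before a bctrl, instead of a direct bl to the symbol.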
Andreas Färber wrote:
I'm trying to find out how far we get with the ppc64 OpenBIOS, so I've tried the following:
$ .../ppc64-softmmu/qemu-system-ppc64 ... -nographic -prom-env 'auto-boot?=false' -s -S
$ gdb --arch=ppc64 GNU gdb 6.3.50-20050815 (Apple version gdb-967) (Tue Jul 14 02:15:14 UTC 2009) Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "powerpc-apple-darwin". (gdb) target remote localhost:1234 Remote debugging using localhost:1234 [New thread 1] 0x0000000000000000 in ?? () (gdb) b *0xfffffffc Breakpoint 1 at 0xfffffffc (gdb) c Continuing.
It doesn't break though and executes to the OpenBIOS prompt. 0xfffffffc is supposed to be the hard reset vector, i.e. the very first instruction it must execute to branch to _entry.
Any suggestions?
FWIW I had a problem with SPARC64 crashing gdb when I tried to step through the initialisation, and in the end I found out that everything worked once the MMU had been initialised. So I ended up putting a new label into entry.S just after the MMU had been enabled and breaking there, at which point everything seemed to behave.
HTH,
Mark.
Am 01.11.2010 um 12:01 schrieb Mark Cave-Ayland:
FWIW I had a problem with SPARC64 crashing gdb when I tried to step through the initialisation, and in the end I found out that everything worked once the MMU had been initialised. So I ended up putting a new label into entry.S just after the MMU had been enabled and breaking there, at which point everything seemed to behave.
Thanks for the suggestion. There's been a certain lag in mail delivery...
Latest state with local patches is that hell breaks loose once the MMU is set up. I get a 0x400 (ISI) exception and when the bctrl to isi_exception() is executed, we end up at trap_error, where it branches to unexpected_excep() and tries to printk() to the serial port that's not yet set up. I'll put a few patches together.
Andreas
Am 01.11.2010 um 17:36 schrieb Andreas Färber:
Latest state with local patches is that hell breaks loose once the MMU is set up. I get a 0x400 (ISI) exception and when the bctrl to isi_exception() is executed, we end up at trap_error, where it branches to unexpected_excep() and tries to printk() to the serial port that's not yet set up. I'll put a few patches together.
Since r945 everything except for the trampoline issue should be in SVN.
I've made no more progress throughout the week though:
Directly after we set the MSR_IR|MSR_DR bits in the MSR (arch/ppc/qemu/ofmem.c:setup_mmu), we get an ISI exception and end up in arch/ppc/qemu/start.S:vector__0x400 (the 0xfffxxxxx one). We proceed up to the bctrl which should take us to arch/ppc/qemu/ofmem.c:isi_exception, but then get a 0x700 program exception. The value in ctr looks sensible, it's some 0xfffxxxxx address.
i) I read that mtsrin is not allowed in 64-bit mode and that its results are unpredictable, so I tried switching MSR_SF off before the loop and back on after it, without luck.
ii) If I exit the setup_mmu() function without turning the MMU on, we proceed to arch/ppc/qemu/init.c:entry() but are unsuccessful reading the magic fw_cfg signature. Stepping through the code, it seemed as if some variable assignments like those in drivers/fw_cfg.c:fw_cfg_init() were having no effect. Could that be due to OpenBIOS code execution happening in ROM rather than ea_to_phys()-mapped to RAM? (i.e., write-only storage?:)) Or would this be some memory caching issue for the fw_cfg ports?
iii) Before turning on the MMU, I tried implementing the early mapping of pages by calling hash_page() from ofmem_arch_early_map_pages() and calling ofmem_map() for the ROM-to-RAM translation and for identity-mapping the code. This leads to a hang in libopenbios/ofmem_common.c:ofmem_update_memory_available() in a code path (a printk in ofmem_realloc()) that would normally only be taken if libopenbios/ofmem_common.c:s_phandle_memory were non-zero, at a point where it should still be zero.
Any clue why ppc works but ppc64 doesn't?
Thanks, Andreas
Andreas Färber wrote:
Since r945 everything except for the trampoline issue should be in SVN.
Cool! :)
I've made no more progress throughout the week though:
Directly after we set the MSR_IR|MSR_DR bits in the MSR (arch/ppc/qemu/ofmem.c:setup_mmu), we get an ISI exception and end up in arch/ppc/qemu/start.S:vector__0x400 (the 0xfffxxxxx one). We proceed up to the bctrl which should take us to arch/ppc/qemu/ofmem.c:isi_exception, but then get a 0x700 program exception. The value in ctr looks sensible, it's some 0xfffxxxxx address.
Hmmm this sounds similar to a SPARC32 issue I was finding over the weekend whereby everything died after the MMU was enabled because the context table wasn't correctly aligned. Could it be possible that the MMU hash tables aren't aligned correctly in memory?
ATB,
Mark.
On Mon, Nov 8, 2010 at 9:34 AM, Mark Cave-Ayland mark.cave-ayland@siriusit.co.uk wrote:
Andreas Färber wrote:
Since r945 everything except for the trampoline issue should be in SVN.
Cool! :)
I've made no more progress throughout the week though:
Directly after we set the MSR_IR|MSR_DR bits in the MSR (arch/ppc/qemu/ofmem.c:setup_mmu), we get an ISI exception and end up in arch/ppc/qemu/start.S:vector__0x400 (the 0xfffxxxxx one). We proceed up to the bctrl which should take us to arch/ppc/qemu/ofmem.c:isi_exception, but then get a 0x700 program exception. The value in ctr looks sensible, it's some 0xfffxxxxx address.
Hmmm this sounds similar to a SPARC32 issue I was finding over the weekend whereby everything died after the MMU was enabled because the context table wasn't correctly aligned. Could it be possible that the MMU hash tables aren't aligned correctly in memory?
Also the alignment may concern physical, not virtual addresses.
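A quick way to rule the alignment theory in or out is to check the physical base of the hash table against its size. The sketch below assumes the usual PowerPC requirement that the table size is a power of two and that the table is aligned to its own size; the exact minimum size differs between 32-bit and 64-bit implementations, so please check the manual for the CPU in question. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

static bool is_power_of_two(uint64_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* True if a hash table of `size` bytes may start at physical address
 * `phys_base`, under the assumption stated above (natural alignment). */
static bool htab_base_ok(uint64_t phys_base, uint64_t size)
{
    return is_power_of_two(size) && (phys_base & (size - 1)) == 0;
}
```

As noted in the reply above, the check has to be made on the physical address that ends up in SDR1, not on the virtual address the C code sees.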
Am 08.11.2010 um 10:34 schrieb Mark Cave-Ayland:
Andreas Färber wrote:
Directly after we set the MSR_IR|MSR_DR bits in the MSR (arch/ppc/qemu/ofmem.c:setup_mmu), we get an ISI exception and end up in arch/ppc/qemu/start.S:vector__0x400 (the 0xfffxxxxx one). We proceed up to the bctrl which should take us to arch/ppc/qemu/ofmem.c:isi_exception, but then get a 0x700 program exception. The value in ctr looks sensible, it's some 0xfffxxxxx address.
Hmmm this sounds similar to a SPARC32 issue I was finding over the weekend whereby everything died after the MMU was enabled because the context table wasn't correctly aligned. Could it be possible that the MMU hash tables aren't aligned correctly in memory?
Recently I fixed some memory layout calculations and we decided to use the ppc64 alignment even on ppc, so we should be okay. 32-bit ppc code disables the 64-bit mode right away and in every interrupt handler; disabling it in the 64-bit ppc64 interrupt handler gave no improvement though.
I had tried packing the structs used for the page table, without noticeable effect:
http://repo.or.cz/w/openbios/afaerber.git/commitdiff/3071a73e8c44779f7bdddcb...
Not sure if that is necessary? It would seem safer to me.
Andreas
Am 08.11.2010 um 10:34 schrieb Mark Cave-Ayland:
Andreas Färber wrote:
Since r945 everything except for the trampoline issue should be in SVN.
Cool! :)
Ah, no! Forgot about the exception handlers (preamble, epilogue). :( I have local patches that #ifdef the whole definitions and provide alternative ones for ppc64. But I'd rather supply a 2-step patch to drop the template macros and use #ifdef'ery inside the macro definition for ppc64/ppc differences, given that long-term we won't need the second version.
Andreas
On 08.11.2010, at 22:48, Andreas Färber wrote:
On 08.11.2010, at 10:34, Mark Cave-Ayland wrote:
Andreas Färber wrote:
Since r945 everything except for the trampoline issue should be in SVN.
Cool! :)
Ah, no! Forgot about the exception handlers (preamble, epilogue). :( I have local patches that #ifdef the whole definitions and provide alternative ones for ppc64. But I'd rather supply a 2-step patch to drop the template macros and use #ifdef'ery inside the macro definition for ppc64/ppc differences, given that long-term we won't need the second version.
Are you sure you need that? Most of the stuff should be generic. Please take a look at Linux's PPC_LL macro for example. The stack layout varies slightly, but that should also be reasonably easy to catch.
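A minimal sketch of the idea behind a PPC_LL-style helper, assuming it works the way the suggestion describes: the preprocessor picks the register-width-dependent mnemonic and slot size once, and the handler body only uses the generic names. The STR helpers and the two functions are made up for illustration and are not OpenBIOS definitions; the macros are wrapped in C only so that the expansion can be inspected on any host.

```c
/* Sketch only: pick register-width-dependent pieces once, up front. */
#ifdef __powerpc64__
#define PPC_LL     ld   /* load one register-sized value */
#define PPC_STL    std  /* store one register-sized value */
#define ULONG_SIZE 8
#else
#define PPC_LL     lwz
#define PPC_STL    stw
#define ULONG_SIZE 4
#endif

/* Two-level stringification so macros are expanded before being quoted. */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Text of "store r3 into slot 3 of the frame at r1" for the current target. */
const char *store_r3_text(void)
{
    return STR(PPC_STL) " r3, (" STR(3 * ULONG_SIZE) ")(r1)";
}

/* Matching load, using the same slot arithmetic. */
const char *load_r3_text(void)
{
    return STR(PPC_LL) " r3, (" STR(3 * ULONG_SIZE) ")(r1)";
}
```

The point of the design is that nothing below the #ifdef block needs to know which word size was chosen.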
Alex
On 08.11.2010, at 23:03, Alexander Graf wrote:
On 08.11.2010, at 22:48, Andreas Färber wrote:
On 08.11.2010, at 10:34, Mark Cave-Ayland wrote:
Andreas Färber wrote:
Since r945 everything except for the trampoline issue should be in SVN.
Cool! :)
Ah, no! Forgot about the exception handlers (preamble, epilogue). :( I have local patches that #ifdef the whole definitions and provide alternative ones for ppc64. But I'd rather supply a 2-step patch to drop the template macros and use #ifdef'ery inside the macro definition for ppc64/ppc differences, given that long-term we won't need the second version.
Are you sure you need that? Most of the stuff should be generic. Please take a look at Linux's PPC_LL macro for example. The stack layout varies slightly, but that should also be reasonably easy to catch.
Problematic parts are the ULONG_SIZE comparisons for the number of registers that we can't #ifdef away inside a CPP macro, plus the m*msr and the trailing/leading stack manipulation (16 in SVR4 vs. 48 in PowerOpen ABI). Also rfid, both here and for the FPU vector.
Note that the other ppc targets don't use assembler macros, just preprocessor macros, so we have kind of a double-macro situation. ;)
Andreas
On 08.11.2010, at 23:18, Andreas Färber wrote:
On 08.11.2010, at 23:03, Alexander Graf wrote:
On 08.11.2010, at 22:48, Andreas Färber wrote:
On 08.11.2010, at 10:34, Mark Cave-Ayland wrote:
Andreas Färber wrote:
Since r945 everything except for the trampoline issue should be in SVN.
Cool! :)
Ah, no! Forgot about the exception handlers (preamble, epilogue). :( I have local patches that #ifdef the whole definitions and provide alternative ones for ppc64. But I'd rather supply a 2-step patch to drop the template macros and use #ifdef'ery inside the macro definition for ppc64/ppc differences, given that long-term we won't need the second version.
Are you sure you need that? Most of the stuff should be generic. Please take a look at Linux's PPC_LL macro for example. The stack layout varies slightly, but that should also be reasonably easy to catch.
Problematic parts are the ULONG_SIZE comparisons for the number of registers that we can't #ifdef away inside a CPP macro
You're talking about this piece of code, right?
.ifc ULONG_SIZE, 8 ; \
addi r1,r1,-(40 * ULONG_SIZE) ; /* push exception frame */ \
.else ; \
addi r1,r1,-(20 * ULONG_SIZE) ; /* push exception frame */ \
.endif ; \
\
Just replace it with
// PPC32
#define EXCEPTION_STACK_LEN (20 * 4)
// PPC64
#define EXCEPTION_STACK_LEN (40 * 8)
// code
addi r1, r1,-EXCEPTION_STACK_LEN
Btw, are you sure this code is correct? Usually the stack frame includes the old r1 value as well. In __kvmppc_vcpu_entry (arch/powerpc/kvm/book3s_interrupts.S) I have working code that basically does an ABI compliant function in asm:
/* Save host state to the stack */
PPC_STLU r1, -SWITCH_FRAME_SIZE(r1)
, plus the m*msr
mfmsr r1 ; /* unset MSR_SF */ \
clrlwi r1,r1,0 ; \
mtmsr r1 ; \
This one? Yeah, my awesome broken code. Just replace it with:
// PPC32
#define MFMSRD mfmsr
#define MTMSRD mtmsr
// PPC64
#define MFMSRD mfmsr /* mfmsr has no doubleword form */
#define MTMSRD mtmsrd
// code
MFMSRD r1
MTMSRD r1
The clearing is not necessary. If it is, qemu or kvm are broken. According to the spec, mfmsr and mtmsr only handle the first 32 bits of MSR.
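To make the "unset MSR_SF" step concrete, here is a small C model of what the quoted clrlwi r1,r1,0 amounts to on a 64-bit implementation: it keeps the low 32 bits of the value and drops the upper word, which includes MSR_SF (the most significant bit of the 64-bit MSR). The function names and the sample value are assumptions for illustration, and the sketch does not model mfmsr or mtmsr themselves.

```c
#include <stdint.h>

/* MSR_SF is the 64-bit mode bit, i.e. the top bit of the 64-bit MSR. */
#define MSR_SF (UINT64_C(1) << 63)

/* Model of "clrlwi rX,rX,0": only the low 32 bits survive. */
uint64_t clear_upper_word(uint64_t msr)
{
    return msr & UINT64_C(0xffffffff);
}

/* Nonzero if the given MSR value selects 64-bit mode. */
int msr_is_64bit(uint64_t msr)
{
    return (msr & MSR_SF) != 0;
}
```

With a sample value such as MSR_SF | 0x3030, msr_is_64bit() reports 64-bit mode before the masking and 32-bit mode afterwards, while the low bits stay untouched.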
and the trailing/leading stack manipulation (16 in SVR4 vs. 48 in PowerOpen ABI).
Not sure I understand exactly what you mean here :).
Also rfid, both here and for the FPU vector.
// PPC32
#define RFID rfi
// PPC64
#define RFID rfid
should work, no?
Btw, you should probably define those helpers somewhere generic depending on #ifdef __powerpc64__
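A sketch of what such a generic set of helpers could look like, assuming a single #ifdef __powerpc64__ switch. The frame lengths are simply the values quoted in this thread, and the WIDTH_STR helpers and the two functions exist only so the selection can be checked from C; none of these names are taken from the OpenBIOS tree. Only the mtmsr side is made width-dependent here, since mfmsr has no separate doubleword mnemonic.

```c
/* Hypothetical shared definitions; where they would live is an assumption. */
#ifdef __powerpc64__
#define MTMSRD              mtmsrd    /* doubleword form of mtmsr */
#define RFI                 rfid      /* 64-bit return from interrupt */
#define EXCEPTION_STACK_LEN (40 * 8)  /* value quoted in this thread */
#else
#define MTMSRD              mtmsr
#define RFI                 rfi
#define EXCEPTION_STACK_LEN (20 * 4)  /* value quoted in this thread */
#endif

/* Two-level stringification so the selected mnemonic can be inspected. */
#define WIDTH_STR_(x) #x
#define WIDTH_STR(x)  WIDTH_STR_(x)

/* Mnemonic chosen for returning from an exception handler. */
const char *rfi_mnemonic(void)
{
    return WIDTH_STR(RFI);
}

/* Mnemonic chosen for writing the MSR. */
const char *mtmsr_mnemonic(void)
{
    return WIDTH_STR(MTMSRD);
}
```

Assembly sources would use RFI and MTMSRD directly; the C functions are only a way to see which variant the preprocessor picked.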
Note that the other ppc targets don't use assembler macros, just preprocessor macros, so we have kind of a double-macro situation. ;)
Yes, please don't use assembler macros. If I did that anywhere, I did it wrong :). preprocessor macros are the way to go! :)
Alex
On 09.11.2010, at 12:53, Alexander Graf wrote:
On 08.11.2010, at 23:18, Andreas Färber wrote:
On 08.11.2010, at 23:03, Alexander Graf wrote:
On 08.11.2010, at 22:48, Andreas Färber wrote:
On 08.11.2010, at 10:34, Mark Cave-Ayland wrote:
Andreas Färber wrote:
Since r945 everything except for the trampoline issue should be in SVN.
Cool! :)
Ah, no! Forgot about the exception handlers (preamble, epilogue). :( I have local patches that #ifdef the whole definitions and provide alternative ones for ppc64. But I'd rather supply a 2-step patch to drop the template macros and use #ifdef'ery inside the macro definition for ppc64/ppc differences, given that long-term we won't need the second version.
Are you sure you need that? Most of the stuff should be generic. Please take a look at Linux's PPC_LL macro for example. The stack layout varies slightly, but that should also be reasonably easy to catch.
Problematic parts are the ULONG_SIZE comparisons for the number of registers that we can't #ifdef away inside a CPP macro
You're talking about this piece of code, right?
.ifc ULONG_SIZE, 8 ; \
addi r1,r1,-(40 * ULONG_SIZE) ; /* push exception frame */ \
.else ; \
addi r1,r1,-(20 * ULONG_SIZE) ; /* push exception frame */ \
.endif ; \
\
Just replace it with
// PPC32
#define EXCEPTION_STACK_LEN (20 * 4)
// PPC64
#define EXCEPTION_STACK_LEN (40 * 8)
// code
addi r1, r1,-EXCEPTION_STACK_LEN
No, no, no! This is exactly the type of differences I wanted to rip out, but if I do so now I would break ppc64 support. On __powerpc64__ this doesn't need to be 40, and it's not just the above immediate but the whole .ifc around the 20 additional register stores/loads:
#define EXCEPTION_PREAMBLE_TEMPLATE \
...
stl r11,(11 * ULONG_SIZE)(r1) ; \
stl r12,(12 * ULONG_SIZE)(r1) ; \
.ifc ULONG_SIZE, 8 ; \
stl r13,(17 * ULONG_SIZE)(r1) ; \
stl r14,(18 * ULONG_SIZE)(r1) ; \
...
.macro EXCEPTION_PREAMBLE
EXCEPTION_PREAMBLE_TEMPLATE
.endm
...
.macro EXCEPTION_PREAMBLE_64
EXCEPTION_PREAMBLE_TEMPLATE
.endm
My problem is the reuse of the _TEMPLATE macros. It becomes a matter of macros such as those you suggest if we can get rid of that.
Btw, are you sure this code is correct? Usually the stack frame includes the old r1 value as well. In __kvmppc_vcpu_entry (arch/powerpc/kvm/book3s_interrupts.S) I have working code that basically does an ABI compliant function in asm:
/* Save host state to the stack */
PPC_STLU r1, -SWITCH_FRAME_SIZE(r1)
Haven't checked. I spotted an instance where two of the three stacks have a potential stack frame overlap though.
, plus the m*msr
mfmsr r1 ; /* unset MSR_SF */ \
clrlwi r1,r1,0 ; \
mtmsr r1 ; \
This one? Yeah, my awesome broken code. Just replace it with:
// PPC32
#define MFMSRD mfmsr
#define MTMSRD mtmsr
// PPC64
#define MFMSRD mfmsr /* mfmsr has no doubleword form */
#define MTMSRD mtmsrd
// code
MFMSRD r1
MTMSRD r1
The clearing is not necessary. If it is, qemu or kvm are broken. According to the spec, mfmsr and mtmsr only handle the first 32 bits of MSR.
I've been working without on ppc64, MSR looked okay.
and the trailing/leading stack manipulation (16 in SVR4 vs. 48 in PowerOpen ABI).
Not sure I understand exactly what you mean here :).
addi r1, r1, -x
addi r1, r1, x
:)
Btw I still don't understand what ABI this is supposed to be. If I can trust Hollis' ppc assembler article and the Mono and QEMU ppc64 ports, then Linux/ppc64 uses function descriptors. But our ppc64 assembler code calling setup_mmu() and entry() works without a function descriptor dereference...
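For readers unfamiliar with the term: under the 64-bit PowerPC ELF ABI in use at the time of this thread, a function symbol names a three-doubleword descriptor rather than the code itself. A minimal C model of that layout might look as follows; the struct, field and function names are chosen here for illustration. It only pictures the data structure and does not explain why the OpenBIOS calls work without the dereference.

```c
#include <stdint.h>

/* Illustrative model of a function descriptor (three doublewords). */
struct func_desc {
    uint64_t entry;  /* address of the first instruction */
    uint64_t toc;    /* TOC base the callee expects in r2 */
    uint64_t env;    /* environment pointer, not used by C code */
};

/* A caller branching "through" a descriptor uses the entry field. */
uint64_t call_target(const struct func_desc *fd)
{
    return fd->entry;
}
```

Assembly that branches straight to a symbol only reaches real code if that symbol is a code address rather than a descriptor of this shape.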
Also rfid, both here and for the FPU vector.
// PPC32
#define RFID rfi
// PPC64
#define RFID rfid
should work, no?
Yeah, if that's the naming convention we want to go with.
Btw, you should probably define those helpers somewhere generic depending on #ifdef __powerpc64__
Note that the other ppc targets don't use assembler macros, just preprocessor macros, so we have kind of a double-macro situation. ;)
Yes, please don't use assembler macros. If I did that anywhere, I did it wrong :). preprocessor macros are the way to go! :)
Both have their advantages imo when used correctly.
Andreas
On 09.11.2010, at 21:01, Andreas Färber wrote:
On 09.11.2010, at 12:53, Alexander Graf wrote:
On 08.11.2010, at 23:18, Andreas Färber wrote:
On 08.11.2010, at 23:03, Alexander Graf wrote:
On 08.11.2010, at 22:48, Andreas Färber wrote:
On 08.11.2010, at 10:34, Mark Cave-Ayland wrote:
Andreas Färber wrote:
Since r945 everything except for the trampoline issue should be in SVN.
Cool! :)
Ah, no! Forgot about the exception handlers (preamble, epilogue). :( I have local patches that #ifdef the whole definitions and provide alternative ones for ppc64. But I'd rather supply a 2-step patch to drop the template macros and use #ifdef'ery inside the macro definition for ppc64/ppc differences, given that long-term we won't need the second version.
Are you sure you need that? Most of the stuff should be generic. Please take a look at Linux's PPC_LL macro for example. The stack layout varies slightly, but that should also be reasonably easy to catch.
Problematic parts are the ULONG_SIZE comparisons for the number of registers that we can't #ifdef away inside a CPP macro
You're talking about this piece of code, right?
.ifc ULONG_SIZE, 8 ; \
addi r1,r1,-(40 * ULONG_SIZE) ; /* push exception frame */ \
.else ; \
addi r1,r1,-(20 * ULONG_SIZE) ; /* push exception frame */ \
.endif ; \
\
Just replace it with
// PPC32
#define EXCEPTION_STACK_LEN (20 * 4)
// PPC64
#define EXCEPTION_STACK_LEN (40 * 8)
// code
addi r1, r1,-EXCEPTION_STACK_LEN
No, no, no! This is exactly the type of differences I wanted to rip out, but if I do so now I would break ppc64 support. On __powerpc64__ this doesn't need to be 40, and it's not just the above immediate but the whole .ifc around the 20 additional register stores/loads:
#define EXCEPTION_PREAMBLE_TEMPLATE \
...
stl r11,(11 * ULONG_SIZE)(r1) ; \
stl r12,(12 * ULONG_SIZE)(r1) ; \
.ifc ULONG_SIZE, 8 ; \
stl r13,(17 * ULONG_SIZE)(r1) ; \
stl r14,(18 * ULONG_SIZE)(r1) ; \
...
.macro EXCEPTION_PREAMBLE
EXCEPTION_PREAMBLE_TEMPLATE
.endm
...
.macro EXCEPTION_PREAMBLE_64
EXCEPTION_PREAMBLE_TEMPLATE
.endm
My problem is the reuse of the _TEMPLATE macros. It becomes a matter of macros such as those you suggest if we can get rid of that.
Why are there two different preambles? Just always use the 64-bit one for 64-bit openBIOS and 32-bit one for 32-bit openBIOS.
Btw, are you sure this code is correct? Usually the stack frame includes the old r1 value as well. In __kvmppc_vcpu_entry (arch/powerpc/kvm/book3s_interrupts.S) I have working code that basically does an ABI compliant function in asm:
/* Save host state to the stack */
PPC_STLU r1, -SWITCH_FRAME_SIZE(r1)
Haven't checked. I spotted an instance where two of the three stacks have a potential stack frame overlap though.
, plus the m*msr
mfmsr r1 ; /* unset MSR_SF */ \
clrlwi r1,r1,0 ; \
mtmsr r1 ; \
This one? Yeah, my awesome broken code. Just replace it with:
// PPC32
#define MFMSRD mfmsr
#define MTMSRD mtmsr
// PPC64
#define MFMSRD mfmsr /* mfmsr has no doubleword form */
#define MTMSRD mtmsrd
// code
MFMSRD r1
MTMSRD r1
The clearing is not necessary. If it is, qemu or kvm are broken. According to the spec, mfmsr and mtmsr only handle the first 32 bits of MSR.
I've been working without on ppc64, MSR looked okay.
and the trailing/leading stack manipulation (16 in SVR4 vs. 48 in PowerOpen ABI).
Not sure I understand exactly what you mean here :).
addi r1, r1, -x
addi r1, r1, x
:)
Btw I still don't understand what ABI this is supposed to be. If I can trust Hollis' ppc assembler article and the Mono and QEMU ppc64 ports, then Linux/ppc64 uses function descriptors. But our ppc64 assembler code calling setup_mmu() and entry() works without a function descriptor dereference...
Uh, sounds like a question to Segher :).
Alex
On 09.11.2010, at 21:24, Alexander Graf wrote:
On 09.11.2010, at 21:01, Andreas Färber wrote:
On 09.11.2010, at 12:53, Alexander Graf wrote:
On 08.11.2010, at 23:18, Andreas Färber wrote:
On 08.11.2010, at 23:03, Alexander Graf wrote:
On 08.11.2010, at 22:48, Andreas Färber wrote:
On 08.11.2010, at 10:34, Mark Cave-Ayland wrote:
Andreas Färber wrote:
Since r945 everything except for the trampoline issue should be in SVN.
Cool! :)
Ah, no! Forgot about the exception handlers (preamble, epilogue). :( I have local patches that #ifdef the whole definitions and provide alternative ones for ppc64. But I'd rather supply a 2-step patch to drop the template macros and use #ifdef'ery inside the macro definition for ppc64/ppc differences, given that long-term we won't need the second version.
Are you sure you need that? Most of the stuff should be generic. Please take a look at Linux's PPC_LL macro for example. The stack layout varies slightly, but that should also be reasonably easy to catch.
Problematic parts are the ULONG_SIZE comparisons for the number of registers that we can't #ifdef away inside a CPP macro
You're talking about this piece of code, right?
.ifc ULONG_SIZE, 8 ; \
addi r1,r1,-(40 * ULONG_SIZE) ; /* push exception frame */ \
.else ; \
addi r1,r1,-(20 * ULONG_SIZE) ; /* push exception frame */ \
.endif ; \
\
Just replace it with
// PPC32
#define EXCEPTION_STACK_LEN (20 * 4)
// PPC64
#define EXCEPTION_STACK_LEN (40 * 8)
// code
addi r1, r1,-EXCEPTION_STACK_LEN
No, no, no! This is exactly the type of differences I wanted to rip out, but if I do so now I would break ppc64 support. On __powerpc64__ this doesn't need to be 40, and it's not just the above immediate but the whole .ifc around the 20 additional register stores/loads:
#define EXCEPTION_PREAMBLE_TEMPLATE \
...
stl r11,(11 * ULONG_SIZE)(r1) ; \
stl r12,(12 * ULONG_SIZE)(r1) ; \
.ifc ULONG_SIZE, 8 ; \
stl r13,(17 * ULONG_SIZE)(r1) ; \
stl r14,(18 * ULONG_SIZE)(r1) ; \
...
.macro EXCEPTION_PREAMBLE
EXCEPTION_PREAMBLE_TEMPLATE
.endm
...
.macro EXCEPTION_PREAMBLE_64
EXCEPTION_PREAMBLE_TEMPLATE
.endm
My problem is the reuse of the _TEMPLATE macros. It becomes a matter of macros such as those you suggest if we can get rid of that.
Why are there two different preambles? Just always use the 64-bit one for 64-bit openBIOS and 32-bit one for 32-bit openBIOS.
Quoting myself: "I would break ppc64 support"
obj-ppc64/openbios-qemu.elf:
EXCEPTION_PREAMBLE => optionally used for qemu-system-ppc64, not yet working
obj-ppc/openbios-qemu.elf:
EXCEPTION_PREAMBLE => used for qemu-system-ppc
EXCEPTION_PREAMBLE_64 => currently used for qemu-system-ppc64 by patching the exception vectors based on PVR-determined CPU init function
Thus, if we drop the latter before ppc64 is up and running we would no longer have a working openbios-ppc for qemu-system-ppc64.
Andreas
On 09.11.2010, at 22:34, Andreas Färber wrote:
On 09.11.2010, at 21:24, Alexander Graf wrote:
On 09.11.2010, at 21:01, Andreas Färber wrote:
On 09.11.2010, at 12:53, Alexander Graf wrote:
On 08.11.2010, at 23:18, Andreas Färber wrote:
On 08.11.2010, at 23:03, Alexander Graf wrote:
On 08.11.2010, at 22:48, Andreas Färber wrote:
On 08.11.2010, at 10:34, Mark Cave-Ayland wrote:
Andreas Färber wrote:
Since r945 everything except for the trampoline issue should be in SVN.
Cool! :)
Ah, no! Forgot about the exception handlers (preamble, epilogue). :( I have local patches that #ifdef the whole definitions and provide alternative ones for ppc64. But I'd rather supply a 2-step patch to drop the template macros and use #ifdef'ery inside the macro definition for ppc64/ppc differences, given that long-term we won't need the second version.
Are you sure you need that? Most of the stuff should be generic. Please take a look at Linux's PPC_LL macro for example. The stack layout varies slightly, but that should also be reasonably easy to catch.
Problematic parts are the ULONG_SIZE comparisons for the number of registers that we can't #ifdef away inside a CPP macro
You're talking about this piece of code, right?
.ifc ULONG_SIZE, 8 ; \
addi r1,r1,-(40 * ULONG_SIZE) ; /* push exception frame */ \
.else ; \
addi r1,r1,-(20 * ULONG_SIZE) ; /* push exception frame */ \
.endif ; \
\
Just replace it with
// PPC32
#define EXCEPTION_STACK_LEN (20 * 4)
// PPC64
#define EXCEPTION_STACK_LEN (40 * 8)
// code
addi r1, r1,-EXCEPTION_STACK_LEN
No, no, no! This is exactly the type of differences I wanted to rip out, but if I do so now I would break ppc64 support. On __powerpc64__ this doesn't need to be 40, and it's not just the above immediate but the whole .ifc around the 20 additional register stores/loads:
#define EXCEPTION_PREAMBLE_TEMPLATE \
...
stl r11,(11 * ULONG_SIZE)(r1) ; \
stl r12,(12 * ULONG_SIZE)(r1) ; \
.ifc ULONG_SIZE, 8 ; \
stl r13,(17 * ULONG_SIZE)(r1) ; \
stl r14,(18 * ULONG_SIZE)(r1) ; \
...
.macro EXCEPTION_PREAMBLE
EXCEPTION_PREAMBLE_TEMPLATE
.endm
...
.macro EXCEPTION_PREAMBLE_64
EXCEPTION_PREAMBLE_TEMPLATE
.endm
My problem is the reuse of the _TEMPLATE macros. It becomes a matter of macros such as those you suggest if we can get rid of that.
Why are there two different preambles? Just always use the 64-bit one for 64-bit openBIOS and 32-bit one for 32-bit openBIOS.
Quoting myself: "I would break ppc64 support"
obj-ppc64/openbios-qemu.elf:
EXCEPTION_PREAMBLE => optionally used for qemu-system-ppc64, not yet working
obj-ppc/openbios-qemu.elf:
EXCEPTION_PREAMBLE => used for qemu-system-ppc
EXCEPTION_PREAMBLE_64 => currently used for qemu-system-ppc64 by patching the exception vectors based on PVR-determined CPU init function
Thus, if we drop the latter before ppc64 is up and running we would no longer have a working openbios-ppc for qemu-system-ppc64.
I still don't see why. We can just use a ppc32 compiled version for working ppc64 support, no? The only reason I put us back to 32bit mode was because otherwise lis -1 would end up being -1 instead of 0xffff0000 :).
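The remark about lis concerns sign extension of its 16-bit immediate. A small C model, with function names chosen here for illustration, shows the effect: the immediate is sign-extended, then shifted left by 16, so in a 64-bit register the upper word comes out as all ones, and only when the value is cut down to 32 bits does it read as 0xffff0000.

```c
#include <stdint.h>

/* Value a 64-bit register holds after "lis rD, imm". */
uint64_t lis_value64(int16_t imm)
{
    /* Sign-extend first, then shift as unsigned to stay well defined. */
    return (uint64_t)(int64_t)imm << 16;
}

/* The same result when only the low 32 bits are looked at. */
uint32_t lis_value32(int16_t imm)
{
    return (uint32_t)lis_value64(imm);
}
```

For an immediate of -1, the 64-bit result is 0xffffffffffff0000 while the 32-bit view is 0xffff0000, which is why code that builds 0xfffxxxxx addresses this way depends on the mode it runs in.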
So why don't you just keep the current code working, duplicate it for the time being for ppc64 and then once everything works merge it back together?
Alex
On 09.11.2010, at 22:48, Alexander Graf wrote:
On 09.11.2010, at 22:34, Andreas Färber wrote:
On 09.11.2010, at 21:24, Alexander Graf wrote:
On 09.11.2010, at 21:01, Andreas Färber wrote:
On 09.11.2010, at 12:53, Alexander Graf wrote:
On 08.11.2010, at 23:18, Andreas Färber wrote:
On 08.11.2010, at 23:03, Alexander Graf wrote:
On 08.11.2010, at 22:48, Andreas Färber wrote:
On 08.11.2010, at 10:34, Mark Cave-Ayland wrote:
Andreas Färber wrote:
Since r945 everything except for the trampoline issue should be in SVN.
Cool! :)
Ah, no! Forgot about the exception handlers (preamble, epilogue). :( I have local patches that #ifdef the whole definitions and provide alternative ones for ppc64. But I'd rather supply a 2-step patch to drop the template macros and use #ifdef'ery inside the macro definition for ppc64/ppc differences, given that long-term we won't need the second version.
Are you sure you need that? Most of the stuff should be generic. Please take a look at Linux's PPC_LL macro for example. The stack layout varies slightly, but that should also be reasonably easy to catch.
Problematic parts are the ULONG_SIZE comparisons for the number of registers that we can't #ifdef away inside a CPP macro
You're talking about this piece of code, right?
.ifc ULONG_SIZE, 8 ; \
addi r1,r1,-(40 * ULONG_SIZE) ; /* push exception frame */ \
.else ; \
addi r1,r1,-(20 * ULONG_SIZE) ; /* push exception frame */ \
.endif ; \
\
Just replace it with
// PPC32
#define EXCEPTION_STACK_LEN (20 * 4)
// PPC64
#define EXCEPTION_STACK_LEN (40 * 8)
// code
addi r1, r1,-EXCEPTION_STACK_LEN
No, no, no! This is exactly the type of differences I wanted to rip out, but if I do so now I would break ppc64 support. On __powerpc64__ this doesn't need to be 40, and it's not just the above immediate but the whole .ifc around the 20 additional register stores/loads:
#define EXCEPTION_PREAMBLE_TEMPLATE \
...
stl r11,(11 * ULONG_SIZE)(r1) ; \
stl r12,(12 * ULONG_SIZE)(r1) ; \
.ifc ULONG_SIZE, 8 ; \
stl r13,(17 * ULONG_SIZE)(r1) ; \
stl r14,(18 * ULONG_SIZE)(r1) ; \
...
.macro EXCEPTION_PREAMBLE
EXCEPTION_PREAMBLE_TEMPLATE
.endm
...
.macro EXCEPTION_PREAMBLE_64
EXCEPTION_PREAMBLE_TEMPLATE
.endm
My problem is the reuse of the _TEMPLATE macros. It becomes a matter of macros such as those you suggest if we can get rid of that.
Why are there two different preambles? Just always use the 64-bit one for 64-bit openBIOS and 32-bit one for 32-bit openBIOS.
Quoting myself: "I would break ppc64 support"
obj-ppc64/openbios-qemu.elf:
EXCEPTION_PREAMBLE => optionally used for qemu-system-ppc64, not yet working
obj-ppc/openbios-qemu.elf:
EXCEPTION_PREAMBLE => used for qemu-system-ppc
EXCEPTION_PREAMBLE_64 => currently used for qemu-system-ppc64 by patching the exception vectors based on PVR-determined CPU init function
Thus, if we drop the latter before ppc64 is up and running we would no longer have a working openbios-ppc for qemu-system-ppc64.
I still don't see why. We can just use a ppc32 compiled version for working ppc64 support, no? The only reason I put us back to 32bit mode was because otherwise lis -1 would end up being -1 instead of 0xffff0000 :).
So why don't you just keep the current code working, duplicate it for the time being for ppc64 and then once everything works merge it back together?
Pretty sure I mentioned that alternative already... :)
http://repo.or.cz/w/openbios/afaerber.git/blobdiff/97003b1ff3860847ce9ad5c6b...
Tried to avoid that for SVN. But if that's consensus, it sure is the easiest for me!
Andreas
On 09.11.2010, at 22:58, Andreas Färber wrote:
On 09.11.2010, at 22:48, Alexander Graf wrote:
On 09.11.2010, at 22:34, Andreas Färber wrote:
On 09.11.2010, at 21:24, Alexander Graf wrote:
On 09.11.2010, at 21:01, Andreas Färber wrote:
On 09.11.2010, at 12:53, Alexander Graf wrote:
On 08.11.2010, at 23:18, Andreas Färber wrote:
On 08.11.2010, at 23:03, Alexander Graf wrote:
On 08.11.2010, at 22:48, Andreas Färber wrote:
On 08.11.2010, at 10:34, Mark Cave-Ayland wrote:
Andreas Färber wrote:
Since r945 everything except for the trampoline issue should be in SVN.
Cool! :)
Ah, no! Forgot about the exception handlers (preamble, epilogue). :( I have local patches that #ifdef the whole definitions and provide alternative ones for ppc64. But I'd rather supply a 2-step patch to drop the template macros and use #ifdef'ery inside the macro definition for ppc64/ppc differences, given that long-term we won't need the second version.
Are you sure you need that? Most of the stuff should be generic. Please take a look at Linux's PPC_LL macro for example. The stack layout varies slightly, but that should also be reasonably easy to catch.
Problematic parts are the ULONG_SIZE comparisons for the number of registers that we can't #ifdef away inside a CPP macro
You're talking about this piece of code, right?
.ifc ULONG_SIZE, 8 ; \
addi r1,r1,-(40 * ULONG_SIZE) ; /* push exception frame */ \
.else ; \
addi r1,r1,-(20 * ULONG_SIZE) ; /* push exception frame */ \
.endif ; \
\
Just replace it with
// PPC32
#define EXCEPTION_STACK_LEN (20 * 4)
// PPC64
#define EXCEPTION_STACK_LEN (40 * 8)
// code
addi r1, r1,-EXCEPTION_STACK_LEN
No, no, no! This is exactly the type of differences I wanted to rip out, but if I do so now I would break ppc64 support. On __powerpc64__ this doesn't need to be 40, and it's not just the above immediate but the whole .ifc around the 20 additional register stores/loads:
#define EXCEPTION_PREAMBLE_TEMPLATE \
...
stl r11,(11 * ULONG_SIZE)(r1) ; \
stl r12,(12 * ULONG_SIZE)(r1) ; \
.ifc ULONG_SIZE, 8 ; \
stl r13,(17 * ULONG_SIZE)(r1) ; \
stl r14,(18 * ULONG_SIZE)(r1) ; \
...
.macro EXCEPTION_PREAMBLE
EXCEPTION_PREAMBLE_TEMPLATE
.endm
...
.macro EXCEPTION_PREAMBLE_64
EXCEPTION_PREAMBLE_TEMPLATE
.endm
My problem is the reuse of the _TEMPLATE macros. It becomes a matter of macros such as those you suggest if we can get rid of that.
Why are there two different preambles? Just always use the 64-bit one for 64-bit openBIOS and 32-bit one for 32-bit openBIOS.
Quoting myself: "I would break ppc64 support"
obj-ppc64/openbios-qemu.elf:
EXCEPTION_PREAMBLE => optionally used for qemu-system-ppc64, not yet working
obj-ppc/openbios-qemu.elf:
EXCEPTION_PREAMBLE => used for qemu-system-ppc
EXCEPTION_PREAMBLE_64 => currently used for qemu-system-ppc64 by patching the exception vectors based on PVR-determined CPU init function
Thus, if we drop the latter before ppc64 is up and running we would no longer have a working openbios-ppc for qemu-system-ppc64.
I still don't see why. We can just use a ppc32 compiled version for working ppc64 support, no? The only reason I put us back to 32bit mode was because otherwise lis -1 would end up being -1 instead of 0xffff0000 :).
So why don't you just keep the current code working, duplicate it for the time being for ppc64 and then once everything works merge it back together?
Pretty sure I mentioned that alternative already... :)
http://repo.or.cz/w/openbios/afaerber.git/blobdiff/97003b1ff3860847ce9ad5c6b...
Tried to avoid that for SVN. But if that's consensus, it sure is the easiest for me!
I don't see any better way really. It's a lot easier to worry about those pieces when the rest is good. As long as you promise to clean up the mess once you're ready of course.
So apart from the rfi change which should really be done using an RFI define, that change looks good.
Alex
Having 64-bit support as an option allows users to disable and test it before it gets officially removed.
Fork single-bitness macros EXCEPTION_{PREAMBLE,EPILOGUE}:
Use assembler macros for things that are constant to avoid ; and \, and use preprocessor macros to handle differences.
Adopt QEMU coding style for new code.
Functional changes for ppc64:
* Don't clear MSR in preamble.
* Just save the minimum number of registers since 64-bit code will save the full registers.
* Reserve 48 bytes of stack frame space for ppc64, according to 64-bit PowerPC ELF ABI supplement 1.9.
* Use RFI macro.
Cc: Alexander Graf agraf@suse.de
Signed-off-by: Andreas Färber andreas.faerber@web.de
---
 arch/ppc/qemu/init.c           |    4 ++
 arch/ppc/qemu/start.S          |   92 +++++++++++++++++++++++++++++++++++++++-
 config/examples/ppc_config.xml |    1 +
 3 files changed, 96 insertions(+), 1 deletions(-)
diff --git a/arch/ppc/qemu/init.c b/arch/ppc/qemu/init.c
index 0b781d9..d17c843 100644
--- a/arch/ppc/qemu/init.c
+++ b/arch/ppc/qemu/init.c
@@ -303,6 +303,7 @@ cpu_g4_init(const struct cpudef *cpu)
     fword("finish-device");
 }
 
+#ifdef CONFIG_PPC_64BITSUPPORT
 /* In order to get 64 bit aware handlers that rescue all our
    GPRs from getting truncated to 32 bits, we need to patch the
    existing handlers so they jump to our 64 bit aware ones. */
@@ -322,6 +323,7 @@ ppc64_patch_handlers(void)
     asm ( "icbi 0, %0" : : "r"(dsi) );
     asm ( "icbi 0, %0" : : "r"(isi) );
 }
+#endif
 
 static void
 cpu_970_init(const struct cpudef *cpu)
@@ -341,10 +343,12 @@ cpu_970_init(const struct cpudef *cpu)
 
     fword("finish-device");
 
+#ifdef CONFIG_PPC_64BITSUPPORT
     /* The 970 is a PPC64 CPU, so we need to activate
      * 64bit aware interrupt handlers */
 
     ppc64_patch_handlers();
+#endif
 
     /* The 970 also implements the HIOR which we need to set to 0 */
 
diff --git a/arch/ppc/qemu/start.S b/arch/ppc/qemu/start.S
index e86bdfd..6cf20cf 100644
--- a/arch/ppc/qemu/start.S
+++ b/arch/ppc/qemu/start.S
@@ -14,6 +14,7 @@
  *
  */
 
+#include "autoconf.h"
 #include "asm/asmdefs.h"
 #include "asm/processor.h"
 
@@ -24,6 +25,8 @@
 #define ILLEGAL_VECTOR( v ) .org __vectors + v ; vector__##v: bl trap_error ;
 #define VECTOR( v, dummystr ) .org __vectors + v ; vector__##v
 
+#ifdef CONFIG_PPC_64BITSUPPORT
+
 /* We're trying to use the same code for the ppc32 and ppc64 handlers here.
  * On ppc32 we only save/restore the registers, C considers volatile.
  *
@@ -176,6 +179,89 @@
 #undef stl
 #undef ll
 
+#else
+
+#ifdef __powerpc64__
+
+#define ULONG_SIZE 8
+#define STACKFRAME_MINSIZE 48
+#define stl std
+#define ll ld
+
+#else
+
+#define ULONG_SIZE 4
+#define STACKFRAME_MINSIZE 16
+#define stl stw
+#define ll lwz
+
+#endif
+
+.macro EXCEPTION_PREAMBLE
+    mtsprg1 r1 /* scratch */
+    mfsprg0 r1 /* exception stack in sprg0 */
+    addi r1, r1, -(20 * ULONG_SIZE) /* push exception frame */
+
+    stl r0, ( 0 * ULONG_SIZE)(r1) /* save r0 */
+    mfsprg1 r0
+    stl r0, ( 1 * ULONG_SIZE)(r1) /* save r1 */
+    stl r2, ( 2 * ULONG_SIZE)(r1) /* save r2 */
+    stl r3, ( 3 * ULONG_SIZE)(r1) /* save r3 */
+    stl r4, ( 4 * ULONG_SIZE)(r1)
+    stl r5, ( 5 * ULONG_SIZE)(r1)
+    stl r6, ( 6 * ULONG_SIZE)(r1)
+    stl r7, ( 7 * ULONG_SIZE)(r1)
+    stl r8, ( 8 * ULONG_SIZE)(r1)
+    stl r9, ( 9 * ULONG_SIZE)(r1)
+    stl r10, (10 * ULONG_SIZE)(r1)
+    stl r11, (11 * ULONG_SIZE)(r1)
+    stl r12, (12 * ULONG_SIZE)(r1)
+
+    mflr r0
+    stl r0, (13 * ULONG_SIZE)(r1)
+    mfcr r0
+    stl r0, (14 * ULONG_SIZE)(r1)
+    mfctr r0
+    stl r0, (15 * ULONG_SIZE)(r1)
+    mfxer r0
+    stl r0, (16 * ULONG_SIZE)(r1)
+
+    /* 76(r1) unused */
+
+    addi r1, r1, -STACKFRAME_MINSIZE /* C ABI saves LR and SP */
+.endm
+
+.macro EXCEPTION_EPILOGUE
+    addi r1, r1, STACKFRAME_MINSIZE /* pop ABI frame */
+
+    ll r0, (13 * ULONG_SIZE)(r1)
+    mtlr r0
+    ll r0, (14 * ULONG_SIZE)(r1)
+    mtcr r0
+    ll r0, (15 * ULONG_SIZE)(r1)
+    mtctr r0
+    ll r0, (16 * ULONG_SIZE)(r1)
+    mtxer r0
+
+    ll r0, ( 0 * ULONG_SIZE)(r1)
+    ll r2, ( 2 * ULONG_SIZE)(r1)
+    ll r3, ( 3 * ULONG_SIZE)(r1)
+    ll r4, ( 4 * ULONG_SIZE)(r1)
+    ll r5, ( 5 * ULONG_SIZE)(r1)
+    ll r6, ( 6 * ULONG_SIZE)(r1)
+    ll r7, ( 7 * ULONG_SIZE)(r1)
+    ll r8, ( 8 * ULONG_SIZE)(r1)
+    ll r9, ( 9 * ULONG_SIZE)(r1)
+    ll r10, (10 * ULONG_SIZE)(r1)
+    ll r11, (11 * ULONG_SIZE)(r1)
+    ll r12, (12 * ULONG_SIZE)(r1)
+
+    ll r1, ( 1 * ULONG_SIZE)(r1) /* restore stack at last */
+    RFI
+.endm
+
+#endif
+
 /************************************************************************/
 /* vectors */
 /************************************************************************/
@@ -253,6 +339,8 @@
 ILLEGAL_VECTOR( 0x1500 )
 ILLEGAL_VECTOR( 0x1600 )
 ILLEGAL_VECTOR( 0x1700 )
+#ifdef CONFIG_PPC_64BITSUPPORT
+
 VECTOR( 0x2000, "DSI_64" ):
     EXCEPTION_PREAMBLE_64
     LOAD_REG_IMMEDIATE(r3, dsi_exception)
@@ -267,6 +355,8 @@ VECTOR( 0x2200, "ISI_64" ):
     bctrl
     EXCEPTION_EPILOGUE_64
 
+#endif
+
 GLOBL(__vectors_end):
 
 /************************************************************************/
@@ -275,7 +365,7 @@ GLOBL(__vectors_end):
 
 GLOBL(_entry):
 
-#ifndef __powerpc64__
+#ifdef CONFIG_PPC_64BITSUPPORT
     /* clear MSR, disable MMU */
 
     li r0,0
diff --git a/config/examples/ppc_config.xml b/config/examples/ppc_config.xml
index 5f79c21..352cb57 100644
--- a/config/examples/ppc_config.xml
+++ b/config/examples/ppc_config.xml
@@ -57,6 +57,7 @@
  <option name="CONFIG_DEBUG_FS" type="boolean" value="false"/>
 
  <!-- Miscellaneous -->
+ <option name="CONFIG_PPC_64BITSUPPORT" type="boolean" value="true"/>
  <option name="CONFIG_LINUXBIOS" type="boolean" value="false"/>
  <option name="CONFIG_RTAS" type="boolean" value="false"/>
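The frame layout used by the new EXCEPTION_PREAMBLE in the patch above can be summarized with a little arithmetic. The following C sketch takes the slot numbers, the 20-slot frame and the 16/48-byte minimum ABI frame from the patch; the enum and function names are assumptions made for this illustration only.

```c
#include <stddef.h>

/* Slot indices as used by the stl/ll lines in the patch. */
enum frame_slot {
    SLOT_R0  = 0,   /* r0..r12 occupy slots 0..12 */
    SLOT_R12 = 12,
    SLOT_LR  = 13,
    SLOT_CR  = 14,
    SLOT_CTR = 15,
    SLOT_XER = 16,
    FRAME_SLOTS = 20  /* size of the pushed exception frame in slots */
};

/* Byte offset of a slot from r1, given the register size in bytes. */
size_t slot_offset(unsigned slot, size_t ulong_size)
{
    return slot * ulong_size;
}

/* Total amount r1 moves down: exception frame plus minimum ABI frame. */
size_t total_push(size_t ulong_size, size_t abi_min_frame)
{
    return FRAME_SLOTS * ulong_size + abi_min_frame;
}
```

With these numbers a 32-bit build moves r1 down by 20 * 4 + 16 = 96 bytes and a 64-bit build by 20 * 8 + 48 = 208 bytes before calling into C.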
On 14.11.2010, at 02:48, Andreas Färber wrote:
Having 64-bit support as an option allows users to disable and test it before it gets officially removed.
Fork single-bitness macros EXCEPTION_{PREAMBLE,EPILOGUE}: Use assembler macros for things that are constant to avoid ; and , and use preprocessor macros to handle differences. Adopt QEMU coding style for new code.
Functional changes for ppc64:
- Don't clear MSR in preamble.
- Just save the minimum number of registers since 64-bit code will
save the full registers.
- Reserve 48 bytes of stack frame space for ppc64, according to
64-bit PowerPC ELF ABI supplement 1.9.
- Use RFI macro.
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
---
 arch/ppc/qemu/init.c           |    4 ++
 arch/ppc/qemu/start.S          |   92 +++++++++++++++++++++++++++++++++++++++-
 config/examples/ppc_config.xml |    1 +
 3 files changed, 96 insertions(+), 1 deletions(-)

diff --git a/arch/ppc/qemu/init.c b/arch/ppc/qemu/init.c
index 0b781d9..d17c843 100644
--- a/arch/ppc/qemu/init.c
+++ b/arch/ppc/qemu/init.c
@@ -303,6 +303,7 @@ cpu_g4_init(const struct cpudef *cpu)
     fword("finish-device");
 }

+#ifdef CONFIG_PPC_64BITSUPPORT
 /* In order to get 64 bit aware handlers that rescue all our GPRs from getting truncated to 32 bits, we need to patch the existing handlers so they jump to our 64 bit aware ones. */
@@ -322,6 +323,7 @@ ppc64_patch_handlers(void)
     asm ( "icbi 0, %0" : : "r"(dsi) );
     asm ( "icbi 0, %0" : : "r"(isi) );
 }
+#endif

 static void
 cpu_970_init(const struct cpudef *cpu)
@@ -341,10 +343,12 @@ cpu_970_init(const struct cpudef *cpu)

     fword("finish-device");

+#ifdef CONFIG_PPC_64BITSUPPORT
     /* The 970 is a PPC64 CPU, so we need to activate
      * 64bit aware interrupt handlers */

     ppc64_patch_handlers();
+#endif

     /* The 970 also implements the HIOR which we need to set to 0 */

diff --git a/arch/ppc/qemu/start.S b/arch/ppc/qemu/start.S
index e86bdfd..6cf20cf 100644
--- a/arch/ppc/qemu/start.S
+++ b/arch/ppc/qemu/start.S
@@ -14,6 +14,7 @@
  */

+#include "autoconf.h"
 #include "asm/asmdefs.h"
 #include "asm/processor.h"

@@ -24,6 +25,8 @@
 #define ILLEGAL_VECTOR( v ) .org __vectors + v ; vector__##v: bl trap_error ;
 #define VECTOR( v, dummystr ) .org __vectors + v ; vector__##v

+#ifdef CONFIG_PPC_64BITSUPPORT
 /* We're trying to use the same code for the ppc32 and ppc64 handlers here.
  * On ppc32 we only save/restore the registers, C considers volatile.
@@ -176,6 +179,89 @@
 #undef stl
 #undef ll

+#else
+
+#ifdef __powerpc64__
+
+#define ULONG_SIZE 8
+#define STACKFRAME_MINSIZE 48
+#define stl std
+#define ll ld
+
+#else
+
+#define ULONG_SIZE 4
+#define STACKFRAME_MINSIZE 16
+#define stl stw
+#define ll lwz
+
+#endif
+
+.macro EXCEPTION_PREAMBLE
+    mtsprg1 r1 /* scratch */
+    mfsprg0 r1 /* exception stack in sprg0 */
+    addi r1, r1, -(20 * ULONG_SIZE) /* push exception frame */
+
+    stl r0, ( 0 * ULONG_SIZE)(r1) /* save r0 */
+    mfsprg1 r0
+    stl r0, ( 1 * ULONG_SIZE)(r1) /* save r1 */
+    stl r2, ( 2 * ULONG_SIZE)(r1) /* save r2 */
+    stl r3, ( 3 * ULONG_SIZE)(r1) /* save r3 */
+    stl r4, ( 4 * ULONG_SIZE)(r1)
+    stl r5, ( 5 * ULONG_SIZE)(r1)
+    stl r6, ( 6 * ULONG_SIZE)(r1)
+    stl r7, ( 7 * ULONG_SIZE)(r1)
+    stl r8, ( 8 * ULONG_SIZE)(r1)
+    stl r9, ( 9 * ULONG_SIZE)(r1)
+    stl r10, (10 * ULONG_SIZE)(r1)
+    stl r11, (11 * ULONG_SIZE)(r1)
+    stl r12, (12 * ULONG_SIZE)(r1)
+
+    mflr r0
+    stl r0, (13 * ULONG_SIZE)(r1)
+    mfcr r0
+    stl r0, (14 * ULONG_SIZE)(r1)
+    mfctr r0
+    stl r0, (15 * ULONG_SIZE)(r1)
+    mfxer r0
+    stl r0, (16 * ULONG_SIZE)(r1)
+
+    /* 76(r1) unused */
+
+    addi r1, r1, -STACKFRAME_MINSIZE /* C ABI saves LR and SP */
+.endm
+
+.macro EXCEPTION_EPILOGUE
+    addi r1, r1, STACKFRAME_MINSIZE /* pop ABI frame */
+
+    ll r0, (13 * ULONG_SIZE)(r1)
+    mtlr r0
+    ll r0, (14 * ULONG_SIZE)(r1)
+    mtcr r0
+    ll r0, (15 * ULONG_SIZE)(r1)
+    mtctr r0
+    ll r0, (16 * ULONG_SIZE)(r1)
+    mtxer r0
+
+    ll r0, ( 0 * ULONG_SIZE)(r1)
+    ll r2, ( 2 * ULONG_SIZE)(r1)
+    ll r3, ( 3 * ULONG_SIZE)(r1)
+    ll r4, ( 4 * ULONG_SIZE)(r1)
+    ll r5, ( 5 * ULONG_SIZE)(r1)
+    ll r6, ( 6 * ULONG_SIZE)(r1)
+    ll r7, ( 7 * ULONG_SIZE)(r1)
+    ll r8, ( 8 * ULONG_SIZE)(r1)
+    ll r9, ( 9 * ULONG_SIZE)(r1)
+    ll r10, (10 * ULONG_SIZE)(r1)
+    ll r11, (11 * ULONG_SIZE)(r1)
+    ll r12, (12 * ULONG_SIZE)(r1)
+
+    ll r1, ( 1 * ULONG_SIZE)(r1) /* restore stack at last */
+    RFI
+.endm
I don't think this really belongs in this patch, no? :)
Alex
On 17.11.2010, at 02:32, Alexander Graf wrote:
On 14.11.2010, at 02:48, Andreas Färber wrote:
+    /* 76(r1) unused */
This comment is obviously outdated, just like in the old code path.
I don't think this really belongs in this patch, no? :)
--verbose please. The description is pretty clear on what this patch does and why.
RFI was committed in r955, so can be used here.
It doesn't make sense to me to #ifdef out code without providing the equivalent code path for ppc64, so this new code path does belong here (remember you agreed that having two parallel code paths was supposedly the only way for migration?).
If you're referring to the suggested use of a .macro, please discuss this with Blue in general and not some hundred lines down a patch without referring to the commit message where this is discussed. Blue has been asking me to turn preprocessor macros into C inline functions - this construct here seems the closest equivalent in assembler code and the advantages I see are detailed in the commit message. You have not yet voiced any particular reason to make this a preprocessor macro, other than your personal preference for preprocessor macros, which appears to conflict with Blue's dislike of preprocessor macros and my dislike of unnecessary multi-line preprocessor macros.
Andreas
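For readers following the frame layout discussed above, the slot offsets used by the quoted EXCEPTION_PREAMBLE macro can be tabulated with a small C helper. This is only a sketch under stated assumptions: `slot_offset` and `total_frame_size` are hypothetical names for illustration, not OpenBIOS code, and the slot numbers are simply read off the macro text.

```c
#include <assert.h>
#include <stddef.h>

/* Slot indices as they appear in the quoted EXCEPTION_PREAMBLE macro. */
enum {
    SLOT_R0 = 0,      /* r0..r12 use slots 0..12; slot 1 holds the old r1 */
    SLOT_LR = 13,
    SLOT_CR = 14,
    SLOT_CTR = 15,
    SLOT_XER = 16,
    FRAME_SLOTS = 20  /* the macro pushes 20 * ULONG_SIZE bytes */
};

/* Byte offset of a slot from r1, for ULONG_SIZE 4 (ppc) or 8 (ppc64).
 * Hypothetical helper, for illustration only. */
static size_t slot_offset(int slot, size_t ulong_size)
{
    assert(ulong_size == 4 || ulong_size == 8);
    assert(slot >= 0 && slot < FRAME_SLOTS);
    return (size_t)slot * ulong_size;
}

/* Total bytes subtracted from r1: exception frame plus minimal ABI frame. */
static size_t total_frame_size(size_t ulong_size, size_t stackframe_minsize)
{
    assert(ulong_size == 4 || ulong_size == 8);
    return FRAME_SLOTS * ulong_size + stackframe_minsize;
}
```

With ULONG_SIZE 4 the last slot (19) sits at offset 76, which matches the literal in the comment; with ULONG_SIZE 8 the same slot sits at 152, so a hard-coded 76 no longer names that slot.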
On 17.11.2010, at 20:45, Andreas Färber wrote:
Ah, sorry for not being clearer. I assumed the patch was about ripping out the ppc64 support from ppc32 builds. Adding another interrupt handler seemed out of scope. That's pretty much it :).
Alex
On Wed, Nov 17, 2010 at 7:45 PM, Andreas Färber andreas.faerber@web.de wrote:
I was going to comment that actually I haven't seen much gas macro use in Linux. But then I ran grep and while there really are no macros for Sparc (or x86_64), other architectures use them a lot. So macros are OK for me.
On 17.11.2010, at 20:58, Blue Swirl wrote:
Thanks, applied slightly modified version in r962.
Andreas
[snip]
Last night I finally made some small progress with ppc64: I picked up Alex' suggestion of using slbmte, this does work for ppc but didn't make a big change for ppc64. The 0x700 program exception turned out to be caused by a jump to the isi_exception function descriptor rather than the isi_exception() function. (Yet, the setup_mmu() function did not seem to have a function descriptor, despite both sitting in C code...)
I now get a 0x380 data segment exception, which seems caused by uses of TOC offsets in entry() with r2 being zero, leading to data accesses wrapping around into unmapped memory.
I thought we might be missing some ELF sections in the linker script but my tries based on `powerpc64-linux-gnu-ld --verbose` were unsuccessful. Is there a way to turn on warnings for sections dropped, to rule this out? Who's responsible for r2 setup - GCC-generated code or QEMU?
I'll flush my Forth queue now and will try to put together some more RFCs.
Andreas
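The function-descriptor issue described above can be modelled in a few lines of C. This is a hedged sketch: the struct follows the three-doubleword descriptor layout of the 64-bit PowerPC ELF ABI (ELFv1), while `struct func_desc` and `entry_from_descriptor` are hypothetical names chosen for illustration. Branching to the descriptor's own address executes data, whereas the code address has to be loaded from its first doubleword.

```c
#include <assert.h>
#include <stdint.h>

/* Function descriptor as laid out in the .opd section (ELFv1 ppc64 ABI). */
struct func_desc {
    uint64_t entry;  /* address of the function's first instruction */
    uint64_t toc;    /* TOC pointer value the function expects in r2 */
    uint64_t env;    /* environment pointer, not used by C */
};

/* Model of loading from offset 0 of the descriptor to get the real
 * branch target. Hypothetical helper, for illustration only. */
static uint64_t entry_from_descriptor(const struct func_desc *desc)
{
    assert(desc != 0);
    return desc->entry;
}
```

In this model the symbol `isi_exception` would name the descriptor, and only the value returned by `entry_from_descriptor` is a valid target for mtctr/bctrl.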
On 15.11.2010, at 21:56, Andreas Färber wrote:
I thought we might be missing some ELF sections in the linker script but my tries based on `powerpc64-linux-gnu-ld --verbose` were unsuccessful. Is there a way to turn on warnings for sections dropped, to rule this out? Who's responsible for r2 setup - GCC-generated code or QEMU?
r2 contains the GOT IIRC. But for ABI questions, it's probably best to consult Segher :). Unless I'm completely mistaken, usually the loader initializes r2, so in this case the asm code needs to set it up properly.
Alex
I thought we might be missing some ELF sections in the linker script but my tries based on `powerpc64-linux-gnu-ld --verbose` were unsuccessful. Is there a way to turn on warnings for sections dropped, to rule this out?
ld -M will show you everything you ever wanted to know, and much more. readelf -Wa is useful for many problems as well.
Who's responsible for r2 setup - GCC-generated code or QEMU?
r2 contains the GOT IIRC. But for ABI questions, it's probably best to consult Segher :). Unless I'm completely mistaken, usually the loader initializes r2, so in this case the asm code needs to set it up properly.
Depends what you call "loader". Usually your crt1 equivalent sets GPR2. It is probably a good idea to set it in all exception handlers as well (if they want to call C code, or need it otherwise).
Segher
On 16.11.2010, at 00:02, Segher Boessenkool wrote:
Who's responsible for r2 setup - GCC-generated code or QEMU?
r2 contains the GOT IIRC. But for ABI questions, it's probably best to consult Segher :). Unless I'm completely mistaken, usually the loader initializes r2, so in this case the asm code needs to set it up properly.
Depends what you call "loader". Usually your crt1 equivalent sets GPR2. It is probably a good idea to set it in all exception handlers as well (if they want to call C code, or need it otherwise).
But set it to what value? :)
Andreas
On 16.11.2010, at 00:11, Andreas Färber wrote:
On 16.11.2010, at 00:02, Segher Boessenkool wrote:
Who's responsible for r2 setup - GCC-generated code or QEMU?
r2 contains the GOT IIRC. But for ABI questions, it's probably best to consult Segher :). Unless I'm completely mistaken, usually the loader initializes r2, so in this case the asm code needs to set it up properly.
Depends what you call "loader". Usually your crt1 equivalent sets GPR2. It is probably a good idea to set it in all exception handlers as well (if they want to call C code, or need it otherwise).
But set it to what value? :)
This is what Linux does:
arch/powerpc/kernel/head_64.S:
/*
 * This puts the TOC pointer into r2, offset by 0x8000 (as expected
 * by the toolchain). It computes the correct value for wherever we
 * are running at the moment, using position-independent code.
 */
_GLOBAL(relative_toc)
	mflr	r0
	bcl	20,31,$+4
0:	mflr	r9
	ld	r2,(p_toc - 0b)(r9)
	add	r2,r2,r9
	mtlr	r0
	blr

p_toc:	.llong	__toc_start + 0x8000 - 0b
arch/powerpc/kernel/vmlinux.lds:
.got : AT(ADDR(.got) - (0xc000000000000000 -0x00000000)) {
	__toc_start = .;
	*(.got)
	*(.toc)
Maybe you can get away with default names somehow, not sure :).
Alex
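The arithmetic behind the quoted relative_toc routine can be checked with a small C model. It is a sketch only: `p_toc_value` and `relative_toc` are hypothetical C stand-ins for the assembler constant and routine, and the addresses in any usage are made up. The idea is that p_toc stores a link-time distance from label 0 to the biased TOC, and adding the run-time address of label 0 yields the right r2 wherever the image was loaded.

```c
#include <assert.h>
#include <stdint.h>

/* Link-time constant stored at p_toc: __toc_start + 0x8000 - 0b.
 * Unsigned arithmetic wraps, which mirrors 64-bit register behaviour. */
static uint64_t p_toc_value(uint64_t link_toc_start, uint64_t link_label0)
{
    return link_toc_start + 0x8000 - link_label0;
}

/* Run-time r2: address of label 0 as actually loaded (r9 in the quoted
 * code) plus the stored distance. Hypothetical model for illustration. */
static uint64_t relative_toc(uint64_t runtime_label0, uint64_t p_toc)
{
    assert(p_toc == p_toc); /* no precondition beyond a defined value */
    return runtime_label0 + p_toc;
}
```

For example, an image linked at 0xc000000000000000 but running at 0 gets the same result as if it had been linked at 0, because only the distance between the label and the TOC enters the sum.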
On 16.11.2010, at 00:02, Segher Boessenkool wrote:
Who's responsible for r2 setup - GCC-generated code or QEMU?
r2 contains the GOT IIRC. But for ABI questions, it's probably best to consult Segher :). Unless I'm completely mistaken, usually the loader initializes r2, so in this case the asm code needs to set it up properly.
Depends what you call "loader". Usually your crt1 equivalent sets GPR2. It is probably a good idea to set it in all exception handlers as well (if they want to call C code, or need it otherwise).
But set it to what value? :)
TOC base + 0x8000.
On an ELF binary like your Linux programs, this is actually stored in the function descriptor pointed to by e_entry.
Segher
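For reference, the function descriptor Segher mentions can be pictured as a three-word structure. The following is a minimal sketch, assuming the usual 64-bit PowerPC ELF layout of entry point, TOC pointer and environment pointer; the struct and helper names are hypothetical and do not exist in OpenBIOS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOC_BIAS 0x8000u  /* r2 = TOC base + 0x8000 */

/* One entry of the .opd section: what a ppc64 function symbol points at. */
struct func_desc {
    uint64_t entry;  /* address of the first instruction */
    uint64_t toc;    /* value the function expects in r2 */
    uint64_t env;    /* environment pointer, not used by C code */
};

/* Value to place in r2 for a TOC starting at toc_start. */
uint64_t toc_pointer(uint64_t toc_start)
{
    return toc_start + TOC_BIAS;
}

/* Equivalent of "ld rX, 0(rX)" on a descriptor address. */
uint64_t desc_entry(const struct func_desc *d)
{
    assert(d != NULL);
    return d->entry;
}

/* Equivalent of "ld r2, 8(rX)" on a descriptor address. */
uint64_t desc_toc(const struct func_desc *d)
{
    assert(d != NULL);
    return d->toc;
}

/* Hypothetical addresses, for illustration only. */
void example_func_desc(void)
{
    struct func_desc d = { 0xfff01000ull, toc_pointer(0xfff80000ull), 0 };

    assert(offsetof(struct func_desc, toc) == 8);
    assert(desc_entry(&d) == 0xfff01000ull);
    assert(desc_toc(&d) == 0xfff88000ull);
}
```

The 0x8000 bias lets signed 16-bit displacements from r2 reach 32 KiB in either direction, i.e. a full 64 KiB of TOC.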
Ugly but functional!
---
 arch/ppc/qemu/start.S    |   24 ++++++++++++++++++++++++
 arch/ppc64/qemu/ldscript |   10 ++++++++++
 2 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/arch/ppc/qemu/start.S b/arch/ppc/qemu/start.S
index eef4293..a3b727d 100644
--- a/arch/ppc/qemu/start.S
+++ b/arch/ppc/qemu/start.S
@@ -272,12 +272,18 @@ GLOBL(__vectors):

 call_dsi_exception:
        LOAD_REG_IMMEDIATE(r3, dsi_exception)
+#ifdef __powerpc64__
+       ld      r3, 0(r3)
+#endif
        mtctr   r3
        bctrl
        b       exception_return

 call_isi_exception:
        LOAD_REG_IMMEDIATE(r3, isi_exception)
+#ifdef __powerpc64__
+       ld      r3, 0(r3)
+#endif
        mtctr   r3
        bctrl
        b       exception_return
@@ -289,7 +295,11 @@ exception_return:
 __divide_error:
 trap_error:
        mflr    r3
+#ifdef __powerpc64__
+       b       .unexpected_excep
+#else
        b       unexpected_excep
+#endif

 VECTOR( 0x100, "SRE" ):
        b       _entry
@@ -445,8 +455,22 @@ GLOBL(_entry):

        /* save memory size in stack */

+#ifdef __powerpc64__
+       LOAD_REG_IMMEDIATE(r2, __toc_start)
+       addi    r2, r2, 0x4000
+       addi    r2, r2, 0x4000
+#endif
+
+#ifdef __powerpc64__
+       bl      .setup_mmu
+#else
        bl      setup_mmu
+#endif
+#ifdef __powerpc64__
+       bl      .entry
+#else
        bl      entry
+#endif
 1:     nop
        b       1b

diff --git a/arch/ppc64/qemu/ldscript b/arch/ppc64/qemu/ldscript
index 1d8aa8e..28f0b69 100644
--- a/arch/ppc64/qemu/ldscript
+++ b/arch/ppc64/qemu/ldscript
@@ -41,8 +41,18 @@ SECTIONS
         _data = .;
         *(.data)
         *(.data.*)
+        *(.toc1)
+        *(.branch_lt)
         _edata = .;
     }
+    .opd : {
+        *(.opd)
+    }
+    .got : {
+        __toc_start = .;
+        *(.got)
+        *(.toc)
+    }

     .bss ALIGN(4096): {
         _bss = .;
On Tue, Nov 23, 2010 at 8:00 AM, Andreas Färber andreas.faerber@web.de wrote:
Ugly but functional!
 arch/ppc/qemu/start.S    |   24 ++++++++++++++++++++++++
 arch/ppc64/qemu/ldscript |   10 ++++++++++
 2 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/arch/ppc/qemu/start.S b/arch/ppc/qemu/start.S
index eef4293..a3b727d 100644
--- a/arch/ppc/qemu/start.S
+++ b/arch/ppc/qemu/start.S
@@ -272,12 +272,18 @@ GLOBL(__vectors):

 call_dsi_exception:
        LOAD_REG_IMMEDIATE(r3, dsi_exception)
+#ifdef __powerpc64__
+       ld      r3, 0(r3)
+#endif

How about LOAD_REG_FUNC macro, which automatically performs the load?

        mtctr   r3
        bctrl
        b       exception_return

 call_isi_exception:
        LOAD_REG_IMMEDIATE(r3, isi_exception)
+#ifdef __powerpc64__
+       ld      r3, 0(r3)
+#endif
        mtctr   r3
        bctrl
        b       exception_return
@@ -289,7 +295,11 @@ exception_return:
 __divide_error:
 trap_error:
        mflr    r3
+#ifdef __powerpc64__
+       b       .unexpected_excep
+#else
        b       unexpected_excep
+#endif

This #ifdeffery could be avoided with a macro to add the dot for ppc64: b BRANCH_LABEL(unexpected_excep)

 VECTOR( 0x100, "SRE" ):
        b       _entry
@@ -445,8 +455,22 @@ GLOBL(_entry):

        /* save memory size in stack */

+#ifdef __powerpc64__
+       LOAD_REG_IMMEDIATE(r2, __toc_start)
+       addi    r2, r2, 0x4000
+       addi    r2, r2, 0x4000
+#endif
+
+#ifdef __powerpc64__
+       bl      .setup_mmu
+#else
        bl      setup_mmu
+#endif
+#ifdef __powerpc64__
+       bl      .entry
+#else
        bl      entry
+#endif
 1:     nop
        b       1b

diff --git a/arch/ppc64/qemu/ldscript b/arch/ppc64/qemu/ldscript
index 1d8aa8e..28f0b69 100644
--- a/arch/ppc64/qemu/ldscript
+++ b/arch/ppc64/qemu/ldscript
@@ -41,8 +41,18 @@ SECTIONS
         _data = .;
         *(.data)
         *(.data.*)
+        *(.toc1)
+        *(.branch_lt)
         _edata = .;
     }
+    .opd : {
+        *(.opd)
+    }
+    .got : {
+        __toc_start = .;
+        *(.got)
+        *(.toc)
+    }

     .bss ALIGN(4096): {
         _bss = .;
--
1.7.3
Am 23.11.2010 um 20:52 schrieb Blue Swirl:
On Tue, Nov 23, 2010 at 8:00 AM, Andreas Färber <andreas.faerber@web.de> wrote:
Ugly but functional!

 arch/ppc/qemu/start.S    |   24 ++++++++++++++++++++++++
 arch/ppc64/qemu/ldscript |   10 ++++++++++
 2 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/arch/ppc/qemu/start.S b/arch/ppc/qemu/start.S
index eef4293..a3b727d 100644
--- a/arch/ppc/qemu/start.S
+++ b/arch/ppc/qemu/start.S
@@ -272,12 +272,18 @@ GLOBL(__vectors):

 call_dsi_exception:
        LOAD_REG_IMMEDIATE(r3, dsi_exception)
+#ifdef __powerpc64__
+       ld      r3, 0(r3)
+#endif

How about LOAD_REG_FUNC macro, which automatically performs the load?

Yeah, working on it, I had LOAD_REG_ADDR() in an earlier FYI patch. Depending on whether the dotted name is globally visible we could just load that instead and spare us the ld.

@@ -289,7 +295,11 @@ exception_return:
 __divide_error:
 trap_error:
        mflr    r3
+#ifdef __powerpc64__
+       b       .unexpected_excep
+#else
        b       unexpected_excep
+#endif
This #ifdeffery could be avoided with a macro to add the dot for ppc64: b BRANCH_LABEL(unexpected_excep)
Right. Alex suggested the same. I like your macro name. During local testing I do find my #ifdef'ery more convenient for going back and forth.
Andreas
In the ppc64 ELF ABI, similar to ia64, a function's symbol does not point directly at its machine instructions but at a function descriptor. For ppc64 the descriptor consists of a triple of entry point, TOC base and environment pointer.
Introduce macros to facilitate handling this. Names suggested by Blue.
Deliberately don't touch the client interface yet, as there's more work to do.
Cc: Alexander Graf agraf@suse.de
Cc: Blue Swirl blauwirbel@gmail.com
Signed-off-by: Andreas Färber andreas.faerber@web.de
---
 arch/ppc/qemu/start.S      |   10 +++++-----
 include/arch/ppc/asmdefs.h |   13 +++++++++++++
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/ppc/qemu/start.S b/arch/ppc/qemu/start.S
index eef4293..f5c2f24 100644
--- a/arch/ppc/qemu/start.S
+++ b/arch/ppc/qemu/start.S
@@ -271,13 +271,13 @@ GLOBL(__vectors):
        b       1b

 call_dsi_exception:
-       LOAD_REG_IMMEDIATE(r3, dsi_exception)
+       LOAD_REG_FUNC(r3, dsi_exception)
        mtctr   r3
        bctrl
        b       exception_return

 call_isi_exception:
-       LOAD_REG_IMMEDIATE(r3, isi_exception)
+       LOAD_REG_FUNC(r3, isi_exception)
        mtctr   r3
        bctrl
        b       exception_return
@@ -289,7 +289,7 @@ exception_return:
 __divide_error:
 trap_error:
        mflr    r3
-       b       unexpected_excep
+       b       BRANCH_LABEL(unexpected_excep)

 VECTOR( 0x100, "SRE" ):
        b       _entry
@@ -445,8 +445,8 @@ GLOBL(_entry):

        /* save memory size in stack */

-       bl      setup_mmu
-       bl      entry
+       bl      BRANCH_LABEL(setup_mmu)
+       bl      BRANCH_LABEL(entry)
 1:     nop
        b       1b

diff --git a/include/arch/ppc/asmdefs.h b/include/arch/ppc/asmdefs.h
index 51570ea..9c85ea5 100644
--- a/include/arch/ppc/asmdefs.h
+++ b/include/arch/ppc/asmdefs.h
@@ -76,24 +76,37 @@
 /************************************************************************/

 #ifdef __powerpc64__
+
 #define LOAD_REG_IMMEDIATE(D, x) \
        lis  (D), (x)@highest ; \
        ori  (D), (D), (x)@higher ; \
        sldi (D), (D), 32 ; \
        oris (D), (D), (x)@h ; \
        ori  (D), (D), (x)@l
+
+#define LOAD_REG_FUNC(D, x) \
+       LOAD_REG_IMMEDIATE((D), (x)) ; \
+       ld   (D), 0(D)
+
 #else
+
 #define LOAD_REG_IMMEDIATE(D, x) \
        lis  (D), HA(x) ; \
        addi (D), (D), LO(x)
+
+#define LOAD_REG_FUNC(D, x) \
+       LOAD_REG_IMMEDIATE((D), (x))
+
 #endif

 #ifdef __powerpc64__
 #define RFI rfid
 #define MTMSRD(r) mtmsrd r
+#define BRANCH_LABEL(name) . ## name
 #else
 #define RFI rfi
 #define MTMSRD(r) mtmsr r
+#define BRANCH_LABEL(name) name
 #endif

 #ifndef __darwin__
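To see how the five-instruction ppc64 LOAD_REG_IMMEDIATE sequence assembles a 64-bit constant from four 16-bit pieces, here is a small C model of it. It is a sketch for illustration only; the helper names are invented and it models just the arithmetic of lis/ori/sldi/oris/ori, not the assembler relocations themselves.

```c
#include <assert.h>
#include <stdint.h>

/* The 16-bit pieces selected by @highest, @higher, @h and @l. */
static uint16_t imm_highest(uint64_t x) { return (uint16_t)(x >> 48); }
static uint16_t imm_higher(uint64_t x)  { return (uint16_t)(x >> 32); }
static uint16_t imm_h(uint64_t x)       { return (uint16_t)(x >> 16); }
static uint16_t imm_l(uint64_t x)       { return (uint16_t)x; }

/* Model of the ppc64 LOAD_REG_IMMEDIATE(D, x) sequence. */
uint64_t load_reg_immediate(uint64_t x)
{
    /* lis D, x@highest: immediate << 16, sign-extended to 64 bits */
    uint64_t d = (uint64_t)imm_highest(x) << 16;
    if (d & 0x80000000ull)
        d |= 0xffffffff00000000ull;

    d |= imm_higher(x);              /* ori  D, D, x@higher */
    d <<= 32;                        /* sldi D, D, 32 (drops the sign bits) */
    d |= (uint64_t)imm_h(x) << 16;   /* oris D, D, x@h */
    d |= imm_l(x);                   /* ori  D, D, x@l */
    return d;
}

/* The sequence must reproduce any 64-bit value exactly. */
void example_load_reg_immediate(void)
{
    assert(load_reg_immediate(0xfff0123456789abcull) == 0xfff0123456789abcull);
    assert(load_reg_immediate(0x8000ull) == 0x8000ull);
    assert(load_reg_immediate(0xffffffffffffffffull) == 0xffffffffffffffffull);
}
```

Because ori and oris do not sign-extend their immediates, the unadjusted @h and @l halves can be used as they are; only the initial lis sign-extends, and the sldi shifts those bits out again.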
r2 points to TOC base, __toc_start + 0x8000. This value is stored as part of the function descriptor.
Include some related ELF sections in the linker script.
Cc: Alexander Graf agraf@suse.de
Cc: Segher Boessenkool segher@kernel.crashing.org
Signed-off-by: Andreas Färber andreas.faerber@web.de
---
 arch/ppc/qemu/start.S    |    7 +++++++
 arch/ppc64/qemu/ldscript |   10 ++++++++++
 2 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/arch/ppc/qemu/start.S b/arch/ppc/qemu/start.S
index f5c2f24..4b6df3f 100644
--- a/arch/ppc/qemu/start.S
+++ b/arch/ppc/qemu/start.S
@@ -445,6 +445,13 @@ GLOBL(_entry):

        /* save memory size in stack */

+#ifdef __powerpc64__
+       /* set up TOC pointer */
+
+       LOAD_REG_IMMEDIATE(r2, setup_mmu)
+       ld      r2, 8(r2)
+#endif
+
        bl      BRANCH_LABEL(setup_mmu)
        bl      BRANCH_LABEL(entry)
 1:     nop
diff --git a/arch/ppc64/qemu/ldscript b/arch/ppc64/qemu/ldscript
index 1d8aa8e..7a22903 100644
--- a/arch/ppc64/qemu/ldscript
+++ b/arch/ppc64/qemu/ldscript
@@ -41,8 +41,18 @@ SECTIONS
         _data = .;
         *(.data)
         *(.data.*)
+        *(.toc1)
+        *(.branch_lt)
         _edata = .;
     }
+    .opd : {
+        *(.opd)
+    }
+    .got : {
+        __toc_start = .;
+        *(.got)
+        *(.toc)
+    }

     .bss ALIGN(4096): {
         _bss = .;
Am 23.11.2010 um 22:40 schrieb Andreas Färber:
r2 points to TOC base, __toc_start + 0x8000. This value is stored as part of the function descriptor.
Include some related ELF sections in the linker script.
Cc: Alexander Graf agraf@suse.de Cc: Segher Boessenkool segher@kernel.crashing.org Signed-off-by: Andreas Färber andreas.faerber@web.de
If no one complains, I'll apply these two tomorrow.
Next in line are the addressing/ofmem issues. Once those are in I'm going to do a round of small ppc cleanups.
Andreas
Am 23.11.2010 um 22:40 schrieb Andreas Färber:
r2 points to TOC base, __toc_start + 0x8000. This value is stored as part of the function descriptor.
Include some related ELF sections in the linker script.
Cc: Alexander Graf agraf@suse.de Cc: Segher Boessenkool segher@kernel.crashing.org Signed-off-by: Andreas Färber andreas.faerber@web.de
Applied as r968-969.
Macro name as seen in Linux. Use of macro suggested by Alex.
Cc: Alexander Graf agraf@suse.de
Signed-off-by: Andreas Färber andreas.faerber@web.de
---
 arch/ppc/qemu/start.S      |    2 +-
 include/arch/ppc/asmdefs.h |    6 ++++++
 2 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/ppc/qemu/start.S b/arch/ppc/qemu/start.S
index 4db6462..e86bdfd 100644
--- a/arch/ppc/qemu/start.S
+++ b/arch/ppc/qemu/start.S
@@ -234,7 +234,7 @@ VECTOR( 0x800, "FPU" ):
        ori     r3,r3,0x2000
        mtsrr1  r3
        mfsprg1 r3
-       rfi
+       RFI

 ILLEGAL_VECTOR( 0x900 )
 ILLEGAL_VECTOR( 0xa00 )
diff --git a/include/arch/ppc/asmdefs.h b/include/arch/ppc/asmdefs.h
index 9da4124..4e22156 100644
--- a/include/arch/ppc/asmdefs.h
+++ b/include/arch/ppc/asmdefs.h
@@ -88,6 +88,12 @@
        addi (D), (D), LO(x)
 #endif

+#ifdef __powerpc64__
+#define RFI rfid
+#else
+#define RFI rfi
+#endif
+
 #ifndef __darwin__
 #define GLOBL( name ) .globl name ; name
 #define EXTERN( name ) name
On 13.11.2010, at 14:47, Andreas Färber wrote:
Macro name as seen in Linux. Use of macro suggested by Alex.
Cc: Alexander Graf agraf@suse.de Signed-off-by: Andreas Färber andreas.faerber@web.de
Signed-off-by: Alexander Graf agraf@suse.de
Am 13.11.2010 um 16:37 schrieb Alexander Graf:
On 13.11.2010, at 14:47, Andreas Färber wrote:
Macro name as seen in Linux. Use of macro suggested by Alex.
Cc: Alexander Graf agraf@suse.de Signed-off-by: Andreas Färber andreas.faerber@web.de
Signed-off-by: Alexander Graf agraf@suse.de
Thanks, applied in r955.
Andreas
On 07.11.2010, at 23:13, Andreas Färber wrote:
Am 01.11.2010 um 17:36 schrieb Andreas Färber:
Latest state with local patches is that hell breaks loose once the MMU is set up. I get a 0x400 (ISI) exception and when the bctrl to isi_exception() is executed, we end up at trap_error, where it branches to unexpected_excep() and tries to printk() to the serial port that's not yet set up. I'll put a few patches together.
Since r945 everything except for the trampoline issue should be in SVN.
I've made no more progress throughout the week though:
Directly after we set the MSR_IR|MSR_DR bits in the MSR (arch/ppc/qemu/ofmem.c:setup_mmu), we get an ISI exception and end up in arch/ppc/qemu/start.S:vector__0x400 (the 0xfffxxxxx one). We proceed up to the bctrl which should take us to arch/ppc/qemu/ofmem.c:isi_exception, but then get a 0x700 program exception. The value in ctr looks sensible, it's some 0xfffxxxxx address.
i) I read that mtsrin is not allowed in 64-bit mode and that its results are unpredictable, so I tried switching MSR_SF off before and back on after the loop, without luck.
ii) If I exit the setup_mmu() function without turning the MMU on, we proceed to arch/ppc/qemu/init.c:entry() but are unsuccessful reading the magic fw_cfg signature. Stepping through the code it seemed as if some variable assignments like in drivers/fw_cfg.c:fw_cfg_init() were having no effect - could that be due to OpenBIOS code execution happening in ROM rather than ea_to_phys()-mapped to RAM? (i.e., write-only storage?:)) Or would this be some memory caching issue for the fw_cfg ports?
iii) Before turning on the MMU, I tried implementing the early-mapping of pages by calling hash_page() from ofmem_arch_early_map_pages() and calling ofmem_map() for the ROM-to-RAM translation and for identity-mapping the code. This leads to a hang in libopenbios/ofmem_common.c:ofmem_update_memory_available() in a code path (a printk in ofmem_realloc()) that would normally only be taken if libopenbios/ofmem_common.c:s_phandle_memory were non-zero, at a point where it should still be zero.
Any clue why ppc works but ppc64 doesn't?
You could try to enable the debug code in target-ppc/helper.c :). That maybe tells you more.
Alex
On 07.11.2010, at 23:13, Andreas Färber wrote:
Am 01.11.2010 um 17:36 schrieb Andreas Färber:
Latest state with local patches is that hell breaks loose once the MMU is set up. I get a 0x400 (ISI) exception and when the bctrl to isi_exception() is executed, we end up at trap_error, where it branches to unexpected_excep() and tries to printk() to the serial port that's not yet set up. I'll put a few patches together.
Since r945 everything except for the trampoline issue should be in SVN.
I've made no more progress throughout the week though:
Directly after we set the MSR_IR|MSR_DR bits in the MSR (arch/ppc/qemu/ofmem.c:setup_mmu), we get an ISI exception and end up in arch/ppc/qemu/start.S:vector__0x400 (the 0xfffxxxxx one). We proceed up to the bctrl which should take us to arch/ppc/qemu/ofmem.c:isi_exception, but then get a 0x700 program exception. The value in ctr looks sensible, it's some 0xfffxxxxx address.
i) I read that mtsrin is not allowed in 64-bit mode and that its results are unpredictable, so I tried switching MSR_SF off before and back on after the loop, without luck.
The mtsrin implementation is a hack. Most PPC cores don't support it at all anymore. But qemu and kvm are fine, so it's a very easy way of setting up SLB entries.
If you want to go the "correct" route, just convert mtsrin to slbmte. That way we could potentially run on real hardware too ;-).
Alex
Set up SLBs with slbmte instead of mtsrin, suggested by Alex. Adopt SLB example code from IBM application note.
Cc: Alexander Graf agraf@suse.de
Signed-off-by: Andreas Färber andreas.faerber@web.de
---
 arch/ppc/qemu/ofmem.c |   35 ++++++++++++++++++++++++++++++++---
 1 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/arch/ppc/qemu/ofmem.c b/arch/ppc/qemu/ofmem.c
index e8b0b24..85b9956 100644
--- a/arch/ppc/qemu/ofmem.c
+++ b/arch/ppc/qemu/ofmem.c
@@ -393,7 +393,7 @@ void
 setup_mmu( unsigned long ramsize )
 {
        ofmem_t *ofmem;
-       unsigned long sdr1, sr_base, msr;
+       unsigned long sdr1, msr;
        unsigned long hash_base;
        unsigned long hash_mask = 0xfff00000; /* alignment for ppc64 */
        int i;
@@ -405,13 +405,42 @@ setup_mmu( unsigned long ramsize )
        sdr1 = hash_base | ((HASH_SIZE-1) >> 16);
        asm volatile("mtsdr1 %0" :: "r" (sdr1) );

+#ifdef __powerpc64__
+#define SLB_SIZE 64
+#else
+#define SLB_SIZE 16
+#endif
+#if 1//def __powerpc64__
+#if 1
+       /* Initialize SLBs */
+       for (i = 0; i < SLB_SIZE; i++) {
+               unsigned long rs = (i << 12) | (0 << 7);
+               unsigned long rb = ((unsigned long)i << 28) | (0 << 27) | i;
+               asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
+       }
+       /* Invalidate SLBs */
+       for (i = 1; i < SLB_SIZE; i++) {
+               unsigned long rb = ((unsigned long)i << 28) | (0 << 27);
+               asm volatile("slbie %0" :: "r" (rb) : "memory");
+       }
+#endif
+       /* Set SLBs */
+       for (i = 0; i < 16; i++) {
+               unsigned long rs = ((0x400 + i) << 12) | (0x10 << 7);
+               unsigned long rb = ((unsigned long)i << 28) | (1 << 27) | i;
+               asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
+       }
+       asm volatile("isync" ::: "memory");
+#else
        /* Segment Register */
-
-       sr_base = SEGR_USER | SEGR_BASE ;
+       {
+       unsigned long sr_base = SEGR_USER | SEGR_BASE ;
        for( i=0; i<16; i++ ) {
                int j = i << 28;
                asm volatile("mtsrin %0,%1" :: "r" (sr_base + i), "r" (j) );
        }
+       }
+#endif

        ofmem = ofmem_arch_get_private();
        memset(ofmem, 0, sizeof(ofmem_t));
Dereference function descriptors.
---
Just putting this out there. In addition to function descriptor deref (still in need of macros like LOAD_REG_ADDR_IMMEDIATE and LOAD_REG_ADDR) I checked whether it makes any difference whether we simulate the original ba by a bctr - not noticeably. Also FYI my unsuccessful ldscript attempts.

 arch/ppc/qemu/start.S    |   30 ++++++++++++++++++++++++++++++
 arch/ppc64/qemu/ldscript |   43 +++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 71 insertions(+), 2 deletions(-)
diff --git a/arch/ppc/qemu/start.S b/arch/ppc/qemu/start.S
index 6cf20cf..1a63082 100644
--- a/arch/ppc/qemu/start.S
+++ b/arch/ppc/qemu/start.S
@@ -274,15 +274,33 @@ GLOBL(__vectors):

 call_dsi_exception:
        LOAD_REG_IMMEDIATE(r3, dsi_exception)
+#ifdef __powerpc64__
+       ld      r3, 0(r3)
+#endif
        mtctr   r3
        bctrl
+#if 1
+       LOAD_REG_IMMEDIATE(r3, exception_return)
+       mtctr   r3
+       bctr
+#else
        b       exception_return
+#endif

 call_isi_exception:
        LOAD_REG_IMMEDIATE(r3, isi_exception)
+#ifdef __powerpc64__
+       ld      r3, 0(r3)
+#endif
        mtctr   r3
        bctrl
+#if 1
+       LOAD_REG_IMMEDIATE(r3, exception_return)
+       mtctr   r3
+       bctr
+#else
        b       exception_return
+#endif

 exception_return:
        EXCEPTION_EPILOGUE
@@ -363,6 +381,12 @@ GLOBL(__vectors_end):
 /* entry */
 /************************************************************************/

+#ifdef __powerpc64__
+#define LOAD_REG_ADDR(reg, name) ld (reg), name@got(r2)
+#else
+#define LOAD_REG_ADDR(reg, name) LOAD_REG_IMMEDIATE(reg, name)
+#endif
+
 GLOBL(_entry):

 #ifdef CONFIG_PPC_64BITSUPPORT
@@ -448,7 +472,13 @@ GLOBL(_entry):
        /* save memory size in stack */

        bl      setup_mmu
+#if 0
+       LOAD_REG_ADDR(r3, entry)
+       mtctr   r3
+       bctrl
+#else
        bl      entry
+#endif
 1:     nop
        b       1b
diff --git a/arch/ppc64/qemu/ldscript b/arch/ppc64/qemu/ldscript
index 1d8aa8e..2849a45 100644
--- a/arch/ppc64/qemu/ldscript
+++ b/arch/ppc64/qemu/ldscript
@@ -26,9 +26,41 @@ SECTIONS

     . = TEXT_ADDR;
     /* Normal sections */
+    .rela.dyn :
+    {
+        *(.rela.opd)
+        *(.rela.init)
+        *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*)
+        *(.rela.fini)
+        *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*)
+        *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*)
+        *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*)
+        *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*)
+        *(.rela.ctors)
+        *(.rela.dtors)
+        *(.rela.got)
+        *(.rela.toc)
+        *(.rela.branch_lt)
+        *(.rela.sdata .rela.sdata.* .rela.gnu.linkonce.s.*)
+        *(.rela.sbss .rela.sbss.* .rela.gnu.linkonce.sb.*)
+        *(.rela.sdata2 .rela.sdata2.* .rela.gnu.linkonce.s2.*)
+        *(.rela.sbss2 .rela.sbss2.* .rela.gnu.linkonce.sb2.*)
+        *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*)
+        PROVIDE_HIDDEN (__rel_iplt_start = .);
+        PROVIDE_HIDDEN (__rel_iplt_end = .);
+        PROVIDE_HIDDEN (__rela_iplt_start = .);
+        *(.rela.iplt)
+        PROVIDE_HIDDEN (__rela_iplt_end = .);
+    }
+    .rela.plt :
+    {
+        *(.rela.plt)
+    }
+    .rela.tocbss : { *(.rela.tocbss) }
     .text ALIGN(4096): {
         *(.text)
         *(.text.*)
+        *(.sfpr .glink)
     }

     .rodata ALIGN(4096): {
@@ -41,13 +73,20 @@ SECTIONS
         _data = .;
         *(.data)
         *(.data.*)
-        _edata = .;
     }
+    .data1 : { *(.data1) }
+    .toc1 : ALIGN(8) { *(.toc1) }
+    .opd : ALIGN(8) { KEEP (*(.opd)) }
+    .got : ALIGN(8) { *(.got .toc) }
+    _edata = .;

-    .bss ALIGN(4096): {
         _bss = .;
+    .tocbss : ALIGN(8) { *(.tocbss) }
+    .bss ALIGN(4096): {
         *(.sbss)
         *(.sbss.*)
+        *(.plt)
+        *(.iplt)
         *(.bss)
         *(.bss.*)
         *(COMMON)
On 16.11.2010, at 00:39, Andreas Färber wrote:
Dereference function descriptors.
Just putting this out there. In addition to function descriptor deref (still in need of macros like LOAD_REG_ADDR_IMMEDIATE and LOAD_REG_ADDR) I checked whether it makes any difference whether we simulate the original ba by a bctr - not noticeably. Also FYI my unsuccessful ldscript attempts.

 arch/ppc/qemu/start.S    |   30 ++++++++++++++++++++++++++++++
 arch/ppc64/qemu/ldscript |   43 +++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/arch/ppc/qemu/start.S b/arch/ppc/qemu/start.S
index 6cf20cf..1a63082 100644
--- a/arch/ppc/qemu/start.S
+++ b/arch/ppc/qemu/start.S
@@ -274,15 +274,33 @@ GLOBL(__vectors):

 call_dsi_exception:
        LOAD_REG_IMMEDIATE(r3, dsi_exception)
+#ifdef __powerpc64__
+       ld      r3, 0(r3)
+#endif
Have you checked if .dsi_exception is available? Usually the dotted one is the version without indirection.
Alex
Am 16.11.2010 um 00:43 schrieb Alexander Graf:
On 16.11.2010, at 00:39, Andreas Färber wrote:
Dereference function descriptors.
Just putting this out there. In addition to function descriptor deref (still in need of macros like LOAD_REG_ADDR_IMMEDIATE and LOAD_REG_ADDR) I checked whether it makes any difference whether we simulate the original ba by a bctr - not noticeably. Also FYI my unsuccessful ldscript attempts.

 arch/ppc/qemu/start.S    |   30 ++++++++++++++++++++++++++++++
 arch/ppc64/qemu/ldscript |   43 +++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/arch/ppc/qemu/start.S b/arch/ppc/qemu/start.S
index 6cf20cf..1a63082 100644
--- a/arch/ppc/qemu/start.S
+++ b/arch/ppc/qemu/start.S
@@ -274,15 +274,33 @@ GLOBL(__vectors):

 call_dsi_exception:
        LOAD_REG_IMMEDIATE(r3, dsi_exception)
+#ifdef __powerpc64__
+       ld      r3, 0(r3)
+#endif
Have you checked if .dsi_exception is available? Usually the dotted one is the version without indirection.
Not yet, but that wouldn't spare us the special-handling for ppc64 either, since ppc wouldn't know the dotted version, right? Should we #define it then? Linux uses a LOAD_REG_ADDR() macro, going via the GOT, which wouldn't work ATM.
Andreas
On 21.11.2010, at 11:44, Andreas Färber wrote:
Am 16.11.2010 um 00:43 schrieb Alexander Graf:
On 16.11.2010, at 00:39, Andreas Färber wrote:
Dereference function descriptors.
Just putting this out there. In addition to function descriptor deref (still in need of macros like LOAD_REG_ADDR_IMMEDIATE and LOAD_REG_ADDR) I checked whether it makes any difference whether we simulate the original ba by a bctr - not noticeably. Also FYI my unsuccessful ldscript attempts.

 arch/ppc/qemu/start.S    |   30 ++++++++++++++++++++++++++++++
 arch/ppc64/qemu/ldscript |   43 +++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/arch/ppc/qemu/start.S b/arch/ppc/qemu/start.S
index 6cf20cf..1a63082 100644
--- a/arch/ppc/qemu/start.S
+++ b/arch/ppc/qemu/start.S
@@ -274,15 +274,33 @@ GLOBL(__vectors):

 call_dsi_exception:
        LOAD_REG_IMMEDIATE(r3, dsi_exception)
+#ifdef __powerpc64__
+       ld      r3, 0(r3)
+#endif
Have you checked if .dsi_exception is available? Usually the dotted one is the version without indirection.
Not yet, but that wouldn't spare us the special-handling for ppc64 either, since ppc wouldn't know the dotted version, right? Should we #define it then? Linux uses a LOAD_REG_ADDR() macro, going via the GOT, which wouldn't work ATM.
Yeah, in the first couple versions of ppc kvm code, I just #define'd it :). The code definitely is more readable without #ifdefs in the middle of it. Maybe you can even get some clever preprocessor magic to append the dot automatically:
LOAD_CCALL_IMMEDIATE(r3, dsi_exception)
#define LOAD_CCALL_IMMEDIATE(a, b) LOAD_REG_IMMEDIATE(a, . # b)
or so, no idea if it's actually possible :)
Alex
On 16.11.2010, at 00:39, Andreas Färber wrote:
Set up SLBs with slbmte instead of mtsrin, suggested by Alex. Adopt SLB example code from IBM application note.
Cc: Alexander Graf agraf@suse.de Signed-off-by: Andreas Färber andreas.faerber@web.de
 arch/ppc/qemu/ofmem.c |   35 ++++++++++++++++++++++++++++++++---
 1 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/arch/ppc/qemu/ofmem.c b/arch/ppc/qemu/ofmem.c
index e8b0b24..85b9956 100644
--- a/arch/ppc/qemu/ofmem.c
+++ b/arch/ppc/qemu/ofmem.c
@@ -393,7 +393,7 @@ void
 setup_mmu( unsigned long ramsize )
 {
        ofmem_t *ofmem;
-       unsigned long sdr1, sr_base, msr;
+       unsigned long sdr1, msr;
        unsigned long hash_base;
        unsigned long hash_mask = 0xfff00000; /* alignment for ppc64 */
        int i;
@@ -405,13 +405,42 @@ setup_mmu( unsigned long ramsize )
        sdr1 = hash_base | ((HASH_SIZE-1) >> 16);
        asm volatile("mtsdr1 %0" :: "r" (sdr1) );

+#ifdef __powerpc64__
+#define SLB_SIZE 64
+#else
+#define SLB_SIZE 16
+#endif
+#if 1//def __powerpc64__
+#if 1
+       /* Initialize SLBs */
+       for (i = 0; i < SLB_SIZE; i++) {
+               unsigned long rs = (i << 12) | (0 << 7);
+               unsigned long rb = ((unsigned long)i << 28) | (0 << 27) | i;
+               asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
+       }
+       /* Invalidate SLBs */
+       for (i = 1; i < SLB_SIZE; i++) {
+               unsigned long rb = ((unsigned long)i << 28) | (0 << 27);
+               asm volatile("slbie %0" :: "r" (rb) : "memory");
+       }
+#endif
+       /* Set SLBs */
+       for (i = 0; i < 16; i++) {
+               unsigned long rs = ((0x400 + i) << 12) | (0x10 << 7);
+               unsigned long rb = ((unsigned long)i << 28) | (1 << 27) | i;
+               asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
PPC32 doesn't have an SLB, only SRs :). So there you still need mtsrin (or mtsr).
Alex
Am 16.11.2010 um 00:41 schrieb Alexander Graf:
On 16.11.2010, at 00:39, Andreas Färber wrote:
Set up SLBs with slbmte instead of mtsrin, suggested by Alex. Adopt SLB example code from IBM application note.
Cc: Alexander Graf agraf@suse.de Signed-off-by: Andreas Färber andreas.faerber@web.de
 arch/ppc/qemu/ofmem.c |   35 ++++++++++++++++++++++++++++++++---
 1 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/arch/ppc/qemu/ofmem.c b/arch/ppc/qemu/ofmem.c
index e8b0b24..85b9956 100644
--- a/arch/ppc/qemu/ofmem.c
+++ b/arch/ppc/qemu/ofmem.c
@@ -393,7 +393,7 @@ void
 setup_mmu( unsigned long ramsize )
 {
        ofmem_t *ofmem;
-       unsigned long sdr1, sr_base, msr;
+       unsigned long sdr1, msr;
        unsigned long hash_base;
        unsigned long hash_mask = 0xfff00000; /* alignment for ppc64 */
        int i;
@@ -405,13 +405,42 @@ setup_mmu( unsigned long ramsize )
        sdr1 = hash_base | ((HASH_SIZE-1) >> 16);
        asm volatile("mtsdr1 %0" :: "r" (sdr1) );

+#ifdef __powerpc64__
+#define SLB_SIZE 64
+#else
+#define SLB_SIZE 16
+#endif
+#if 1//def __powerpc64__
+#if 1
+       /* Initialize SLBs */
+       for (i = 0; i < SLB_SIZE; i++) {
+               unsigned long rs = (i << 12) | (0 << 7);
+               unsigned long rb = ((unsigned long)i << 28) | (0 << 27) | i;
+               asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
+       }
+       /* Invalidate SLBs */
+       for (i = 1; i < SLB_SIZE; i++) {
+               unsigned long rb = ((unsigned long)i << 28) | (0 << 27);
+               asm volatile("slbie %0" :: "r" (rb) : "memory");
+       }
+#endif
+       /* Set SLBs */
+       for (i = 0; i < 16; i++) {
+               unsigned long rs = ((0x400 + i) << 12) | (0x10 << 7);
+               unsigned long rb = ((unsigned long)i << 28) | (1 << 27) | i;
+               asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
PPC32 doesn't have an SLB, only SRs :). So there you still need mtsrin (or mtsr).
Thanks for the reminder. Can we agree then that OpenBIOS/ppc64 only needs to care about SLBs? What I've been testing here though is that OpenBIOS/ppc on ppc64-softmmu doesn't break through my changes. And on ppc unsigned long is 32 bits only. :) Obviously needs cleanup, just an RFC.
I'd be interested to hear if this code that I ported is really necessary or correct (first setting them up so that they don't refer to the same *SID, then explicitly invalidating them and then setting up the 16 ones we care about for real).
Also, is SLB_SIZE 64 a universal number? This is from a document on the 970 MMU, and it's supposedly implementation-specific. Sounds scary.
Andreas
On 16.11.2010, at 00:52, Andreas Färber wrote:
Am 16.11.2010 um 00:41 schrieb Alexander Graf:
On 16.11.2010, at 00:39, Andreas Färber wrote:
Set up SLBs with slbmte instead of mtsrin, suggested by Alex. Adopt SLB example code from IBM application note.
Cc: Alexander Graf agraf@suse.de Signed-off-by: Andreas Färber andreas.faerber@web.de
 arch/ppc/qemu/ofmem.c |   35 ++++++++++++++++++++++++++++++++---
 1 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/arch/ppc/qemu/ofmem.c b/arch/ppc/qemu/ofmem.c
index e8b0b24..85b9956 100644
--- a/arch/ppc/qemu/ofmem.c
+++ b/arch/ppc/qemu/ofmem.c
@@ -393,7 +393,7 @@ void
 setup_mmu( unsigned long ramsize )
 {
        ofmem_t *ofmem;
-       unsigned long sdr1, sr_base, msr;
+       unsigned long sdr1, msr;
        unsigned long hash_base;
        unsigned long hash_mask = 0xfff00000; /* alignment for ppc64 */
        int i;
@@ -405,13 +405,42 @@ setup_mmu( unsigned long ramsize )
        sdr1 = hash_base | ((HASH_SIZE-1) >> 16);
        asm volatile("mtsdr1 %0" :: "r" (sdr1) );

+#ifdef __powerpc64__
+#define SLB_SIZE 64
+#else
+#define SLB_SIZE 16
+#endif
+#if 1//def __powerpc64__
+#if 1
+       /* Initialize SLBs */
+       for (i = 0; i < SLB_SIZE; i++) {
+               unsigned long rs = (i << 12) | (0 << 7);
+               unsigned long rb = ((unsigned long)i << 28) | (0 << 27) | i;
+               asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
+       }
+       /* Invalidate SLBs */
+       for (i = 1; i < SLB_SIZE; i++) {
+               unsigned long rb = ((unsigned long)i << 28) | (0 << 27);
+               asm volatile("slbie %0" :: "r" (rb) : "memory");
+       }
+#endif
+       /* Set SLBs */
+       for (i = 0; i < 16; i++) {
+               unsigned long rs = ((0x400 + i) << 12) | (0x10 << 7);
+               unsigned long rb = ((unsigned long)i << 28) | (1 << 27) | i;
+               asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
PPC32 doesn't have an SLB, only SRs :). So there you still need mtsrin (or mtsr).
Thanks for the reminder. Can we agree then that OpenBIOS/ppc64 only needs to care about SLBs? What I've been testing here though is that OpenBIOS/ppc on ppc64-softmmu doesn't break through my changes. And on ppc unsigned long is 32 bits only. :) Obviously needs cleanup, just an RFC.
Well, then why don't you leave it at the mtsrin code path? If you're running without MSR_SF you don't need any segments higher than 0xf :).
I'd be interested to hear if this code that I ported is really necessary or correct (first setting them up so that they don't refer to the same *SID, then explicitly invalidating them and then setting up the 16 ones we care about for real).
I don't think you need to go through all that effort. For starters, you can just use "slbia" to invalidate all SLB entries except for entry 0. That one you can just slbie manually. And on RESET you can usually assume that all segments are clear :).
I usually prefer to read code instead of specs when it comes to the PPC MMU. So if you like, check out kvm.git. The ppc64 mmu implementation is in arch/powerpc/kvm/book3s_64_mmu.c. Check out kvmppc_mmu_book3s_64_slbmte and you'll quickly see which bits belong where :).
Also, is SLB_SIZE 64 a universal number? This is from a document on the 970 MMU, and it's supposedly implementation-specific. Sounds scary.
It is implementation specific. I don't think it gets smaller than 64 though, so you're good.
Alex
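As a reading aid for the "Set SLBs" loop in the patch, the RS/RB operand values it passes to slbmte can be computed in isolation. This is a sketch that only mirrors the constants used in the patch; the struct and function names are hypothetical, and the flag value 0x10 << 7 is taken over from the patch without interpreting it.

```c
#include <assert.h>
#include <stdint.h>

#define SLB_ESID_VALID (1ull << 27)  /* valid bit in RB, as set by the patch */

/* The two register operands of "slbmte RS,RB". */
struct slb_entry {
    uint64_t rs;  /* VSID in the upper bits, flag bits below bit 12 */
    uint64_t rb;  /* ESID (EA bits above 28), valid bit, SLB index */
};

/* Entry for 256 MiB segment i, mirroring the patch's "Set SLBs" loop. */
struct slb_entry make_slb_entry(unsigned int i)
{
    struct slb_entry e;

    assert(i < 16);  /* the patch only maps the 16 segments below 4 GiB */
    e.rs = ((uint64_t)(0x400 + i) << 12) | (0x10u << 7);
    e.rb = ((uint64_t)i << 28) | SLB_ESID_VALID | i;
    return e;
}

void example_slb_entry(void)
{
    struct slb_entry first = make_slb_entry(0);
    struct slb_entry last = make_slb_entry(15);

    assert(first.rs == 0x400800ull && first.rb == 0x08000000ull);
    assert(last.rs == 0x40f800ull && last.rb == 0xf800000full);
}
```

Casting i to a 64-bit type before shifting by 28 matters on ppc64: with a plain int, segments 8 to 15 would end up with the sign bit set before the value is widened.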
Am 16.11.2010 um 01:00 schrieb Alexander Graf:
On 16.11.2010, at 00:52, Andreas Färber wrote:
Am 16.11.2010 um 00:41 schrieb Alexander Graf:
On 16.11.2010, at 00:39, Andreas Färber wrote:
Set up SLBs with slbmte instead of mtsrin, suggested by Alex. Adopt SLB example code from IBM application note.
Cc: Alexander Graf agraf@suse.de Signed-off-by: Andreas Färber andreas.faerber@web.de
 arch/ppc/qemu/ofmem.c |   35 ++++++++++++++++++++++++++++++++---
 1 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/arch/ppc/qemu/ofmem.c b/arch/ppc/qemu/ofmem.c
index e8b0b24..85b9956 100644
--- a/arch/ppc/qemu/ofmem.c
+++ b/arch/ppc/qemu/ofmem.c
@@ -393,7 +393,7 @@ void
 setup_mmu( unsigned long ramsize )
 {
        ofmem_t *ofmem;
-       unsigned long sdr1, sr_base, msr;
+       unsigned long sdr1, msr;
        unsigned long hash_base;
        unsigned long hash_mask = 0xfff00000; /* alignment for ppc64 */
        int i;
@@ -405,13 +405,42 @@ setup_mmu( unsigned long ramsize )
        sdr1 = hash_base | ((HASH_SIZE-1) >> 16);
        asm volatile("mtsdr1 %0" :: "r" (sdr1) );

+#ifdef __powerpc64__
+#define SLB_SIZE 64
+#else
+#define SLB_SIZE 16
+#endif
+#if 1//def __powerpc64__
+#if 1
+       /* Initialize SLBs */
+       for (i = 0; i < SLB_SIZE; i++) {
+               unsigned long rs = (i << 12) | (0 << 7);
+               unsigned long rb = ((unsigned long)i << 28) | (0 << 27) | i;
+               asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
+       }
+       /* Invalidate SLBs */
+       for (i = 1; i < SLB_SIZE; i++) {
+               unsigned long rb = ((unsigned long)i << 28) | (0 << 27);
+               asm volatile("slbie %0" :: "r" (rb) : "memory");
+       }
+#endif
+       /* Set SLBs */
+       for (i = 0; i < 16; i++) {
+               unsigned long rs = ((0x400 + i) << 12) | (0x10 << 7);
+               unsigned long rb = ((unsigned long)i << 28) | (1 << 27) | i;
+               asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
PPC32 doesn't have an SLB, only SRs :). So there you still need mtsrin (or mtsr).
Thanks for the reminder. Can we agree then that OpenBIOS/ppc64 only needs to care about SLBs? What I've been testing here though is that OpenBIOS/ppc on ppc64-softmmu isn't broken by my changes. And on ppc, unsigned long is only 32 bits. :) Obviously this needs cleanup; it's just an RFC.
Well, then why don't you leave it at the mtsrin code path? If you're running without MSR_SF you don't need any segments higher than 0xf :).
Like I said, I needed to *test* that my slbmte code actually works. :) Since ppc64 still doesn't boot any OS, I successfully tested ppc64-softmmu with 32-bit OpenBIOS. Not for HEAD, just a little short on time.
Also, is SLB_SIZE 64 a universal number? This is from a document on the 970 MMU, and it's supposedly implementation-specific. Sounds scary.
It is implementation specific. I don't think it gets smaller than 64 though, so you're good.
Hm, no, since the loop would then leave SLBs >= 64 in undetermined state. But if we can always use slbia it doesn't matter.
Andreas
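The point about MSR_SF can be checked with a little arithmetic: with 32-bit effective addresses and 256 MiB segments (the << 28 in the patch), the segment number is just the top four address bits, so only segments 0x0 to 0xf can ever be referenced. A small sketch, with hypothetical helper names that do not exist in OpenBIOS:

```c
#include <stdint.h>

/* Segments are 256 MiB, so the effective segment ID is EA >> 28. */
#define SEGMENT_SHIFT 28

/* Hypothetical helper: segment number of a 32-bit effective address. */
static inline unsigned int esid_of_ea32(uint32_t ea)
{
    return ea >> SEGMENT_SHIFT;
}

/* Number of distinct segments reachable with 32-bit effective addresses. */
static inline unsigned int segments_in_32bit_mode(void)
{
    return (unsigned int)((UINT64_C(1) << 32) >> SEGMENT_SHIFT);
}
```

This is why both the mtsrin loop and the final slbmte loop stop at 16 entries.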
On 20.11.2010, at 17:28, Andreas Färber wrote:
Hm, no, since the loop would then leave SLBs >= 64 in undetermined state. But if we can always use slbia it doesn't matter.
If you do slbia before, SLB >= 64 are in determined state, namely invalid, thus not used. So everything's great :).
Alex
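The argument above can be illustrated with a toy host-side model of the SLB valid bits. This is only a sketch of the reasoning in this thread (slbia invalidates every entry except entry 0, and slbmte with the valid bit set makes the addressed entry valid); the type and function names are invented, and MODEL_SLB_MAX is an arbitrary bound for the model, not a hardware constant.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_SLB_MAX 256   /* arbitrary bound for the model */

struct slb_model {
    size_t size;                 /* implementation-specific entry count */
    bool valid[MODEL_SLB_MAX];
};

/* Start from an unknown state: pessimistically mark every entry valid. */
static void model_reset_unknown(struct slb_model *m, size_t size)
{
    m->size = size;
    for (size_t i = 0; i < size; i++)
        m->valid[i] = true;
}

/* slbia as described in the thread: all entries except entry 0 become invalid. */
static void model_slbia(struct slb_model *m)
{
    for (size_t i = 1; i < m->size; i++)
        m->valid[i] = false;
}

/* slbmte with the valid bit set: the addressed entry becomes valid. */
static void model_slbmte(struct slb_model *m, size_t index)
{
    if (index < m->size)
        m->valid[index] = true;
}

/* The sequence discussed here: slbia, then write entries 0..15. */
static void model_setup(struct slb_model *m)
{
    model_slbia(m);
    for (size_t i = 0; i < 16; i++)
        model_slbmte(m, i);
}

/* Count valid entries at or above index 'from'. */
static size_t model_count_valid_from(const struct slb_model *m, size_t from)
{
    size_t n = 0;
    for (size_t i = from; i < m->size; i++)
        n += m->valid[i] ? 1 : 0;
    return n;
}
```

In this model the result does not depend on the SLB size: whether the size is 64 or 128, exactly the 16 written entries are valid afterwards, so the code never needs to know SLB_SIZE.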
Set up SLBs with slbmte instead of mtsrin, suggested by Alex.
v2:
* Don't initialize 64 SLBs, then invalidate them, as in IBM's application note for the 970. Use slbia instead, recommended by Alex.
* Conditionalize when to use SLB or SR.
Cc: Alexander Graf agraf@suse.de
Signed-off-by: Andreas Färber andreas.faerber@web.de
---
 arch/ppc/qemu/ofmem.c |   25 ++++++++++++++++++++++---
 1 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/arch/ppc/qemu/ofmem.c b/arch/ppc/qemu/ofmem.c
index 72694b3..24f3a25 100644
--- a/arch/ppc/qemu/ofmem.c
+++ b/arch/ppc/qemu/ofmem.c
@@ -387,7 +387,7 @@ void
 setup_mmu( unsigned long ramsize )
 {
     ofmem_t *ofmem;
-    unsigned long sdr1, sr_base;
+    unsigned long sdr1;
     unsigned long hash_base;
     unsigned long hash_mask = 0xfff00000; /* alignment for ppc64 */
     int i;
@@ -399,13 +399,32 @@ setup_mmu( unsigned long ramsize )
     sdr1 = hash_base | ((HASH_SIZE-1) >> 16);
     asm volatile("mtsdr1 %0" :: "r" (sdr1) );
+#if defined(__powerpc64__) || defined(CONFIG_PPC_64BITSUPPORT)
+#ifdef CONFIG_PPC_64BITSUPPORT
+    if (is_ppc64()) {
+#endif
+    /* Segment Lookaside Buffer */
+    asm volatile("slbia" ::: "memory");
+    for (i = 0; i < 16; i++) {
+        unsigned long rs = ((0x400 + i) << 12) | (0x10 << 7);
+        unsigned long rb = ((unsigned long)i << 28) | (1 << 27) | i;
+        asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
+    }
+    asm volatile("isync" ::: "memory");
+#ifdef CONFIG_PPC_64BITSUPPORT
+    } else
+#endif
+#endif
+#ifndef __powerpc64__
     /* Segment Register */
-
-    sr_base = SEGR_USER | SEGR_BASE ;
+    {
+    unsigned long sr_base = SEGR_USER | SEGR_BASE ;
     for( i=0; i<16; i++ ) {
         int j = i << 28;
         asm volatile("mtsrin %0,%1" :: "r" (sr_base + i), "r" (j) );
     }
+    }
+#endif
     ofmem = ofmem_arch_get_private();
     memset(ofmem, 0, sizeof(ofmem_t));
On 21.11.2010, at 19:53, Andreas Färber wrote:
+#if defined(__powerpc64__) || defined(CONFIG_PPC_64BITSUPPORT)
+#ifdef CONFIG_PPC_64BITSUPPORT
Phew - too much ifdef for my taste. How about the idea I mentioned in the mail before to just make is_ppc64 return always 0 on ppc32 hosts without compat config option? Then you could also protect the slbia and slbmte and whatever ppc64 specific pieces with #ifdefs but not care about the rest :). Would hopefully make this code a lot more readable!
+    if (is_ppc64()) {
+#endif
+    /* Segment Lookaside Buffer */
+    asm volatile("slbia" ::: "memory");
Inline function please :)
+    for (i = 0; i < 16; i++) {
+        unsigned long rs = ((0x400 + i) << 12) | (0x10 << 7);
+        unsigned long rb = ((unsigned long)i << 28) | (1 << 27) | i;
+        asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
Inline function again
+    }
+    asm volatile("isync" ::: "memory");
And this would be awesome to get as inline function too! :)
Alex
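A rough sketch of the "is_ppc64 always returns 0 on ppc32" idea, written so that it compiles on any host: the PVR is passed in as a parameter instead of being read with mfspr, and the range test is the one is_ppc64() in ofmem.c uses. SKETCH_MAY_RUN_ON_PPC64 and both function names are invented for this example and are not part of OpenBIOS.

```c
/* Hypothetical switch: set when the build may run on a 64-bit CPU. */
#if defined(__powerpc64__) || defined(CONFIG_PPC_64BITSUPPORT)
#define SKETCH_MAY_RUN_ON_PPC64 1
#else
#define SKETCH_MAY_RUN_ON_PPC64 0
#endif

/* The PVR range test from is_ppc64() in ofmem.c, as a pure function. */
static inline int pvr_in_ppc64_range(unsigned int pvr)
{
    return (pvr >= 0x330000) && (pvr < 0x70330000);
}

/* Always 0 on 32-bit-only builds, so callers need no #ifdef of their own. */
static inline int is_ppc64_sketch(unsigned int pvr)
{
    if (!SKETCH_MAY_RUN_ON_PPC64)
        return 0;
    return pvr_in_ppc64_range(pvr);
}
```

A caller can then write a plain if (is_ppc64_sketch(pvr)) { ... } else { ... }, and the compiler can drop the dead branch on 32-bit-only builds; only the helpers that actually emit slbia or slbmte would still need their own guard.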
Am 21.11.2010 um 20:14 schrieb Alexander Graf:
+    for (i = 0; i < 16; i++) {
+        unsigned long rs = ((0x400 + i) << 12) | (0x10 << 7);
+        unsigned long rb = ((unsigned long)i << 28) | (1 << 27) | i;
+        asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
Inline function again
I don't see the advantage here, since these two are used nowhere else. But I'll do it.
+    }
+    asm volatile("isync" ::: "memory");
And this would be awesome to get as inline function too! :)
Is this one necessary at all?
Andreas
On 21.11.2010, at 20:19, Andreas Färber wrote:
+    for (i = 0; i < 16; i++) {
+        unsigned long rs = ((0x400 + i) << 12) | (0x10 << 7);
+        unsigned long rb = ((unsigned long)i << 28) | (1 << 27) | i;
+        asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
Inline function again
I don't see the advantage here, since these two are used nowhere else. But I'll do it.
Yeah, it just makes the code easier to read. Sorry to make you write more code :). We can also later on do things like
static inline void slbmte(unsigned long rs, unsigned long rb)
{
#if defined(CONFIG_PPC64) || defined(CONFIG_PPC64_COMPAT)
    asm volatile("slbmte %0,%1" :: "r" (rs), "r" (rb) : "memory");
#endif
}
at which point the assembler doesn't have to know about ppc64 instructions unless we're building to possibly run on ppc64 :).
+    }
+    asm volatile("isync" ::: "memory");
And this would be awesome to get as inline function too! :)
Is this one necessary at all?
If we're running the code in IR=1 and are modifying any slb entry that might be related to the segment we're running in, then yes.
I suspect we're in real mode here though, so the isync should happen before the mtmsr(mfmsr() | MSR_IR) or through an rfi which would be context synchronizing again :).
Alex
Am 21.11.2010 um 22:15 schrieb Alexander Graf:
Is this one necessary at all?
If we're running the code in IR=1 and are modifying any slb entry that might be related to the segment we're running in, then yes.
I suspect we're in real mode here though, so the isync should happen before the mtmsr(mfmsr() | MSR_IR) or through an rfi which would be context synchronizing again :).
Well, luckily the inline functions no longer allow us to do just that. ;) I'd like to get it out because SDR1 now conflicts with SLB and needs to be rebased against it. Also, I'm waiting for these patches and possibly Mark's to do a reindentation of ofmem.c.
There are lots of improvements we could still make in general, but I'd rather postpone that until we get to a ppc64 Forth prompt!
Andreas
On 21.11.2010, at 22:30, Andreas Färber wrote:
Is this one necessary at all?
If we're running the code in IR=1 and are modifying any slb entry that might be related to the segment we're running in, then yes.
I suspect we're in real mode here though, so the isync should happen before the mtmsr(mfmsr() | MSR_IR) or through an rfi which would be context synchronizing again :).
Well, luckily the inline functions no longer allow us to do just that. ;) I'd like to get it out because SDR1 now conflicts with SLB and needs to be rebased against it. Also, I'm waiting for these patches and possibly Mark's to do a reindentation of ofmem.c.
There are lots of improvements we could still make in general, but I'd rather postpone that until we get to a ppc64 Forth prompt!
Yeah, as long as you're running in emulation or kvm, you're safe either way. No need for isync there. On real hardware, it's good to have it around.
Alex
+#if defined(__powerpc64__) || defined(CONFIG_PPC_64BITSUPPORT)
+#ifdef CONFIG_PPC_64BITSUPPORT
Phew - too much ifdef for my taste. How about the idea I mentioned in the mail before to just make is_ppc64 return always 0 on ppc32 hosts without compat config option? Then you could also protect the slbia and slbmte and whatever ppc64 specific pieces with #ifdefs but not care about the rest :). Would hopefully make this code a lot more readable!
Yeah. Factor out most stuff so you have a single #ifdef for them :-)
+    asm volatile("isync" ::: "memory");
And this would be awesome to get as inline function too! :)
Too bad that won't work. It will not work as written either -- the asm can still be moved around relative to some stuff.
Put the isync in the same asm that it is syncing for.
Oh, btw, slbia does not invalidate all SLBs (it doesn't invalidate #0). It probably doesn't matter here, since you write #0 soon enough, and you cannot have an exception happen before that. Also, the slbia does invalidate the ERATs. But you might want to check this, and/or add a comment.
Segher
Am 21.11.2010 um 20:24 schrieb Segher Boessenkool:
+#if defined(__powerpc64__) || defined(CONFIG_PPC_64BITSUPPORT)
+#ifdef CONFIG_PPC_64BITSUPPORT
Phew - too much ifdef for my taste. How about the idea I mentioned in the mail before to just make is_ppc64 return always 0 on ppc32 hosts without compat config option? Then you could also protect the slbia and slbmte and whatever ppc64 specific pieces with #ifdefs but not care about the rest :). Would hopefully make this code a lot more readable!
Yeah. Factor out most stuff so you have a single #ifdef for them :-)
Actually this patch is designed for the mid-term goal of dropping legacy 64-bit support, so any CONFIG_PPC_64BITSUPPORT code path is supposed to go away. Creating new defines that can't just be stripped in order to drop it would be bad.
+    asm volatile("isync" ::: "memory");
And this would be awesome to get as inline function too! :)
Too bad that won't work. It will not work as written either -- the asm can still be moved around relative to some stuff.
Put the isync in the same asm that it is syncing for.
You mean, unroll the loop into one asm volatile() or do the isync in every iteration? (Are you implying it is necessary here?)
Oh, btw, slbia does not invalidate all SLBs (it doesn't invalidate #0). It probably doesn't matter here, since you write #0 soon enough, and you cannot have an exception happen before that. Also, the slbia does invalidate the ERATs. But you might want to check this, and/or add a comment.
We did think of the initial SLB, and my loop starts at #0. Never heard of ERATs before, are they bad for us or just an info?
Andreas
Set up SLBs with slbmte instead of mtsrin, suggested by Alex.
v3:
* Continue to use mtsrin on ppc for simplicity.
* Add comment on slbia, suggested by Segher.
* Add inline functions {slbia,slbmte}, requested by Alex.
* Add inline function mfpvr before Alex asks for it. :)
v2:
* Don't initialize 64 SLBs, then invalidate them, as in IBM's application note for the 970. Use slbia instead, recommended by Alex.
* Conditionalize when to use SLB or SR.
Cc: Alexander Graf agraf@suse.de
Cc: Segher Boessenkool segher@kernel.crashing.org
Signed-off-by: Andreas Färber andreas.faerber@web.de
---
 arch/ppc/qemu/ofmem.c        |   24 ++++++++++++++++++++----
 include/arch/ppc/processor.h |   17 +++++++++++++++++
 2 files changed, 37 insertions(+), 4 deletions(-)
diff --git a/arch/ppc/qemu/ofmem.c b/arch/ppc/qemu/ofmem.c
index 72694b3..e871623 100644
--- a/arch/ppc/qemu/ofmem.c
+++ b/arch/ppc/qemu/ofmem.c
@@ -336,9 +336,7 @@ hash_page_32( ucell ea, ucell phys, ucell mode )
 static int is_ppc64(void)
 {
-    unsigned int pvr;
-    asm volatile("mfspr %0, 0x11f" : "=r" (pvr) );
-
+    unsigned int pvr = mfpvr();
     return ((pvr >= 0x330000) && (pvr < 0x70330000));
 }
@@ -387,7 +385,10 @@ void
 setup_mmu( unsigned long ramsize )
 {
     ofmem_t *ofmem;
-    unsigned long sdr1, sr_base;
+    unsigned long sdr1;
+#ifndef __powerpc64__
+    unsigned long sr_base;
+#endif
     unsigned long hash_base;
     unsigned long hash_mask = 0xfff00000; /* alignment for ppc64 */
     int i;
@@ -399,6 +400,19 @@ setup_mmu( unsigned long ramsize )
     sdr1 = hash_base | ((HASH_SIZE-1) >> 16);
     asm volatile("mtsdr1 %0" :: "r" (sdr1) );
+#ifdef __powerpc64__
+
+    /* Segment Lookaside Buffer */
+
+    slbia(); /* Invalidate all SLBs except SLB 0 */
+    for (i = 0; i < 16; i++) {
+        unsigned long rs = ((0x400 + i) << 12) | (0x10 << 7);
+        unsigned long rb = ((unsigned long)i << 28) | (1 << 27) | i;
+        slbmte(rs, rb);
+    }
+
+#else
+
     /* Segment Register */
     sr_base = SEGR_USER | SEGR_BASE ;
@@ -407,6 +421,8 @@ setup_mmu( unsigned long ramsize )
         asm volatile("mtsrin %0,%1" :: "r" (sr_base + i), "r" (j) );
     }
+#endif
+
     ofmem = ofmem_arch_get_private();
     memset(ofmem, 0, sizeof(ofmem_t));
     ofmem->ramsize = ramsize;
diff --git a/include/arch/ppc/processor.h b/include/arch/ppc/processor.h
index c7d5be6..c2f1284 100644
--- a/include/arch/ppc/processor.h
+++ b/include/arch/ppc/processor.h
@@ -425,6 +425,23 @@ static inline void mtmsr(unsigned long msr)
 #endif
 }
+static inline unsigned int mfpvr(void)
+{
+    unsigned int pvr;
+    asm volatile("mfspr %0, 0x11f" : "=r" (pvr) );
+    return pvr;
+}
+
+static inline void slbia(void)
+{
+    asm volatile("slbia" ::: "memory");
+}
+
+static inline void slbmte(unsigned long rs, unsigned long rb)
+{
+    asm volatile("slbmte %0,%1 ; isync" :: "r" (rs), "r" (rb) : "memory");
+}
+
 #endif /* !__ASSEMBLER__ */
 #endif /* _H_PROCESSOR */
On 21.11.2010, at 21:51, Andreas Färber wrote:
Set up SLBs with slbmte instead of mtsrin, suggested by Alex.
Not tested, but if it works:
Signed-off-by: Alexander Graf agraf@suse.de
Alex
Am 21.11.2010 um 22:24 schrieb Alexander Graf:
Not tested, but if it works:
Signed-off-by: Alexander Graf agraf@suse.de
It did before removing the #ifdef'ery. In v3 ppc continues to work.
Thanks, applied as r966.
Andreas
Actually this patch is designed for the mid-term goal of dropping legacy 64-bit support, so any CONFIG_PPC_64BITSUPPORT code path is supposed to go away. Creating new defines that can't just be stripped in order to drop it would be bad.
I'm not saying you should do it right now; just that it seems to be quickly becoming a big mess of #ifdefs.
+    asm volatile("isync" ::: "memory");
And this would be awesome to get as inline function too! :)
Too bad that won't work. It will not work as written either -- the asm can still be moved around relative to some stuff.
Put the isync in the same asm that it is syncing for.
You mean, unroll the loop into one asm volatile() or do the isync in every iteration? (Are you implying it is necessary here?)
You can do it in every iteration, it's not like it will really hurt performance. I haven't looked to see if it is necessary at all.
Oh, btw, slbia does not invalidate all SLBs (it doesn't invalidate #0). It probably doesn't matter here, since you write #0 soon enough, and you cannot have an exception happen before that. Also, the slbia does invalidate the ERATs. But you might want to check this, and/or add a comment.
We did think of the initial SLB, and my loop starts at #0. Never heard of ERATs before, are they bad for us or just an info?
ERATs are an implementation detail. The CPU only ever translates effective addresses to real addresses in one step: by looking it up in the ERATs. If that misses, it uses the SLBs and TLBs and the HTAB.
So, the ERATs need to be invalidated when you change some translation; most SLB and TLB insns do that. You never need to worry about the ERAT if you "simply" follow all rules in the architecture.
Segher
On Sun, Nov 7, 2010 at 10:13 PM, Andreas Färber andreas.faerber@web.de wrote:
Am 01.11.2010 um 17:36 schrieb Andreas Färber:
Latest state with local patches is that hell breaks loose once the MMU is set up. I get a 0x400 (ISI) exception and when the bctrl to isi_exception() is executed, we end up at trap_error, where it branches to unexpected_excep() and tries to printk() to the serial port that's not yet set up. I'll put a few patches together.
Since r945 everything except for the trampoline issue should be in SVN.
For that, I'd suggest adding an explicit initialization function (forth_init()) to set up the trampoline. This would be called from arch/*/openbios.c etc.
On Mon, Nov 1, 2010 at 12:33 AM, Andreas Färber andreas.faerber@web.de wrote:
Am 08.10.2010 um 15:30 schrieb Alexander Graf:
You can use the apple gdb without an object file, so you don't get symbols. But if you have an instruction pointer, just
$ qemu-system-ppc -s -S ...
(gdb) target remote localhost:1234
(gdb) b *0x1234    <- address of rtas_something
(gdb) c
It should break on that IP and then you can evaluate the register contents at least. Either by
(gdb) info registers
or
(qemu) info registers
I'm trying to find out how far we get with the ppc64 OpenBIOS, so I've tried the following:
$ .../ppc64-softmmu/qemu-system-ppc64 ... -nographic -prom-env 'auto-boot?=false' -s -S
$ gdb --arch=ppc64
GNU gdb 6.3.50-20050815 (Apple version gdb-967) (Tue Jul 14 02:15:14 UTC 2009)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "powerpc-apple-darwin".
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
[New thread 1]
0x0000000000000000 in ?? ()
(gdb) b *0xfffffffc
Breakpoint 1 at 0xfffffffc
(gdb) c
Continuing.
It doesn't break though and executes to the OpenBIOS prompt. 0xfffffffc is supposed to be the hard reset vector, i.e. the very first instruction it must execute to branch to _entry.
Any suggestions?
Maybe try "stepi" instead of "c" to verify that execution really starts at 0xfffffffc? Breakpoints are unreliable in qemu. Traps in particular are really hard to debug (at least in the sparc port).