L2 Cache Settings

James Lyons

22 Jan 2018

11:50 p.m.

Looks like for the 7447a we set SPR 1017( L2CR ) bit 0 ( L2E ) or just write 0x8000000 to the reg.

Not sure how we do this in Openbios.

Programmingkid

23 Jan

12:35 a.m.

...

On Jan 22, 2018, at 9:50 PM, James Lyons via OpenBIOS openbios@openbios.org wrote:

Looks like for the 7447a we set SPR 1017( L2CR ) bit 0 ( L2E ) or just write 0x8000000 to the reg.

Not sure how we do this in Openbios.

This would be done in PowerPC assembly. A C function can contain the inline assembly code.

It would look something like this:

void setup_PPC_7447a(void) { asm volatile("addis r10, 0, 0x8000"); asm volatile("mtspr 1017, r10"); }

Segher Boessenkool

6:57 p.m.

On Mon, Jan 22, 2018 at 10:35:17PM -0500, Programmingkid wrote:

...

...
On Jan 22, 2018, at 9:50 PM, James Lyons via OpenBIOS openbios@openbios.org wrote:

Looks like for the 7447a we set SPR 1017( L2CR ) bit 0 ( L2E ) or just write 0x8000000 to the reg.

Not sure how we do this in Openbios.

This would be done in PowerPC assembly. A C function can contain the inline assembly code.

It would look something like this:

void setup_PPC_7447a(void) { asm volatile("addis r10, 0, 0x8000"); asm volatile("mtspr 1017, r10"); }

That is incorrect asm. You cannot use r10, it can already be in use, and you cannot pass a reg between two separate asm statements like this.

You meant something like

void enable_L2_7447A(void) { asm("mtspr %0,%1" : : "n"(1017), "r"(0x80000000)); }

(but see my other mail, this is not enough to enable L2).

Segher

Programmingkid

24 Jan

11:47 a.m.

...

On Jan 23, 2018, at 4:57 PM, Segher Boessenkool segher@kernel.crashing.org wrote:

On Mon, Jan 22, 2018 at 10:35:17PM -0500, Programmingkid wrote:

...
...
On Jan 22, 2018, at 9:50 PM, James Lyons via OpenBIOS openbios@openbios.org wrote:

Looks like for the 7447a we set SPR 1017( L2CR ) bit 0 ( L2E ) or just write 0x8000000 to the reg.

Not sure how we do this in Openbios.

This would be done in PowerPC assembly. A C function can contain the inline assembly code.

It would look something like this:

void setup_PPC_7447a(void) { asm volatile("addis r10, 0, 0x8000"); asm volatile("mtspr 1017, r10"); }

That is incorrect asm. You cannot use r10, it can already be in use, and you cannot pass a reg between two separate asm statements like this.

You meant something like

void enable_L2_7447A(void) { asm("mtspr %0,%1" : : "n"(1017), "r"(0x80000000)); }

I would change "asm" to "asm volatile" to prevent the compiler from optimizing this code out.

Segher Boessenkool

3:21 p.m.

On Wed, Jan 24, 2018 at 09:47:13AM -0500, Programmingkid wrote:

...

...
On Jan 23, 2018, at 4:57 PM, Segher Boessenkool segher@kernel.crashing.org wrote:

You meant something like

void enable_L2_7447A(void) { asm("mtspr %0,%1" : : "n"(1017), "r"(0x80000000)); }

I would change "asm" to "asm volatile" to prevent the compiler from optimizing this code out.

The asm has no outputs so it is already volatile.

Segher

Jd Lyons

25 Jan

1:09 p.m.

...

On Jan 24, 2018, at 1:21 PM, Segher Boessenkool segher@kernel.crashing.org wrote:

On Wed, Jan 24, 2018 at 09:47:13AM -0500, Programmingkid wrote:

...
...
On Jan 23, 2018, at 4:57 PM, Segher Boessenkool segher@kernel.crashing.org wrote:

You meant something like

void enable_L2_7447A(void) { asm("mtspr %0,%1" : : "n"(1017), "r"(0x80000000)); }

I would change "asm" to "asm volatile" to prevent the compiler from optimizing this code out.

The asm has no outputs so it is already volatile.

Segher

-- OpenBIOS http://openbios.org/ Mailinglist: http://lists.openbios.org/mailman/listinfo Free your System - May the Forth be with you

Segher, if you have some of the old white papers on the CPUs that shipped in Macs, or the upgrades offered by third parties, I’d like to get a look at them, if you’re not under NDA.

Would be interesting to know what instructions each supports, and the L2 L3 cache settings and info.

It’s not clear to me what qemu-system-ppc emulates, surely the L1 cache.

Obviously, Open Firmware has an entry in the device tree for the L2 cache; when I have my Quicksilver I’ll be able to report the L3 cache settings and info in Open Firmware.

So if we can figure out a graceful way of adding a correct entry to the device tree in OpenBIOS for the L2/L3 cache, that would be optimal.

Segher Boessenkool

3:38 p.m.

On Thu, Jan 25, 2018 at 11:09:31AM -0500, Jd Lyons wrote:

...

Segher, if you have some of the old white papers on the CPUs that shipped in Macs, or the upgrades offered by third parties, I’d like to get a look at them, if you’re not under NDA.

You can download this CPU documentation from NPX (who bought it from FSL, and before it was Motorola). Those are good docs.

...

Would be interesting to know what instructions each supports, and the L2 L3 cache settings and info.

External cache settings needed depend on the board used, the exact cache chips used, etc.

...

Obviously, Open Firmware has an entry in the device tree for the L2 cache; when I have my Quicksilver I’ll be able to report the L3 cache settings and info in Open Firmware.

L3 is represented exactly like L2 in the device tree. Settings needed depend on the system. For internal caches there usually isn't very much to configure, so that is much easier. 7447A has internal L2 and cannot have L3. But for example older 750 has external L2, and you can do a whole bunch of settings there (overclock the L2, etc. :-) )

Segher

Jd Lyons

26 Jan

7:04 a.m.

...

On Jan 25, 2018, at 1:38 PM, Segher Boessenkool segher@kernel.crashing.org wrote:

On Thu, Jan 25, 2018 at 11:09:31AM -0500, Jd Lyons wrote:

...
Segher, if you have some of the old white papers on the CPUs that shipped in Macs, or the upgrades offered by third parties, I’d like to get a look at them, if you’re not under NDA.

You can download this CPU documentation from NPX (who bought it from FSL, and before it was Motorola). Those are good docs.

Thanks. I did a few half-hearted Google searches that didn’t yield the docs; I figured they were still around somewhere. Sometimes Google is obtuse; it’s like the computer from the Hitchhiker’s Guide: it matters how you ask the question.

...

...
Would be interesting to know what instructions each supports, and the L2 L3 cache settings and info.

External cache settings needed depend on the board used, the exact cache chips used, etc.

...
Obviously, Open Firmware has an entry in the device tree for the L2 cache; when I have my Quicksilver I’ll be able to report the L3 cache settings and info in Open Firmware.

L3 is represented exactly like L2 in the device tree. Settings needed depend on the system. For internal caches there usually isn't very much to configure, so that is much easier. 7447A has internal L2 and cannot have L3. But for example older 750 has external L2, and you can do a whole bunch of settings there (overclock the L2, etc. :-) )

Segher

I wonder, specifically regarding emulating caches, whether that yields any performance increase. Just as a layman, I have a basic understanding of CPUs and some understanding of caches. I remember having an old Power Mac 8600 with a Sonnet G3 450; having the L2 cache enabled and overclocking it within the constraints of the RAM used for the external cache could yield a very worthwhile performance increase.

I’m collecting some of these old PPC Macs while they are dirt cheap, before people just toss them on the scrap heap because no one wants to pay the shipping, mainly to try and make QEMU a better PPC emulator. No matter how well built some of this old hardware was, it will all fail given enough time, and it is limited by the physical constraints of the components used. Of course, the goal for me is running old software, things like the earlier versions of iMovie that weren’t so complex as to confuse the user with feature bloat.

With an emulated CPU, we’re only really limited by how well the code is written to take advantage of the host CPU. 32 bit PPC is never going to get any faster, barring some unforeseen development, but x86 and Arm continue to push Moore’s Law, so running PPC in emulation will continue to see performance increases for software that is limited by clock cycles.

With qemu-ppc I find the integer performance very well maintained; it seems to scale well with the host CPU, though the FPU and the Vec units are woeful and need a lot of work. SheepShaver hands down outperforms QEMU in FPU and Vec calculations, though the code became unmaintainable for the developers, or so I’ve been told. I’ve looked at it to see if any of the code could be ported to QEMU to help increase FPU and Vec performance, but the code seems spread all over the place, and it’s hard to figure out how all the pieces fit together.

It seems like it could be a challenge to emulate caches, I’ll have to dig deeper into qemu and see how the issue is dealt with.

Segher Boessenkool

10 p.m.

On Fri, Jan 26, 2018 at 05:04:02AM -0500, Jd Lyons wrote:

...

...
On Jan 25, 2018, at 1:38 PM, Segher Boessenkool segher@kernel.crashing.org wrote:

On Thu, Jan 25, 2018 at 11:09:31AM -0500, Jd Lyons wrote:

...
Segher, if you have some of the old white papers on the CPUs that shipped in Macs, or the upgrades offered by third parties, I’d like to get a look at them, if you’re not under NDA.

You can download this CPU documentation from NPX (who bought it from FSL, and before it was Motorola). Those are good docs.

Thanks. I did a few half-hearted Google searches that didn’t yield the docs; I figured they were still around somewhere. Sometimes Google is obtuse; it’s like the computer from the Hitchhiker’s Guide: it matters how you ask the question.

(It probably does not help that I mistyped NXP).

7447a filetype:pdf is an easy way to find it (you want the "reference manual" most).

...

I wonder, specifically regarding emulating caches, whether that yields any performance increase.

Probably not. Your host's memory accesses do not get any faster, and the increased bookkeeping will not help.

...

32 bit PPC is never going to get any faster,

Most (all?) 64-bit PowerPC cores can run 32-bit code just fine.

...

With qemu-ppc I find the integer performance very well maintained; it seems to scale well with the host CPU, though the FPU and the Vec units are woeful and need a lot of work.

The FPU can not be cheaply emulated at all on x86 (probably not on Arm either, not sure). The most obvious thing that cannot be cheaply emulated is fmadd (fused multiply add).

Some vector operations are hard to do, too (in addition to the float ones).
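The fmadd point can be made concrete with a small C sketch. A fused multiply-add rounds once, after the addition, while a separate multiply and add rounds twice, so an emulator cannot simply issue the two host operations and expect identical bits. The function names below are illustrative only and are not taken from QEMU. The wide-precision helper relies on the product of two floats being exact in a double, so it is only a stand-in for single-precision inputs such as these; there is no equally simple trick for double-precision fmadd.

```c
/* Multiply, then add, rounding to float after each step (two roundings). */
float mul_add_unfused(float a, float b, float c)
{
    volatile float product = a * b;  /* volatile forces the product to be rounded to float */
    return product + c;
}

/* Multiply-add with the product kept exact: the product of two floats
 * fits in a double without rounding, so precision is only lost after
 * the addition. This is an illustration, not a general fmadd emulation. */
float mul_add_wide(float a, float b, float c)
{
    return (float)((double)a * (double)b + (double)c);
}
```

With a = b = 0x1.001p0f and c = -0x1.002p0f, the exact product is 1 + 2^-11 + 2^-24. The unfused version loses the last term when the product is rounded and returns 0, while the wide version returns 2^-24.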

...

It seems like it could be a challenge to emulate caches, I’ll have to dig deeper into qemu and see how the issue is dealt with.

Well, it probably should at least make the L2CR etc. registers work as expected; they do not necessarily actually have to *do* anything, esp. not if you only want to emulate the architectural state (so don't care about for example emulating cache invalidate correctly where that is undefined behaviour in the architecture, etc.)

Segher
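One possible reading of "make the L2CR registers work as expected" is sketched below in C: the emulated register remembers what the guest wrote, but an invalidate request is reported as already finished because there is no real cache behind it. The structure and function names are hypothetical and do not correspond to QEMU's actual code; the 0x00200000 mask is the invalidate bit that the Forth snippet in this thread polls.

```c
#include <stdint.h>

#define L2CR_L2I 0x00200000u  /* invalidate bit polled in this thread's Forth code */

/* Hypothetical emulated CPU state; not QEMU's real structure. */
struct cpu_model {
    uint32_t l2cr;
};

/* Guest mtspr to L2CR: store the value, but treat any requested
 * invalidate as complete, since no cache contents exist to flush. */
void model_write_l2cr(struct cpu_model *cpu, uint32_t value)
{
    cpu->l2cr = value & ~L2CR_L2I;
}

/* Guest mfspr from L2CR: return the stored value. */
uint32_t model_read_l2cr(const struct cpu_model *cpu)
{
    return cpu->l2cr;
}
```

A guest that writes 40200000 and then polls for the invalidate bit to clear would see it clear on the first read and carry on.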

Jd Lyons

30 Jan

10:03 a.m.

Seems I’m back to this, for some reason BootX or mach_kernel is:

...

When I try to boot OS X in qemu-system-ppc with -enable-kvm -cpu host, Apple’s BootX hangs at Call Kernel!

I’m using a PowerBook6,8 with a 7447a v1.5 and Kernel 4.4.111.

When BootX hangs, I get this in dmesg:

kvmppc_handle_exit_pr: emulation at 92808 failed ( 7dbafaa6 )

7dbafaa6 is mfspr r13,1018, and SPR 1018 on the 7447A is L3CR, the level 3 cache control register. The boot code is presumably wanting to do something to the L3 cache configuration.
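The word 7dbafaa6 from the dmesg line can be checked by hand against the mfspr encoding: primary opcode 31, extended opcode 339, and an SPR field whose two 5-bit halves are stored swapped. The following C sketch does that decoding; the function name is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode a 32-bit PowerPC instruction word as "mfspr rT,SPR".
 * Returns false if the word is not an mfspr. */
bool decode_mfspr(uint32_t insn, unsigned *rt, unsigned *spr)
{
    unsigned opcode = insn >> 26;            /* primary opcode, 31 for mfspr */
    unsigned xo     = (insn >> 1) & 0x3ff;   /* extended opcode, 339 for mfspr */
    unsigned field  = (insn >> 11) & 0x3ff;  /* SPR number with its 5-bit halves swapped */

    if (opcode != 31 || xo != 339)
        return false;

    *rt  = (insn >> 21) & 0x1f;
    *spr = ((field & 0x1f) << 5) | (field >> 5);
    return true;
}
```

Decoding 0x7dbafaa6 this way gives rT = 13 and SPR = 1018, matching the explanation above.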

...

Can anyone help explain what this error means, so I can figure out what is triggering it and hopefully fix the issue?

You could try adding code to kvmppc_core_emulate_mfspr_pr() and kvmppc_core_emulate_mtspr_pr() to implement it as a dummy SPR that ignores writes and returns zero when read. That should work unless the BootX code does something like waiting for some bit to come back as a 1.

Paul.
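As a rough illustration of what "a dummy SPR that ignores writes and returns zero when read" means, here is a standalone C model. It is not kernel code: the real KVM functions named above have their own signatures and return codes, which this sketch does not try to reproduce, and every identifier below is hypothetical.

```c
#include <stdint.h>

#define SPRN_L3CR_7447A 1018  /* SPR number named in this thread */

enum emul_result { EMUL_DONE, EMUL_FAIL };  /* hypothetical status codes */

/* Read handler: the dummy SPR always reads as zero. */
enum emul_result dummy_mfspr(int sprn, uint32_t *value)
{
    if (sprn != SPRN_L3CR_7447A)
        return EMUL_FAIL;  /* not ours; leave it to other handling */
    *value = 0;
    return EMUL_DONE;
}

/* Write handler: accept the write and discard the value. */
enum emul_result dummy_mtspr(int sprn, uint32_t value)
{
    (void)value;  /* intentionally ignored */
    return sprn == SPRN_L3CR_7447A ? EMUL_DONE : EMUL_FAIL;
}
```

As Paul notes, this only helps if the guest never waits for a bit in that register to read back as 1.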

I’m not really sure, meaning I have no idea how to do what Paul is suggesting, but I assume it means recompiling my kernel, as I didn’t build KVM as a module. Even then, I just don’t know how to create a dummy SPR.

I was hoping I could take a shortcut with some OpenBIOS hacks.

To start with, I want to create a child of the CPU, the L2 cache, and see if that is enough to get OS X to stop trying to access an L3 cache that doesn’t exist. I checked the L3CR, and it is, of course, not writable.

So the properties of the L2 I need to add are:

Under the CPU:

l2-cache /cpus/PowerPC,G4@0/l2-cache
l2cr 80000000

I tried:

dev /cpus/@0
" /cpus/PowerPC,G4@0/l2-cache" encode-string " l2-cache" property
80000000 encode-int " l2cr" property

Segher, this doesn’t seem quite right: it does create the l2-cache property, but the value is in quotes. I think what I need to do here is create an alias rather than a string?

The l2cr property seems correct.

Then I need to create a child of the CPU, but I’m not sure how to do that, Segher?

The child needs to have these properties:

name l2-cache
device-type cache
i-cache-size 00080000
d-cache-size 00080000
i-cache-sets 00000200
d-cache-sets 00000200
i-cache-line-size 00000040
d-cache-line-size 00000040
cache-unified
clock-frequency 59682efa
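For readers who do not think in hex, the property values above can be converted with a few lines of C. This assumes the values are plain hexadecimal integers, as listed; the constant and helper names are made up for the example.

```c
#include <stdint.h>

/* Values copied from the property list above, read as hexadecimal. */
static const uint32_t cache_size_bytes = 0x00080000u;  /* i-cache-size / d-cache-size */
static const uint32_t cache_sets       = 0x00000200u;  /* i-cache-sets / d-cache-sets */
static const uint32_t cache_line_bytes = 0x00000040u;  /* i-/d-cache-line-size */
static const uint32_t clock_hz         = 0x59682efau;  /* clock-frequency */

/* Convert a byte count to KiB. */
uint32_t bytes_to_kib(uint32_t bytes)
{
    return bytes / 1024u;
}

/* Convert Hz to MHz, rounded to the nearest MHz. */
uint32_t hz_to_mhz_rounded(uint32_t hz)
{
    return (hz + 500000u) / 1000000u;
}
```

That works out to a 512 KiB cache with 512 sets and 64-byte lines, and a clock-frequency of 1499999994 Hz, or about 1500 MHz.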

I’m pretty sure I know how to create these properties, I just don’t know how to create the l2-cache node as a child to the cpu@0.

Programmingkid

25 Jan

3:45 p.m.

...

On Jan 25, 2018, at 11:09 AM, Jd Lyons via OpenBIOS openbios@openbios.org wrote:

Segher, if you have some of the old white papers on the CPUs that shipped in Macs, or the upgrades offered by third parties, I’d like to get a look at them, if you’re not under NDA.

This pdf has detailed information on the 7447A https://www.nxp.com/docs/en/data-sheet/MPC7447AEC.pdf

This has a ton of information on the 7450 (probably still useful to you) https://www.nxp.com/docs/en/reference-manual/MPC7450UM.pdf

This has information on programming PowerPC processors: https://www.nxp.com/docs/en/reference-manual/MPCFPE32B.pdf

Jd Lyons

26 Jan

7:05 a.m.

...

On Jan 25, 2018, at 1:45 PM, Programmingkid programmingkidx@gmail.com wrote:

This pdf has detailed information on the 7447A https://www.nxp.com/docs/en/data-sheet/MPC7447AEC.pdf

This has a ton of information on the 7450 (probably still useful to you) https://www.nxp.com/docs/en/reference-manual/MPC7450UM.pdf

This has information on programming PowerPC processors: https://www.nxp.com/docs/en/reference-manual/MPCFPE32B.pdf

Thanks PK, as always, you seem to be able to find what I’m looking for;-)

Segher Boessenkool

23 Jan

6:53 p.m.

On Mon, Jan 22, 2018 at 09:50:57PM -0500, James Lyons via OpenBIOS wrote:

...

Looks like for the 7447a we set SPR 1017( L2CR ) bit 0 ( L2E ) or just write 0x8000000 to the reg.

One more zero (0x80000000). And it is not enough. You need at least something like this (in Forth code):

\ This assumes we run after hard reset; if not, more work is needed.

\ l2cr@ ( -- x )   mfspr x,1017
\ l2cr! ( x -- )   mtspr 1017,x

hex

: init-L2-7447A
   40200000 l2cr!   \ enable L2 parity, initiate invalidate
   BEGIN l2cr@ 00200000 and 0= UNTIL   \ wait for invalidate done
   c0200000 l2cr! ;   \ enable L2

Segher
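For readers more comfortable with C than Forth, the same three steps can be sketched as below. PowerPC documentation numbers bits from the most significant end, so bit 0 of a 32-bit register is 0x80000000. The macro names and the accessor structure are mine; only the values and their meanings come from the comments in the Forth code. On real hardware the accessors would wrap mfspr/mtspr on SPR 1017; here they are function pointers so the logic can be exercised on any host with a mock register.

```c
#include <stdint.h>

#define PPC_BIT32(n)  (0x80000000u >> (n))  /* bit 0 is the most significant bit */

#define L2CR_L2E   PPC_BIT32(0)   /* 0x80000000: enable L2 */
#define L2CR_L2PE  PPC_BIT32(1)   /* 0x40000000: enable L2 parity */
#define L2CR_L2I   PPC_BIT32(10)  /* 0x00200000: invalidate requested / in progress */

/* Hypothetical accessors standing in for mfspr/mtspr on SPR 1017. */
struct l2cr_ops {
    uint32_t (*read)(void *ctx);
    void (*write)(void *ctx, uint32_t value);
    void *ctx;
};

/* Same sequence as init-L2-7447A; assumes we run after a hard reset. */
void init_l2_7447a(const struct l2cr_ops *ops)
{
    ops->write(ops->ctx, L2CR_L2PE | L2CR_L2I);             /* 40200000 */
    while (ops->read(ops->ctx) & L2CR_L2I)                  /* wait for invalidate done */
        ;
    ops->write(ops->ctx, L2CR_L2E | L2CR_L2PE | L2CR_L2I);  /* c0200000 */
}

/* Mock register for host-side experiments: the invalidate bit stays
 * set for polls_left reads, then clears. */
struct mock_l2cr {
    uint32_t value;
    int polls_left;
    int writes;
};

uint32_t mock_read(void *ctx)
{
    struct mock_l2cr *m = ctx;
    if (m->polls_left > 0)
        m->polls_left--;
    else
        m->value &= ~L2CR_L2I;  /* invalidate finished */
    return m->value;
}

void mock_write(void *ctx, uint32_t value)
{
    struct mock_l2cr *m = ctx;
    m->value = value;
    m->writes++;
}
```

Wiring `mock_read` and `mock_write` into a `struct l2cr_ops` and calling `init_l2_7447a` performs exactly two writes, with the polling loop in between.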

Jd Lyons

30 Jan

11:06 a.m.

...

On Jan 23, 2018, at 4:53 PM, Segher Boessenkool segher@kernel.crashing.org wrote:

\ This assumes we run after hard reset; if not, more work is needed.

\ l2cr@ ( -- x )   mfspr x,1017
\ l2cr! ( x -- )   mtspr 1017,x

hex

: init-L2-7447A
   40200000 l2cr!   \ enable L2 parity, initiate invalidate
   BEGIN l2cr@ 00200000 and 0= UNTIL   \ wait for invalidate done
   c0200000 l2cr! ;   \ enable L2

Ok, I’m not sure how to use this?
