On Thu, 6 Mar 2014, Mark Cave-Ayland wrote:
On 06/03/14 18:07, BALATON Zoltan wrote:
Hello,
On Thu, 6 Mar 2014, laire@t-online.de wrote:
The standard behavior for the PPC OF boot process is that memory management is disabled when it starts the boot image. The system takeover is always a sensitive thing, but there is no reason for an OF to use the PowerPC exception vectors in that early phase for the functions MorphOS is using.
Thanks for your reply. I could prevent the crash caused by the write to memory address 0x80 by moving the exception handling code of OpenBIOS somewhere else, but unfortunately OpenBIOS is using the DSI and ISI exceptions and it seems it cannot work without them. I have found this document: http://www.openfirmware.org/1275/bindings/ppc/release/ppc-2_1.html that discusses memory management issues between OF and clients on PPC, but I don't see what the correct behaviour would be.
I tried disabling these bits in OpenBIOS before starting the boot executable, but that makes it freeze the first time it calls an OpenBIOS callback, and I don't know how to change OpenBIOS so that it does not rely on working memory management for client callbacks. However, if I leave these exceptions enabled and only disable them when the MorphOS boot code tries to install its own handlers, it seems to work all right and gets past this point. So I think if MorphOS cleared the DSI and ISI bits before touching the vectors, just in case, it would get further on QEMU, while this should not matter on real hardware, where these bits are supposedly not enabled anyway.
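To make the proposed workaround concrete, here is a minimal sketch of the bit manipulation involved. It assumes that the "DSI and ISI bits" above are the MSR[IR] and MSR[DR] translation-enable bits of 32-bit PowerPC (values 0x20 and 0x10); the function names are made up for illustration and are not taken from OpenBIOS or MorphOS.

```python
# Assumption: the bits in question are the MSR relocation bits of
# 32-bit PowerPC. The helper names below are hypothetical.
MSR_IR = 0x20  # instruction address translation enable
MSR_DR = 0x10  # data address translation enable


def msr_translation_off(msr: int) -> int:
    """Return the MSR value with instruction and data translation cleared."""
    return msr & ~(MSR_IR | MSR_DR) & 0xFFFFFFFF


def translation_enabled(msr: int) -> bool:
    """True if either translation bit is still set."""
    return bool(msr & (MSR_IR | MSR_DR))
```

On real hardware the same masking would be done with mfmsr/mtmsr followed by a context-synchronising instruction; the sketch only shows which bits would have to be cleared before the vectors are touched.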
Hi all,
My reading of the link above is that it shouldn't really matter if real-mode? is set to true or false, although it just so happens that the default is true on Apple's PPC implementation. Does MorphOS boot if real-mode? is set to true on real hardware?
I think you meant that the default is false, as seen in this dump: http://johannes.sipsolutions.net/PowerBook/openfirmware-device-tree According to the PPC binding document I quoted before, this means that the default is virtual mode, in which OF establishes memory management during device initialisation, and after that it is the client's responsibility to handle it and to provide callbacks to OF for memory management. (This is discussed in section 4.2.6.) However, it is not clear to me what the intended behaviour is before the client takes over memory management. It says: "When a client executes set-callback, Open Firmware shall attempt to invoke the >>translate<< callback. If the translate callback is implemented, Open Firmware shall cease use of address translation hardware, instead using the client callbacks for changes to address translation." From this I think OF is allowed to use its own vectors before the client takes over. Then: "A client program shall not directly manipulate any address translation hardware before it either a) ceases to invoke OF client services or b) issues a set-callback to install the >>translate<< callback." See below for how this applies to MorphOS.
On startup OpenBIOS maps the entire RAM 1:1 physical to virtual, so I would have thought that this shouldn't be an issue. All other OSs I've seen under OpenBIOS read the translations property from the CPU in order to build up any existing mappings before installing their own memory handlers, so it's fairly standard practice and I wouldn't expect MorphOS to be any different.
Additional mappings can be requested via CIF as the loader requires, but of course the loader would assume that anything it hasn't explicitly requested hasn't been mapped so would avoid these regions.
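As an illustration of what reading the translations property involves, here is a rough sketch of a decoder. It assumes the 32-bit layout of four big-endian cells per entry (virtual address, size, physical address, mode); the class and function names are hypothetical, and the actual cell layout should be checked against the PPC binding and the firmware in use.

```python
import struct
from typing import List, NamedTuple


class Translation(NamedTuple):
    virt: int  # virtual start address
    size: int  # length of the mapping in bytes
    phys: int  # physical start address
    mode: int  # mode/attribute cell, left uninterpreted here


def parse_translations(prop: bytes) -> List[Translation]:
    """Decode a raw property value, assuming 4 x 32-bit big-endian cells per entry."""
    entry = struct.Struct(">4I")
    if len(prop) % entry.size:
        raise ValueError("property length is not a multiple of the entry size")
    return [
        Translation(*entry.unpack_from(prop, offset))
        for offset in range(0, len(prop), entry.size)
    ]


def is_mapped(translations: List[Translation], addr: int) -> bool:
    """True if addr falls inside any of the decoded mappings."""
    return any(t.virt <= addr < t.virt + t.size for t in translations)
```

A loader that starts from such a list knows which regions the firmware has already mapped and can avoid anything it has not explicitly requested.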
Can you obtain the fault address at all? My feeling is that this is a symptom of something else having already gone wrong a while back, and the stack corruption/exception is another symptom of this.
I think I have a fairly good understanding of what is happening, but I don't know how to fix it in OpenBIOS. The first exception happens because MorphOS unconditionally writes 0 to 0x80 in its start routine. This seems to be the same start routine that is later copied to the reset vector, and this write may be something that MorphOS needs during a reboot, but it is also done at first boot, when that address is not managed by MorphOS yet. OpenBIOS had part of its exception handling routine there, which was overwritten, and hence this led to an invalid opcode exception. This is probably in violation of the standard but happens to work on hardware with other OF implementations, and I can change OpenBIOS to not use this address so it works there too. The move_stubs patch achieves this.
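The clash can be pictured with a small overlap check. The vector offsets below are the usual 32-bit PowerPC ones; the old and new stub regions are purely hypothetical placements chosen for illustration, not the addresses OpenBIOS or the move_stubs patch actually uses.

```python
# Usual 32-bit PowerPC exception vector offsets.
VECTORS = {
    "system_reset": 0x100,
    "machine_check": 0x200,
    "dsi": 0x300,
    "isi": 0x400,
    "program": 0x700,  # taken on an invalid opcode
}

# Hypothetical stub placements as (start, size), for illustration only.
OLD_STUBS = (0x0000, 0x100)
NEW_STUBS = (0x2000, 0x100)


def write_hits(addr: int, length: int, region: tuple) -> bool:
    """True if a write of `length` bytes at `addr` overlaps `region`."""
    start, size = region
    return addr < start + size and start < addr + length
```

With these assumed regions, `write_hits(0x80, 4, OLD_STUBS)` is true and `write_hits(0x80, 4, NEW_STUBS)` is false, which is the effect the patch is after.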
MorphOS does not set a callback but seems to call OF functions before starting to take over memory management, and it probably only calls exit after that, which may work according to the above document. During the MMU takeover, though, this is what happens: First MorphOS copies some code to 0x0000-0x1fff without first disabling the memory management interrupts. (Maybe it expects them to be off until enabled.) The code it copies is not correct yet, as it has jumps pointing to somewhere near the end of the ROM area. Then it fixes up these jumps with addresses pointing to its current location at its load address (it may change this back again later, but I have not got there yet). Then it enables the MMU bits in the MSR.
Sure enough, on QEMU it hits a DSI exception just after it has copied its vectors but before it has fixed them up, which leads to a jump to somewhere in ROM and eventually to an invalid opcode again. This does not happen if I disable the MMU bits in the MSR when the vectors are first copied and let MorphOS enable them later, after it has fixed the vectors up. I think either this again just works by chance on real hardware, or the OF implementations there disable these bits when calling the boot loader and do not need a working MMU for client functions. This could be verified by checking the value of the MSR on real hardware, but I have no access to any.
In either case, if MorphOS did not assume that no MMU exception happens while the vectors are wrong, but explicitly disabled the MMU bits before touching the vectors, it would work. The exception happening here during the MMU takeover is not caused by OpenBIOS but by the boot code accessing memory. However, OpenBIOS needs a working MMU for client functions earlier, so it cannot disable these bits itself before calling the bootloader unless it could re-enable them during callbacks. So I see no easy way to fix this within OpenBIOS.
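The ordering problem described above can be modelled with a toy simulation. This is only a sketch of the sequence as I understand it (copy vectors, fix up jumps, enable the MMU bits), with made-up names; it assumes translation is left enabled by the firmware and that a data access inside the window can fault only while translation is on.

```python
ROM_TARGET = "rom"        # stale jump target in the freshly copied vectors
LOADED_TARGET = "loaded"  # jump target after the fix-up step


def take_over(disable_translation_first: bool, fault_in_window: bool) -> list:
    """Return the jump targets reached by MMU exceptions during the takeover."""
    translation_on = True        # assumed state left by the firmware
    vector_target = "firmware"   # firmware handlers are installed initially
    taken = []

    def touch_memory():
        # In this model a DSI can only be raised while translation is enabled.
        if translation_on and fault_in_window:
            taken.append(vector_target)

    if disable_translation_first:
        translation_on = False       # proposed fix: clear the MMU bits first
    vector_target = ROM_TARGET       # step 1: copy vectors, jumps not fixed yet
    touch_memory()                   # window in which the vectors are wrong
    vector_target = LOADED_TARGET    # step 2: fix up the jumps
    translation_on = True            # step 3: enable the MMU bits
    return taken
```

In this model `take_over(False, True)` ends up in the stale ROM target, while `take_over(True, True)` takes no exception in the window at all, which matches what I see on QEMU with and without the workaround.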
The stack corruption happens much later and I don't know yet what causes it, but I think it is not directly related to the above. I will need to dig deeper into this if no one has any hints or insight.
Regards, BALATON Zoltan