So, I have Solaris 9 working inside a qemu-system-sparc (32-bit) environment with OpenBIOS, but no such luck getting 64-bit Solaris (8/9/10/11) to boot. I've attached boot output from Solaris versions 8, 9, 10, and 11. Getting a later version (10+) to work would be nice, since it seems that Solaris up to at least version 9 doesn't support CPU power management, which means the qemu-system-sparc process runs at 100% CPU usage constantly.
If I need to hook up a debugger and do more digging, I can...just let me know.
Thanks, Nick
-------- This e-mail may contain SEAKR Engineering (SEAKR) Confidential and Proprietary Information. If this message is not intended for you, you are strictly prohibited from using this message, its contents or attachments in any way. If you have received this message in error, please delete the message from your mailbox. This e-mail may contain export-controlled material and should be handled accordingly.
On 2013-Dec-27, 19:08 , Nick Couchman wrote:
[root@qemu-openbios-dev ~]# /opt/qemu/bin/qemu-system-sparc64 -cdrom /mnt/iso/Solaris/sol-10-u9-ga-sparc-dvd.iso -boot d -nographic -m 2048
OpenBIOS for Sparc64
Configuration device id QEMU version 1 machine id 0
kernel cmdline
CPUs: 1 x SUNW,UltraSPARC-IIi
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.1 built on Dec 27 2013 23:00
Type 'help' for detailed information
Trying cdrom:f...
Not a bootable ELF image
Not a bootable a.out image
Loading FCode image...
Loaded 7420 bytes
entry point is 0x4000
Ignoring failed claim for va 1000000 memsz a8cb6!
Ignoring failed claim for va 1402000 memsz 4b4a8!
Ignoring failed claim for va 1800000 memsz 61b48!
Those are a problem. My recollection is that it's loading Solaris (genunix) in at 100.0000. Failing those claims means something is busted with the memory model.
Jumping to entry point 00000000010071d8 for type 0000000000000001...
switching to new context: entry point 0x10071d8 stack 0x00000000ffe86a01
warning:interpret: exception -13 caught
That's a problem. No idea what an exception -13 means. I'm surprised it was able to continue after that.
SunOS Release 5.10 Version Generic_142909-17 64-bit
Copyright (c) 1983, 2010, Oracle and/or its affiliates. All rights reserved.
spacex@:interpret: exception -13 caught
spacex@ is a Forth operation to do a 64-bit load from an arbitrary address space. Most of these address spaces are used to access physical memory, for things like manipulating MMUs, although there are a few for dealing with some of the IO devices. This sounds like somewhere QEMU's machine model doesn't directly match what Solaris 5.10 thinks it should be.
could not find debugger-vocabulary-hook>threads:interpret: exception -13 caught
(Can't load tod module)
EXIT
That looks like Solaris is trying to panic, and something had trashed the Forth vocabulary. debugger-vocabulary-hook is a defer word in OpenBoot which is invoked on enter-forth; I suspect the >threads is from Solaris' forthdebug module.
On 2013/12/27 at 17:54, Tarl Neustaedter tarl-b2@tarl.net wrote:
On 2013-Dec-27, 19:08 , Nick Couchman wrote:
[root@qemu-openbios-dev ~]# /opt/qemu/bin/qemu-system-sparc64 -cdrom /mnt/iso/Solaris/sol-10-u9-ga-sparc-dvd.iso -boot d -nographic -m 2048
OpenBIOS for Sparc64
Configuration device id QEMU version 1 machine id 0
kernel cmdline
CPUs: 1 x SUNW,UltraSPARC-IIi
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.1 built on Dec 27 2013 23:00
Type 'help' for detailed information
Trying cdrom:f...
Not a bootable ELF image
Not a bootable a.out image
Loading FCode image...
Loaded 7420 bytes
entry point is 0x4000
Ignoring failed claim for va 1000000 memsz a8cb6!
Ignoring failed claim for va 1402000 memsz 4b4a8!
Ignoring failed claim for va 1800000 memsz 61b48!
Those are a problem. My recollection is that it's loading Solaris (genunix) in at 100.0000. Failing those claims means something is busted with the memory model.
So, starting with the first error(s): is this an OpenBIOS issue or a QEMU issue? Is there additional (gdb?) information I can provide that will help further nail down the problem?
-Nick
On 2013-Dec-28, 00:08 , Nick Couchman wrote:
Ignoring failed claim for va 1000000 memsz a8cb6!
Ignoring failed claim for va 1402000 memsz 4b4a8!
Ignoring failed claim for va 1800000 memsz 61b48!
Those are a problem. My recollection is that it's loading Solaris (genunix) in at 100.0000. Failing those claims means something is busted with the memory model.
So, starting with the first error(s): is this an OpenBIOS issue or a QEMU issue? Is there additional (gdb?) information I can provide that will help further nail down the problem?
At a guess, I'd start by finding that error message and figuring out why the claims failed. Was this a double allocation or simply an address range which QEMU wasn't expecting, ...
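As a first pass at the "double allocation or unexpected range" question, here is a minimal sketch that parses the three failed-claim lines from the log and checks whether they overlap one another. It assumes the `va` and `memsz` fields are printed in hexadecimal without a prefix (the `a8cb6` value suggests so); the function names are made up for illustration and are not part of OpenBIOS or QEMU.

```python
import re

# Matches the warning lines quoted in this thread, e.g.
# "Ignoring failed claim for va 1000000 memsz a8cb6!"
# Assumption: both numeric fields are hexadecimal.
CLAIM_RE = re.compile(
    r"Ignoring failed claim for va ([0-9a-fA-F]+) memsz ([0-9a-fA-F]+)!"
)

SAMPLE_LOG = """\
Ignoring failed claim for va 1000000 memsz a8cb6!
Ignoring failed claim for va 1402000 memsz 4b4a8!
Ignoring failed claim for va 1800000 memsz 61b48!
"""


def parse_failed_claims(log_text):
    """Return a list of half-open (start, end) ranges, one per failed claim."""
    ranges = []
    for va, memsz in CLAIM_RE.findall(log_text):
        start = int(va, 16)
        ranges.append((start, start + int(memsz, 16)))
    return ranges


def overlapping_pairs(ranges):
    """Return index pairs (i, j), i < j, whose ranges intersect."""
    pairs = []
    for i in range(len(ranges)):
        for j in range(i + 1, len(ranges)):
            a_start, a_end = ranges[i]
            b_start, b_end = ranges[j]
            if a_start < b_end and b_start < a_end:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    claims = parse_failed_claims(SAMPLE_LOG)
    for start, end in claims:
        print(f"claim {start:#x}..{end:#x}")
    print("overlapping pairs:", overlapping_pairs(claims))
```

For the log in this thread the three ranges do not overlap each other, so this only rules out the claims colliding among themselves. Whether each one collides with a mapping that already existed before the load would still need to be checked inside OpenBIOS.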
On Sat, Dec 28, 2013 at 1:54 AM, Tarl Neustaedter tarl-b2@tarl.net wrote:
On 2013-Dec-27, 19:08 , Nick Couchman wrote:
[root@qemu-openbios-dev ~]# /opt/qemu/bin/qemu-system-sparc64 -cdrom /mnt/iso/Solaris/sol-10-u9-ga-sparc-dvd.iso -boot d -nographic -m 2048
OpenBIOS for Sparc64
Configuration device id QEMU version 1 machine id 0
kernel cmdline
CPUs: 1 x SUNW,UltraSPARC-IIi
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.1 built on Dec 27 2013 23:00
Type 'help' for detailed information
Trying cdrom:f...
Not a bootable ELF image
Not a bootable a.out image
Loading FCode image...
Loaded 7420 bytes
entry point is 0x4000
Ignoring failed claim for va 1000000 memsz a8cb6!
Ignoring failed claim for va 1402000 memsz 4b4a8!
Ignoring failed claim for va 1800000 memsz 61b48!
Those are a problem. My recollection is that it's loading Solaris (genunix) in at 100.0000. Failing those claims means something is busted with the memory model.
Jumping to entry point 00000000010071d8 for type 0000000000000001...
switching to new context: entry point 0x10071d8 stack 0x00000000ffe86a01
warning:interpret: exception -13 caught
That's a problem. No idea what an exception -13 means. I'm surprised it was able to continue after that.
SunOS Release 5.10 Version Generic_142909-17 64-bit
Copyright (c) 1983, 2010, Oracle and/or its affiliates. All rights reserved.
spacex@:interpret: exception -13 caught
spacex@ is a Forth operation to do a 64-bit load from an arbitrary address space. Most of these address spaces are used to access physical memory, for things like manipulating MMUs, although there are a few for dealing with some of the IO devices. This sounds like somewhere QEMU's machine model doesn't directly match what Solaris 5.10 thinks it should be.
This one is really an OpenBIOS error: the spacex@ word is just not implemented.
Artyom
could not find debugger-vocabulary-hook>threads:interpret: exception -13 caught
(Can't load tod module)
EXIT
That looks like Solaris is trying to panic, and something had trashed the Forth vocabulary. debugger-vocabulary-hook is a defer word in OpenBoot which is invoked on enter-forth; I suspect the >threads is from Solaris' forthdebug module.
On 28/12/13 00:54, Tarl Neustaedter wrote:
On 2013-Dec-27, 19:08 , Nick Couchman wrote:
[root@qemu-openbios-dev ~]# /opt/qemu/bin/qemu-system-sparc64 -cdrom /mnt/iso/Solaris/sol-10-u9-ga-sparc-dvd.iso -boot d -nographic -m 2048
OpenBIOS for Sparc64
Configuration device id QEMU version 1 machine id 0
kernel cmdline
CPUs: 1 x SUNW,UltraSPARC-IIi
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.1 built on Dec 27 2013 23:00
Type 'help' for detailed information
Trying cdrom:f...
Not a bootable ELF image
Not a bootable a.out image
Loading FCode image...
Loaded 7420 bytes
entry point is 0x4000
Ignoring failed claim for va 1000000 memsz a8cb6!
Ignoring failed claim for va 1402000 memsz 4b4a8!
Ignoring failed claim for va 1800000 memsz 61b48!
Those are a problem. My recollection is that it's loading Solaris (genunix) in at 100.0000. Failing those claims means something is busted with the memory model.
(Disclaimer: it's probably been at least a year since I last looked at this in detail)
I seem to remember that the memory regions in question are remapped to two different physical addresses by the bootloader at two different points during boot. The remapping is allowed to happen, but OFMEM drops a warning onto the serial console because, at the time, I wasn't sure whether this was a valid thing to do.
Jumping to entry point 00000000010071d8 for type 0000000000000001...
switching to new context: entry point 0x10071d8 stack 0x00000000ffe86a01
warning:interpret: exception -13 caught
That's a problem. No idea what an exception -13 means. I'm surprised it was able to continue after that.
From memory I think this is the chunk of Forth code in the Solaris kernel that sets up something on HME hardware (which QEMU currently does not emulate).
SunOS Release 5.10 Version Generic_142909-17 64-bit
Copyright (c) 1983, 2010, Oracle and/or its affiliates. All rights reserved.
spacex@:interpret: exception -13 caught
spacex@ is a Forth operation to do a 64-bit load from an arbitrary address space. Most of these address spaces are used to access physical memory, for things like manipulating MMUs, although there are a few for dealing with some of the IO devices. This sounds like somewhere QEMU's machine model doesn't directly match what Solaris 5.10 thinks it should be.
I'm fairly sure this error occurs because spacex@ isn't yet implemented in OpenBIOS. It wouldn't be too difficult to add it though.
could not find debugger-vocabulary-hook>threads:interpret: exception -13 caught
(Can't load tod module)
EXIT
That looks like Solaris is trying to panic, and something had trashed the Forth vocabulary. debugger-vocabulary-hook is a defer word in OpenBoot which is invoked on enter-forth; I suspect the >threads is from Solaris' forthdebug module.
Again I think the fault here is that the debugger-vocabulary-hook>threads defer doesn't exist in OpenBIOS at the moment. But also the RTC hardware isn't currently mapped correctly in QEMU - it's currently mapped as ioport rather than MMIO IIRC. Artyom had some patches to fix this but I never managed to apply them (and the corresponding QEMU changes).
ATB,
Mark.
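For anyone wanting to list which Forth words the boot log complains about, here is a small sketch. It relies on ANS Forth reserving THROW code -13 for "undefined word"; that OpenBIOS follows the same numbering is an assumption here, though it fits the observation above that spacex@ is not implemented. The helper name and the sample log are illustrative only.

```python
import re

# ANS Forth assigns THROW code -13 to "undefined word".
# Assumption: OpenBIOS uses the same code for the same condition.
UNDEFINED_WORD = -13

# Captures the token printed immediately before ":interpret:" and the code,
# e.g. "spacex@:interpret: exception -13 caught".
EXC_RE = re.compile(r"(\S+):interpret: exception (-?\d+) caught")

SAMPLE_LOG = """\
warning:interpret: exception -13 caught
SunOS Release 5.10 Version Generic_142909-17 64-bit
spacex@:interpret: exception -13 caught
could not find debugger-vocabulary-hook>threads:interpret: exception -13 caught
"""


def undefined_words(log_text):
    """Return the tokens reported with exception -13, in order, without repeats."""
    seen = []
    for token, code in EXC_RE.findall(log_text):
        if int(code) == UNDEFINED_WORD and token not in seen:
            seen.append(token)
    return seen


if __name__ == "__main__":
    for token in undefined_words(SAMPLE_LOG):
        print(token)
```

The token is whatever non-blank text precedes `:interpret:` on the console, so it is a hint rather than a guaranteed word name: output from different sources may run together on one line.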
On 28/12/13 00:08, Nick Couchman wrote:
[root@qemu-openbios-dev ~]# /opt/qemu/bin/qemu-system-sparc64 -cdrom /mnt/iso/Solaris/sol9-software1.iso -boot d -nographic -m 2048
OpenBIOS for Sparc64
Configuration device id QEMU version 1 machine id 0
kernel cmdline
CPUs: 1 x SUNW,UltraSPARC-IIi
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.1 built on Dec 27 2013 23:00
Type 'help' for detailed information
Trying cdrom:f...
Not a bootable ELF image
Not a bootable a.out image
Loading FCode image...
Loaded 5936 bytes
entry point is 0x4000
open isn't unique.
Jumping to entry point 0000000000100000 for type 0000000000000001...
switching to new context: entry point 0x100000 stack 0x00000000ffe86a01
warning:interpret: exception -13 caught
SunOS Release 5.9 Version Generic 64-bit
Copyright 1983-2002 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
panic[cpu0]/thread=140a000: VAC too big!
00000000014098c0 unix:startup_modules+118 (12000, 142a400, 30000018000, 103, 78004000, 1430800)
  %l0-3: 0000000000000103 0000000000000000 000000000140c000 00000000014175a8
  %l4-7: 00000000014175b8 0000000078004000 0000000000002000 0000000000012020
0000000001409970 unix:startup+1c (0, 20, 78002000, 2000, 1400000, 0)
  %l0-3: 000000000140b400 000000000140b400 0000000001477c00 0000000000002000
  %l4-7: 00000000000c36c5 0000000000000000 0000000000003f61 000000000147d848
0000000001409a20 genunix:main+4 (1409ba0, ffd0cad8, 1409ec0, 325b5f, 2000, 500)
  %l0-3: 000000000140b400 0000000000000000 0000000001412a28 0000000078002000
  %l4-7: 000000000140a000 0000000000326000 000000000147fd60 000000000105e938
skipping system dump - no dump device configured
rebooting...
BOOTpanic - kernel: prom_reboot: reboot call returned!
EXIT
I see the "VAC too big!" error on my local copy of 32-bit Solaris 9 too. Some searching showed the message occurs in the OpenSolaris source at http://fxr.watson.org/fxr/source/sfmmu/vm/hat_sfmmu.c?v=OPENSOLARIS#L1347.
I'm not exactly sure what this is trying to check here, but I do know that OpenBIOS uses 512K PTEs to map itself, while calls to OFMEM use 8K PTEs which may be relevant. Does anyone know the exact significance of this?
ATB,
Mark.
I see the "VAC too big!" error on my local copy of 32-bit Solaris 9 too. Some searching showed the message occurs in the OpenSolaris source at http://fxr.watson.org/fxr/source/sfmmu/vm/hat_sfmmu.c?v=OPENSOLARIS#L1347.
I'm not exactly sure what this is trying to check here, but I do know that OpenBIOS uses 512K PTEs to map itself, while calls to OFMEM use 8K PTEs which may be relevant. Does anyone know the exact significance of this?
Hmmm...I do not recall seeing that VAC too big error on Solaris 9 32-bit (on 32-bit Qemu/OpenBIOS), but if you're seeing it and Solaris 9 boots fine for you, then perhaps that's a Red Herring and not worth running down. Maybe I'll focus on trying to track down the source of some of the other error messages that seem to contribute to not being able to boot 64-bit Solaris - particularly Solaris 10 and higher, since I think that's where the CPU Power Management starts, which would be great to get working for the CPU performance aspect of running Qemu.
-Nick
On 29/12/13 05:01, Nick Couchman wrote:
I see the "VAC too big!" error on my local copy of 32-bit Solaris 9 too. Some searching showed the message occurs in the OpenSolaris source at http://fxr.watson.org/fxr/source/sfmmu/vm/hat_sfmmu.c?v=OPENSOLARIS#L1347.
I'm not exactly sure what this is trying to check here, but I do know that OpenBIOS uses 512K PTEs to map itself, while calls to OFMEM use 8K PTEs which may be relevant. Does anyone know the exact significance of this?
Hmmm...I do not recall seeing that VAC too big error on Solaris 9 32-bit (on 32-bit Qemu/OpenBIOS), but if you're seeing it and Solaris 9 boots fine for you, then perhaps that's a Red Herring and not worth running down. Maybe I'll focus on trying to track down the source of some of the other error messages that seem to contribute to not being able to boot 64-bit Solaris - particularly Solaris 10 and higher, since I think that's where the CPU Power Management starts, which would be great to get working for the CPU performance aspect of running Qemu.
-Nick
Gah. Sorry Nick - this message was definitely a case of "fingers before brain". What I meant to say was that I see the "VAC too big!" error on my local copy of *64-bit* Solaris 9 too. I currently don't have access to a Solaris 9 32-bit image for testing, but it sounds as if it is working fine for you, which is great.
ATB,
Mark.
On Sun, Dec 29, 2013 at 12:10 PM, Mark Cave-Ayland mark.cave-ayland@ilande.co.uk wrote:
On 29/12/13 05:01, Nick Couchman wrote:
I see the "VAC too big!" error on my local copy of 32-bit Solaris 9 too. Some searching showed the message occurs in the OpenSolaris source at http://fxr.watson.org/fxr/source/sfmmu/vm/hat_sfmmu.c?v=OPENSOLARIS#L1347.
I'm not exactly sure what this is trying to check here, but I do know that OpenBIOS uses 512K PTEs to map itself, while calls to OFMEM use 8K PTEs which may be relevant. Does anyone know the exact significance of this?
Hmmm...I do not recall seeing that VAC too big error on Solaris 9 32-bit (on 32-bit Qemu/OpenBIOS), but if you're seeing it and Solaris 9 boots fine for you, then perhaps that's a Red Herring and not worth running down. Maybe I'll focus on trying to track down the source of some of the other error messages that seem to contribute to not being able to boot 64-bit Solaris - particularly Solaris 10 and higher, since I think that's where the CPU Power Management starts, which would be great to get working for the CPU performance aspect of running Qemu.
-Nick
Gah. Sorry Nick - this message was definitely a case of "fingers before brain". What I meant to say was that I see the "VAC too big!" error on my local copy of *64-bit* Solaris 9 too. I currently don't have access to a Solaris 9 32-bit image for testing, but it sounds as if it is working fine for you, which is great.
Actually "VAC too big" is a pretty nice error: it should happen in the early sfmmu initialization phase, but for some reason I never get it. I tried to make the boot process more verbose, but for this I'd need a working kadb. Maybe it's easy to get it working? With CIF_DEBUG, booting kadb looks like this:
Jumping to entry point 0000000000100000 for type 0000000000000001...
switching to new context: entry point 0x100000 stack 0x00000000ffe86a01
finddevice("/chosen") = 0x00000000ffe1bed0
getproplen(0x00000000ffe1bed0, "mmu") = 0x0000000000000004
getproplen(0x00000000ffe1bed0, "mmu") = 0x0000000000000004
getprop(0x00000000ffe1bed0, "mmu", 0x000000000013e02c, 4) = service getprop: possible argument error (0 1)
^^^ The message about possible argument error keeps appearing on every call after this point. [...]
getprop(0x00000000ffe30940, "clock-frequency", 0x0000000000138608, 4) = service getprop: possible argument error (0 1) 3990516880
0x00138608 05 f5 e1 00 __ __ __ __ __ __ __ __ __ __ __ __ .��.
getproplen(0x00000000ffe30940, "status") = 0xffffffffffffffff
^^^ Are we missing the "status" property?
child(0x00000000ffe30940) = 0x0000000000000000
peer(0x00000000ffe30940) = 0x0000000000000000
peer(0x00000000ffe1b838) = 0x0000000000000000
of_client_interface: interpret 00000000eddac894 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
interpret : kadb_callback %pc dup f000.0000 ffff.ffff between if drop exit then h# eddda630 x! %npc h# eddb0758 x! %g1 h# eddda6d0 x! %g2 h# eddda6d8 x! %g3 h# eddda6e0 x! %g4 h# eddda720 x! %g5 h# eddda728 x! %g6 h# eddda730 x! %g7 h# eddda738 x! 1 %tstate h# eddaf070 x! 1 %tt h# eddd9e48 l! h# edd008ec set-pc go ; ([6] -- [0])
%pc:interpret: exception -13 caught
^^^ I guess things go wrong after this point.
interpret ': kadb_callback %pc dup f000.0000 ffff.ffff between if drop exit then h# eddda630 x! %npc h# eddb0758 x! %g1 h# eddda6d0 x! %g2 h# eddda6d8 x! %g3 h# eddda6e0 x! %g4 h# eddda720 x! %g5 h# eddda728 x! %g6 h# eddda730 x! %g7 h# eddda738 x! 1 %tstate h# eddaf070 x! 1 %tt h# eddd9e48 l! h# edd008ec set-pc go ; ': possible argument error (4--0) got 0
handle_calls return:
of_client_interface: interpret 00000000edd28010 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000
interpret ['] kadb_callback init-debugger-hook ([6] -- [0])
kadb_callback:interpret: exception -13 caught
interpret ' ['] kadb_callback init-debugger-hook ': possible argument error (4--0) got 0
And then the kadb prompt appears, but it can't start executing the kernel:
kadb[0]: :c
of_client_interface: call-method 00000000eddac6e0 00000000ffc847f0 0000000001006000
call-method translate ([3] -- [5])
handle_calls return: 0000000000000000 ffffffffffffffff 0000000000000032 0000000000000000 000000001f806000
of_client_interface: call-method 00000000eddac668 00000000ffc847f0 0000000000000033 0000000000002000 00000000ffc7e000 0000000000000000 000000001f806000
call-method map ([7] -- [1])
handle_calls return: 0000000000000000
of_client_interface: call-method 00000000eddac6c8 00000000ffc847f0 0000000000002000 00000000ffc7e000
call-method unmap ([4] -- [0])
call-method 'unmap': possible argument error (2--0) got 0
handle_calls return:
Unhandled Exception 0x000000000000017e
PC = 0x0000000001006f90 NPC = 0x0000000001006f94
Stopping execution
Artyom