Greetings,
I'm unsure whether the issue I'm running into is related to OpenBIOS, QEMU, something buggy in this Solaris distribution, or some combination thereof. Feel free to redirect me as appropriate.
I'm able to install Solaris 8 (02/04) from the "Solaris 8 Software disk 1" disk. I see a few different behaviors depending on what I try:
1) Boot from disk0:a without any special options:
After specifying a root password, it says "Starting Web Start Launcher in Command Line Mode" and shortly thereafter says "The Webstart launcher has terminated unexpectedly." with no further details, except to say the machine will reboot after pressing enter.
2) Do the same thing, but with -v:
The result doesn't seem predictable. Sometimes it just hangs after the "Starting ..." line, other times it dies with a variety of errors. I'll try to summarize below:
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Starting Web Start Launcher in Command Line Mode.
SIGSEGV 11 segmentation violation
si_signo [11]: SEGV
si_errno [0]:
si_code [1]: SEGV_MAPERR [addr: 0x1000100
stackpointer=EE73E578
Inconsistent thread : best efforts attempt (may fail)
"Thread-1" (TID:0x34c470, sys_thread_t:0x34c3a8, state:R, thread_t: t@12, threadID:0xee74098, stack_bottom:0xee740d98, stack_size:0x1fd98) prio=5 *current thread*
[1] java.util.ResourceBundle.getBundle(ResourceBundle.java:346)
[2] java.awt.Toolkit$3.run(Toolkit.java:887)
[3] java.security.AccessController.doPrivileged(Native Method)
[4] java.awt.Toolkit.<clinit>(Toolkit.java:882)
[5] java.awt.Component.<clinit>(Component.java:336)
[6] com.sun.wizards.core.WizardTreeManager.createWizardPanel(WizardTreeManager.java:801)
[7] com.sun.wizards.core.WizardTreeManager.<init>(WizardTreeManager.java:265)
[8] com.sun.wizards.core.CommandLineConsole.java:78)
[9] java.lang.Thread.run(Thread.java:479)
------------------------------
Inconsistent thread : best efforts attempt (may fail)
"process reaper" (TID:0x2c4600, sys_thread_t:0x2c4538, state:R, thread_t: t@10, threadID:0xee7b0d98, stack_bottom:0xee7b0d98, stack_size:0x1fd98) prio=5
[1] java.lang.UNIXProcess.waitForProcessExit(Native Method)
[2] java.lang.UNIXProcess.access$10(UNIXProcess.java:30)
[3] java.lang.UNIXProcess$3.run(UNIXProcess.java:74)
------------------------------
Inconsistent thread : best efforts attempt (may fail)
"Finalizer" (TID:0x156d80, sys_thread_t:0x156cb8, state:CW, thread_t: t@2, threadID:0xee890d98, stack_bottom:0xee890d98, stack_size:0x1fd98) prio=8
[1] java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:145)
[2] java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:167)
[3] java.lang.ref.Finalizer$FinalizerWorker$FinalizerThread.run(Finalizer.java:117)
------------------------------
Inconsistent thread : best efforts attempt (may fail)
"Reference Handler" (TID:0x155e20, sys_thread_t:0x155d58, state:CW, thread_t: t@6, threadID:0xee8c0d98, stack_bottom:0xee8c0d98, stack_size:0x1fd98) prio=10
[1] java.lang.Object.wait(Object.java:417)
[2] java.lang.ref.Reference$ReferenceHandler.run(Reference.java:129)
------------------------------
Inconsistent thread : best efforts attempt (may fail)
"Signal dispatcher" (TID:0x135278, sys_thread_t:0x1351b0, state:MW, thread_t: t@5, threadID:0xee8f0d98, stack_bottom:0xee8f0d98, stack_size:0x1fd98) prio=10
(note the lack of stack trace here)
------------------------------
Inconsistent thread : best efforts attempt (may fail)
"main" (TID:0x390d0, sys_thread_t:0x39008, state:R, thread_t: t@1, threadID:0x251d0, stack_bottom:0xf0000000, stack_size:0x800000) prio=5
[1] java.io.ObjectInputStream.loadClass0(Native Method)
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
... after which the VM stopped responding.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Starting Web Start Launcher in Command Line Mode.
signal fault in critical section
signal number: 11, signal code: 1, fault address: 0xe9ff0de8, pc: 0xef40d4a8, sp: 0xee7a2c78
libthread panic: fault in libthread critical section : dumping core (PID: 327 LWP 10)
stacktrace: ef40d49c ef40f134 ef408c48 0
Abort - core dumped
The Webstart launcher has terminated unexpectedly.
(some stuff about the machine rebooting on <enter>)
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
As a side-note, seemingly at random I see the following message a lot (including during that last bit I quoted):
Assertion failed: MUTEX_HELD(&svc_thr_mutex), file ../rpc/svc_run.c, line 757
3) Boot in single-user mode
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Starting Web Start Launcher in Command Line Mode.
signal fault in critical section
signal number: 11, signal code: 1, fault address: 0xee82005c, pc: 0xef7c7248, sp: 0xee73e198
libthread panic: fault in libthread critical section : dumping core (PID: 330 LWP 9)
stacktrace: (...)
Segmentation Fault - core dumped
(terminated unexpectedly, yada yada)
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
... where (...) is a huge stack trace. I have to copy this by hand ... ya, I really hope you don't need that. Lemme know if you do, though.
I was able to get the exact same error & stack trace as the sig11 one from (2) above. The addresses were the same; the only differences were the PID, LWP, and sp (330, 9, 0xee752c78).
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Starting Web Start Launcher in Command Line Mode.
Fri May 27 15:29:19 2011 fatal: mounting of "/vol" failed
signal fault in critical section
signal number: 11, signal code: 1, fault address: 0xee780de8, pc: 0xef40d4a8, sp: 0xeb1b2c78
libthread panic: fault in libthread critical section : dumping core (PID: 330 LWP 12)
stacktrace: ef40d49c ef40f134 ef408c48 0
Abort - core dumped
(blahdy blah)
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
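To make it easier to see which of the libthread panics share a faulting instruction, here is a minimal Python sketch that groups the reports by pc. The report strings are abbreviated to the fields of interest and were copied by hand, so treat the values as approximate; the names `REPORTS`, `FIELDS`, and `group_by_pc` are made up for illustration.

```python
import re
from collections import defaultdict

# Panic lines from the three reports above, abbreviated to the fields of
# interest (hand-transcribed, so the values may contain copying errors).
REPORTS = [
    "signal number: 11, signal code: 1, fault address: 0xe9ff0de8, "
    "pc: 0xef40d4a8, sp: 0xee7a2c78 (PID: 327 LWP 10)",
    "signal number: 11, signal code: 1, fault address: 0xee82005c, "
    "pc: 0xef7c7248, sp: 0xee73e198 (PID: 330 LWP 9)",
    "signal number: 11, signal code: 1, fault address: 0xee780de8, "
    "pc: 0xef40d4a8, sp: 0xeb1b2c78 (PID: 330 LWP 12)",
]

# Pull the fault address, pc, and sp out of one panic line.
FIELDS = re.compile(
    r"fault address: (?P<fault>0x[0-9a-f]+), "
    r"pc: (?P<pc>0x[0-9a-f]+), sp: (?P<sp>0x[0-9a-f]+)"
)


def group_by_pc(reports):
    """Map each faulting pc to the list of fault addresses seen with it."""
    groups = defaultdict(list)
    for line in reports:
        match = FIELDS.search(line)
        if match:  # skip lines that do not look like a panic report
            groups[match["pc"]].append(match["fault"])
    return dict(groups)


if __name__ == "__main__":
    for pc, faults in group_by_pc(REPORTS).items():
        print(pc, faults)
```

With the values as transcribed, two of the three panics share pc 0xef40d4a8 while the fault address and sp change from run to run.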
For reference, I'm using a 3 GB qcow2 image. I originally formatted it as "SUN2.9G" in the format utility, but I let it autoconfigure the partitions: partition '0' is root (1.18 GB), '1' is swap (149 MB), '2' is backup (2.71 GB), and '7' is home (1.39 GB). The cylinder ranges overlap, and the partition sizes add up to more than 3 GB, but I have no idea whether that's just a Sun-ism or whether I need to partition manually.
I don't think it's my partitions, though. The 'vol' directory doesn't exist in my root before I try to get the webstart stuff to run, so I think it's creating that directory and trying to mount an image to it.
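On Solaris disk labels, slice 2 ("backup") conventionally spans the whole disk, so it overlaps every other slice by design. A quick arithmetic check in Python, using the sizes quoted above (rounded, with 149 MB taken as 0.149 GB, which is an assumption about the units), suggests the layout is consistent with that convention; the helper name is made up for illustration.

```python
# Slice sizes in GB as reported in the message (149 MB taken as 0.149 GB).
slices = {0: 1.18, 1: 0.149, 2: 2.71, 7: 1.39}


def sum_excluding_backup(sizes, backup_slice=2):
    """Total size of every slice except the whole-disk 'backup' slice."""
    return sum(size for num, size in sizes.items() if num != backup_slice)


if __name__ == "__main__":
    total = sum_excluding_backup(slices)
    print(f"slices 0+1+7 = {total:.2f} GB, slice 2 = {slices[2]:.2f} GB")
```

If slices 0, 1, and 7 add up to roughly the size of slice 2, the total only looks larger than the disk because slice 2 is being counted a second time.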
Additionally, I'm starting qemu with:
./qemu-system-sparc -nographic \
    -bios /imgpath/openbios-sparc32_artyom.bin \
    -hda /imgpath/solaris2.8-img3.qcow2 \
    -m 256 -net nic -net user \
    -cdrom /imgpath/solaris2.8/software_1of2.iso \
    -prom-env 'auto-boot?=false' -snapshot
If anyone has a good idea, I'm all ears ... (or is it eyes?)
-Brian
On Sat, May 28, 2011 at 12:45 AM, Brian Vandenberg phantall@gmail.com wrote:
(lots cut)
I suppose this is a problem with QEMU, since the OS has started up successfully. One way to verify this would be to check whether it also happens with OBP.
If the problem is on the QEMU side, we would need the relevant QEMU debug logs (-d in_asm,int, which requires #define DEBUG_PCALL to be enabled in target-sparc/op_helper.c). Those will be quite large, and the relevant info is only near the end.
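Since only the end of such a log is of interest, one option is to cut off the tail before sharing it. A minimal Python sketch follows; the file names `qemu.log` and `qemu-tail.log` and the 1 MB default are placeholders, not something QEMU prescribes.

```python
import os


def tail_bytes(path, max_bytes=1_000_000):
    """Return the last max_bytes of a file without reading the whole thing."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        # Jump to max_bytes before the end, or to the start for small files.
        f.seek(max(0, size - max_bytes))
        return f.read()


if __name__ == "__main__":
    # Hypothetical paths; adjust to wherever QEMU wrote its log.
    data = tail_bytes("qemu.log")
    with open("qemu-tail.log", "wb") as out:
        out.write(data)
```

Because the cut is made at a byte offset, the first line of the extract may be incomplete.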
If the problem is on the QEMU side, getting relevant QEMU debug logs (-d in_asm,int which needs #define DEBUG_PCALL enabled in target-sparc/op_helper.c) would be needed. Those will be quite large and the relevant info is only near the end.
I'm not sure I'll be able to get the logs or make them available without potentially months' worth of waiting for someone to approve the request. I'll have to see if I can find some 02/04 ISOs and do it on my personal laptop.
-Brian
On 28/05/11 07:40, Blue Swirl wrote:
(lots cut)
I suppose this is a problem with QEMU since the OS has started up successfully. The way to verify this would be to check if this happens also with OBP.
If the problem is on the QEMU side, getting relevant QEMU debug logs (-d in_asm,int which needs #define DEBUG_PCALL enabled in target-sparc/op_helper.c) would be needed. Those will be quite large and the relevant info is only near the end.
Yeah - at this point the kernel should have taken over completely and so I expect that you're hitting an emulation bug (probably the Solaris compiler emits certain instruction sequences not used by gcc which is why this has only just come to light).
In order for someone to fix this, you'll need to supply the information requested by Blue above. Also which version of QEMU are you running?
ATB,
Mark.
Yeah - at this point the kernel should have taken over completely and so I expect that you're hitting an emulation bug (probably the Solaris compiler emits certain instruction sequences not used by gcc which is why this has only just come to light).
Alright. I'll do what I can. Thanks for the response (both of you).
In order for someone to fix this, you'll need to supply the information requested by Blue above. Also which version of QEMU are you running?
0.14.1
-Brian