Hi all,
So given that the Fcode evaluator is sorted for Solaris 10, I thought I'd try my original Solaris 9 disk image again and was surprised to find that it didn't boot.
Further investigation seems to show that strangely encoded device names are being passed into cif-open which is failing:
: get-file
( 6000 ffe35870 27 ) 00000000ffe36378: fname>devname$
3 > 2dup type /platform/OpenBiosTeam,OpenBIOS/ufsboot ok
3 > resume ok
( 6000 ffe35870 27 ) 00000000ffe36378: fname>devname$
( 6000 ffe35ee8 2f ) 00000000ffe36380: ufs-fopen
3 > 2dup type cdrom:a,|platform|OpenBiosTeam,OpenBIOS|ufsboot ok
3 > resume ok
( 6000 ffe35ee8 2f ) 00000000ffe36380: ufs-fopen
: ufs-fopen
( 6000 ffe35ee8 2f ) 00000000ffe360e8: drop
( 6000 ffe35ee8 ) 00000000ffe360f0: cif-open
( 6000 0 ) 00000000ffe360f8: (semis)

: get-file
( 6000 ffe35870 17 ) 00000000ffe36378: fname>devname$
3 > 2dup type /platform/sun4u/ufsboot ok
3 > resume ok
( 6000 ffe35870 17 ) 00000000ffe36378: fname>devname$
( 6000 ffe35ee8 1f ) 00000000ffe36380: ufs-fopen
3 > 2dup type cdrom:a,|platform|sun4u|ufsboot ok
3 > resume ok
( 6000 ffe35ee8 1f ) 00000000ffe36380: ufs-fopen
: ufs-fopen
( 6000 ffe35ee8 1f ) 00000000ffe360e8: drop
( 6000 ffe35ee8 ) 00000000ffe360f0: cif-open
OFMEM: ofmem_claim_virt virt=ffffffffffffffff size=0000000000000200 align=0000000000000001
( 6000 0 ) 00000000ffe360f8: (semis)
It looks as if a special type of device-specifier is being passed into cif-open consisting of a device and argument, a comma, then the actual filename required with /s replaced by |s.
Is this some kind of special syntax that needs to be taught to the dev word and/or cif-open words?
ATB,
Mark.
Hi all,
So given that the Fcode evaluator is sorted for Solaris 10, I thought I'd try my original Solaris 9 disk image again and was surprised to find that it didn't boot.
Further investigation seems to show that strangely encoded device names are being passed into cif-open which is failing:
[...]
3 > 2dup type cdrom:a,|platform|OpenBiosTeam,OpenBIOS|ufsboot ok
[...]
3 > 2dup type cdrom:a,|platform|sun4u|ufsboot ok
[...]
It looks as if a special type of device-specifier is being passed into cif-open consisting of a device and argument, a comma, then the actual filename required with /s replaced by |s.
Is this some kind of special syntax that needs to be taught to the dev word and/or cif-open words?
Ah, yes. The issue here is that the full pathname is something like:
/pci@400/scsi@3/disk@0:a|platform|sun4u|ufsboot
An argument (the stuff after ":") cannot contain a slash (or it would become a path element), so ufs filenames have the slashes replaced with vertical bars. But this should have been entirely between the secondary booter and Solaris - I'm not sure why you are tripping over it.
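The slash-to-bar substitution described above can be illustrated with a minimal Python sketch. The function names are hypothetical and this is not the booter's actual code; it only shows the string transformation for a device path that already carries a ":" argument.

```python
def encode_boot_path(device, filename):
    """Append a UFS filename to a device path as part of its argument.

    Slashes in the filename become vertical bars so the argument is not
    mistaken for further path components. (Hypothetical helper.)
    """
    last_component = device.rsplit("/", 1)[-1]
    if ":" not in last_component:
        raise ValueError("device path needs a ':' argument such as ':a'")
    return device + filename.replace("/", "|")


def decode_boot_path(path):
    """Split an encoded path back into (device, filename)."""
    bar = path.index("|")                      # first bar starts the filename
    return path[:bar], path[bar:].replace("|", "/")


encoded = encode_boot_path("/pci@400/scsi@3/disk@0:a", "/platform/sun4u/ufsboot")
print(encoded)   # /pci@400/scsi@3/disk@0:a|platform|sun4u|ufsboot
```

Decoding simply reverses the substitution after the first bar, which is roughly the job the file system package has to do with the argument string it receives.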
Tarl Neustaedter wrote:
[resurrecting a very old thread]
Ah, yes. The issue here is that the full pathname is something like:
/pci@400/scsi@3/disk@0:a|platform|sun4u|ufsboot
An argument (the stuff after ":") cannot contain a slash (or it would become a path element), so ufs filenames have the slashes replaced with vertical bars. But this should have been entirely between the secondary booter and Solaris - I'm not sure why you are tripping over it.
So I've just had a little play with my Solaris 9 disk image again, and I can confirm that the reason the kernel is unable to load is because of the arguments being passed to the CIF open method. Here is the trace using the Forth debugger:
: do-boot
( Empty ) 00000000ffe36838: sign-on
( Empty ) 00000000ffe36840: real-devname
( Empty ) 00000000ffe36848: halt?
( Empty ) 00000000ffe36850: do?branch
( ) 00000000ffe368a0: loader-base
( Empty ) 00000000ffe368a8: plat-booter$
( ffe356f8 27 ) 00000000ffe368b0: diagnostic-mode?
( ffe356f8 27 0 ) 00000000ffe368b8: do?branch
( ffe356f8 27 ) 00000000ffe36908: get-file
( Empty ) 00000000ffe36910: do?branch
( ) 00000000ffe36920: loader-base
( Empty ) 00000000ffe36928: def-booter$
( ffe356f8 17 ) 00000000ffe36930: diagnostic-mode?
( ffe356f8 17 0 ) 00000000ffe36938: do?branch
( ffe356f8 17 ) 00000000ffe36988: get-file
: get-file
( ffe356f8 17 ) 00000000ffe36200: fname>devname$
( ffe35d70 1c ) 00000000ffe36208: ufs-fopen
Stepper keys: <space>/<enter> Up Down Trace Rstack Forth
2 > 2dup type disk,|platform|sun4u|ufsboot ok
2 > resume ok
( ffe35d70 1c ) 00000000ffe36208: ufs-fopen
: ufs-fopen
( ffe35d70 1c ) 00000000ffe35f70: drop
( ffe35d70 ) 00000000ffe35f78: cif-open
( 0 ) 00000000ffe35f80: (semis)
[ Finished ufs-fopen ]
( 0 ) 00000000ffe36210: ?dup
( 0 ) 00000000ffe36218: do?branch
( Empty ) 00000000ffe362e0: drop
( ) 00000000ffe362e8: -1
( Empty ) 00000000ffe362f0: (semis)
[ Finished get-file ]
( Empty ) 00000000ffe36990: do?branch
( ) 00000000ffe369a0: (")
( 11 ) 00000000ffe369c8: type Boot load failed.
( ) 00000000ffe369d0: cr
( ) 00000000ffe369d8: exit
Stack Underflow.
0 >
I took the liberty of using GDB to dump out the Fcode to a file and run it through detok so I could get a better idea of the Forth being executed.
The interesting part is that I can see the code to decode the arguments is all present in /packages/ufs-file-system in the "open" word, which makes me wonder if cif-open should somehow be calling this method instead. However, the OpenBIOS definition of cif-open currently looks like this:
: open ( dev_spec -- ihandle|0 ) dup cstrlen open-dev ;
So with an argument string of "disk,|platform|sun4u|ufsboot", OpenBIOS goes and opens the "disk" device passing in an argument string of "|platform|sun4u|ufsboot" which is doomed to fail because it is not a valid file reference.
The part I'm missing is how Fcode calling the CIF "open" word can invoke the "open" word within the /packages/ufs-file-system package which should do the right thing - anyone have any bright ideas?
ATB,
Mark.
On 2010-8-15 9:21 AM, Mark Cave-Ayland wrote:
[...] The part I'm missing is how Fcode calling the CIF "open" word can invoke the "open" word within the /packages/ufs-file-system package which should do the right thing - anyone have any bright ideas?
There is an interpose missing in there...
Ah, yes. The scsidisk.fth does an open-package of "disk-label" (found in obp/pkg/boot/sunlabel.fth of the open-sourced openboot), which in turn does an interpose of "ufs-file-system". The interpose means that subsequent calls to open and any other functions get directed to the interposing package, which can only get to the scsidisk package by $call-parent.
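As a rough mental model of the interposition described above, here is a toy Python sketch. The `Instance` class and `interpose` helper are hypothetical and much simpler than real Open Firmware instance chains; they only show calls landing on the interposing package, which reaches the package underneath through a parent call.

```python
class Instance:
    """Toy stand-in for a package instance with a parent link."""

    def __init__(self, name, methods, parent=None):
        self.name = name
        self.methods = methods      # method name -> callable(instance, *args)
        self.parent = parent

    def call(self, method, *args):
        return self.methods[method](self, *args)

    def call_parent(self, method, *args):
        # Rough analogue of $call-parent: forward to the instance below.
        return self.parent.call(method, *args)


def interpose(chain_top, name, methods):
    """Return a new top-of-chain instance sitting in front of chain_top."""
    return Instance(name, methods, parent=chain_top)


# The disk only knows about raw blocks.
disk = Instance("scsidisk", {"read": lambda self, n: "raw[%d]" % n})

# The file system interposes itself and wraps whatever the disk returns.
top = interpose(disk, "ufs-file-system", {
    "read": lambda self, n: "file(" + self.call_parent("read", n) + ")",
})

print(top.call("read", 3))   # file(raw[3])
```

A caller holding `top` never talks to the disk directly, which is why a missing interpose leaves the raw device trying to interpret a file name.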
Tarl Neustaedter wrote:
There is an interpose missing in there...
Ah, yes. The scsidisk.fth does an open-package of "disk-label" (found in obp/pkg/boot/sunlabel.fth of the open-sourced openboot), which in turn does an interpose of "ufs-file-system". The interpose means that subsequent calls to open and any other functions get directed to the interposing package, which can only get to the scsidisk package by $call-parent.
So it was a hard-coded case within /packages/disk-label after all - that explains why I couldn't figure out what was going on. Sigh.
I added a quick bit of code in packages/sun-parts.c to detect this case, but was surprised to see that it wasn't getting triggered. Some further investigation shows that device strings of the form "disk,|foo|" are being rejected by the Forth in open-dev, i.e.
s" disk" open-dev u.          -> returns a valid ihandle
s" disk,|foo|" open-dev u.    -> returns false
I'm guessing that we should simply split on a ',' in the same way as we would with a ':' if it were present, i.e. pass the remainder of the string as the package argument?
ATB,
Mark.
On 2010-8-15 3:35 PM, Mark Cave-Ayland wrote:
[...] I added a quick bit of code in packages/sun-parts.c to detect this case, but was surprised to see that it wasn't getting triggered. Some further investigation shows that device strings of the form "disk,|foo|" are being rejected by the Forth in open-dev, i.e.
s" disk" open-dev u.          -> returns a valid ihandle
s" disk,|foo|" open-dev u.    -> returns false
I'm guessing that we should simply split on a ',' in the same way as we would with a ':' if it were present, i.e. pass the remainder of the string as the package argument?
Oh. I didn't notice the lack of colon.
No, you can't just split on a comma, that's a legitimate part of either the name or unit address. That should be:
disk@<unit>:|foo|dev
Both the nodename and unit address are allowed to have commas (e.g., "QLGC,isptwo" is the name of a SCSI adapter from Qlogic, and "@0,1" means target zero, unit 1).
The question is why are you missing both the unit address and the colon separator for the argument. The right answer is to make sure you're following the path resolution in section 4.3.1 in 1275.
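To make the comma point concrete, here is a small Python sketch of splitting one path component into node name, unit address and arguments. It is a simplification of the path resolution rules, not OpenBIOS code; `split_component` and the "@4" unit in the usage lines are hypothetical.

```python
def split_component(component):
    """Split 'name@unit:args' into (name, unit, args).

    Only '@' and ':' act as separators; commas are left alone because
    they may legitimately appear in the name or the unit address.
    Missing parts are returned as None.
    """
    head, colon, args = component.partition(":")
    name, at, unit = head.partition("@")
    return name, (unit if at else None), (args if colon else None)


print(split_component("disk@0,1:|platform|sun4u|ufsboot"))
# ('disk', '0,1', '|platform|sun4u|ufsboot')
print(split_component("QLGC,isptwo@4"))
# ('QLGC,isptwo', '4', None)
print(split_component("disk,|foo|"))
# ('disk,|foo|', None, None)
```

The last case shows why a string like "disk,|foo|" cannot resolve: with no colon present, the whole string is treated as a node name.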
Tarl Neustaedter wrote:
Oh. I didn't notice the lack of colon.
No, you can't just split on a comma, that's a legitimate part of either the name or unit address. That should be:
disk@<unit>:|foo|dev
Both the nodename and unit address are allowed to have commas (e.g., "QLGC,isptwo" is the name of a SCSI adapter from Qlogic, and "@0,1" means target zero, unit 1).
The question is why are you missing both the unit address and the colon separator for the argument. The right answer is to make sure you're following the path resolution in section 4.3.1 in 1275.
Actually I've now realised that we already have a workaround for this. For OpenSolaris debugging, I've normally been invoking boot like this:
boot cdrom -H
...which has always worked fine. However, the latest rework to the bootargs parameter code now explicitly gives a slice during boot and so it will actually work when not being debugged! Hence what I need to do is actually this:
boot cdrom:f -H
This now calls my new code and I can attach the debugger to the "open" method in ufs-file-system - but for some reason it crashes in the I/O code called from deblocker when "read" is invoked.
I also think there is something else strange going on with the emulation in Qemu. If I use my disk image for a debugging session, it works fine. However if I attempt to use a Solaris 9 install CD image with the same-size Fcode block then I get all sorts of strange Forth errors such as failing to find the "do-boot" word. This suggests there is something not right with the memory management or I/O somewhere :(
ATB,
Mark.
Tarl Neustaedter wrote:
On 2010-8-15 9:21 AM, Mark Cave-Ayland wrote:
[...] The part I'm missing is how Fcode calling the CIF "open" word can invoke the "open" word within the /packages/ufs-file-system package which should do the right thing - anyone have any bright ideas?
There is an interpose missing in there...
Ah, yes. The scsidisk.fth does an open-package of "disk-label" (found in obp/pkg/boot/sunlabel.fth of the open-sourced openboot), which in turn does an interpose of "ufs-file-system". The interpose means that subsequent calls to open and any other functions get directed to the interposing package, which can only get to the scsidisk package by $call-parent.
Right. I've just managed to fix the memory issue I found in the Solaris 9 Fcode (it was a bug in the SPARC64 CIF implementation) and added interpose behaviour similar to that described above. I'm now pleased to report that the Solaris 9 kernel from a Solaris 9 install CD starts to run, although it dies quite early on:
OpenBIOS for Sparc64
Configuration device id QEMU version 1 machine id 0
kernel cmdline
CPUs: 1 x SUNW,UltraSPARC-IIi
UUID: 00000000-0000-0000-0000-000000000000
Welcome to OpenBIOS v1.0 built on Aug 22 2010 21:23
Type 'help' for detailed information

0 > boot
Trying cdrom:f...
Not a bootable ELF image
Not a bootable a.out image

Loading FCode image...
Loaded 5936 bytes
entry point is 0x4000
open isn't unique.

Jumping to entry point 0000000000100000 for type 0000000000000001...
switching to new context: entry point 0x100000 stack 0x00000000ffe02b59
panic - boot: Cannot get list.
EXIT -1 >
Turning on the CIF debugging shows a couple of property lookups before this point, but nothing that gives much of a hint as to why the boot fails.
ATB,
Mark.
On 2010-8-22 5:37 PM, Mark Cave-Ayland wrote:
[...]
Jumping to entry point 0000000000100000 for type 0000000000000001...
switching to new context: entry point 0x100000 stack 0x00000000ffe02b59
panic - boot: Cannot get list.
Boy, that message *is* pretty obscure.
The list he's referring to is one of three lists - physical memory, virtual memory and free memory. He looks in the "memory" node for both "reg" and "available" properties, and in the "virtual-memory" node for the "available" property.
If he doesn't find any one of those three properties, he gets that error. Look for init_memlists() in usr/src/uts/psm/.... in your stashed Solaris source.
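The check described above can be mimicked with a short Python sketch. This is not the Solaris init_memlists() code; the dictionary-based device tree and the `missing_memlist_props` helper are assumptions made purely for illustration.

```python
# The three (node, property) pairs named above.
REQUIRED = [
    ("/memory", "reg"),                 # physical memory
    ("/memory", "available"),           # free physical memory
    ("/virtual-memory", "available"),   # free virtual memory
]


def missing_memlist_props(tree):
    """Return the required (node, property) pairs absent from tree.

    tree is a hypothetical mapping of node path -> {property: value}.
    """
    return [(node, prop) for node, prop in REQUIRED
            if prop not in tree.get(node, {})]


tree = {
    "/memory": {"reg": b"...", "available": b"..."},
    "/virtual-memory": {"translations": b"..."},
}
print(missing_memlist_props(tree))   # [('/virtual-memory', 'available')]
```

An empty result means all three lists could be built; anything else corresponds to the "Cannot get list" case.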
Tarl Neustaedter wrote:
Boy, that message *is* pretty obscure.
The list he's referring to is one of three lists - physical memory, virtual memory and free memory. He looks in the "memory" node for both "reg" and "available" properties, and in the "virtual-memory" node for the "available" property.
If he doesn't find any one of those three properties, he gets that error. Look for init_memlists() in usr/src/uts/psm/.... in your stashed Solaris source.
Oh, that's interesting. So this is what I currently see in OpenBIOS:
0 > cd /memory ok
0 > .properties
name "memory"
device_type "memory"
available -- 70 : 0 0 0 0 0 0 0 0 0 0 0 0 7 e6 80 0 0 0 0 0 7 e6 80 80 0 0 0 0 0 0 f 80 0 0 0 0 7 e6 f2 0 0 0 0 0 0 0 e 0 0 0 0 0 7 e7 0 80 0 0 0 0 0 0 f 80 0 0 0 0 7 e7 72 0 0 0 0 0 0 0 e 0 0 0 0 0 7 e7 80 80 0 0 0 0 0 0 f 80 0 0 0 0 7 e7 f2 0 0 0 0 0 0 0 e 0
reg -- 10 : 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0
ok
0 > cd /virtual-memory ok
0 > .properties
name "virtual-memory"
available -- 90 : 0 0 0 0 0 0 0 0 0 0 0 0 7 fe 80 0 0 0 0 0 7 fe 80 80 0 0 0 0 0 0 f 80 0 0 0 0 7 fe f2 0 0 0 0 0 0 0 e 0 0 0 0 0 7 ff 0 80 0 0 0 0 0 0 f 80 0 0 0 0 7 ff 72 0 0 0 0 0 0 0 e 0 0 0 0 0 7 ff 80 80 0 0 0 0 0 0 f 80 0 0 0 0 7 ff f2 0 0 0 0 0 f6 0 e 0 0 0 0 0 fe 80 0 0 0 0 0 0 1 50 0 0 0 0 0 0 ff f8 0 0 ff ff ff ff 0 7 ff ff
translations -- 210 : 0 0 0 0 0 0 0 0 0 0 0 0 7 fe 80 0 0 0 0 0 0 0 0 36 0 0 0 0 7 fe 80 0 0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 32 0 0 0 0 7 fe 90 0 0 0 0 0 0 0 20 0 0 0 0 0 0 0 0 32 0 0 0 0 7 fe b0 0 0 0 0 0 0 0 20 0 0 0 0 0 0 0 0 32 0 0 0 0 7 fe d0 0 0 0 0 0 0 0 20 0 0 0 0 0 0 0 0 32 0 0 0 0 7 fe f0 0 0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 32 0 0 0 0 7 ff 0 0 0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 32 0 0 0 0 7 ff 10 0 0 0 0 0 0 0 20 0 0 0 0 0 0 0 0 32 0 0 0 0 7 ff 30 0 0 0 0 0 0 0 20 0 0 0 0 0 0 0 0 32 0 0 0 0 7 ff 50 0 0 0 0 0 0 0 20 0 0 0 0 0 0 0 0 32 0 0 0 0 7 ff 70 0 0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 32 0 0 0 0 7 ff 80 0 0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 32 0 0 0 0 7 ff 90 0 0 0 0 0 0 0 20 0 0 0 0 0 0 0 0 32 0 0 0 0 7 ff b0 0 0 0 0 0 0 0 20 0 0 0 0 0 0 0 0 32 0 0 0 0 7 ff d0 0 0 0 0 0 0 0 20 0 0 0 0 0 0 0 0 32 0 0 0 0 7 ff f0 0 0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 32 0 0 0 0 fe 0 0 0 0 0 0 0 0 80 0 0 0 0 0 0 0 0 0 76 0 0 0 0 ff d0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 74 0 0 0 0 ff d8 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 74 0 0 0 0 ff e0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 76 0 0 0 0 ff e8 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 76 0 0 0 0 ff f0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 76
ok
0 > cd /chosen ok
0 > .properties
name "chosen"
stdin ffe88290
stdout ffe88430
memory ffe88838
mmu ffe880f8
display ffe87e30
keyboard ffe87fd0
bootargs ""
bootpath "cdrom:f"
ok
0 > show-devs
ffe1d720 /
ffe1d8f8 /aliases
ffe1da20 /openprom (BootROM)
ffe27358 /openprom/client-services
ffe1dcd8 /options
ffe1ddb8 /chosen
ffe1df58 /builtin
ffe1e080 /builtin/console
ffe26e20 /packages
ffe28bf8 /packages/cmdline
ffe28e48 /packages/disk-label
ffe2a9a8 /packages/deblocker
ffe2afc0 /packages/grubfs-files
ffe2b3d0 /packages/sun-parts
ffe2b7e8 /packages/elf-loader
ffe30dc0 /packages/ufs-file-system
ffe29a28 /memory@0,0 (memory)
ffe29b88 /virtual-memory
ffe2b948 /pci@1fe,0 (pci)
ffe2c1b0 /pci@1fe,0/pci@1 (pci)
ffe2c868 /pci@1fe,0/pci@1,1 (pci)
ffe2cef8 /pci@1fe,0/QEMU,VGA@2 (display)
ffe2d6c0 /pci@1fe,0/ebus@3
ffe2ddc0 /pci@1fe,0/ebus@3/fdthree@0 (block)
ffe2e308 /pci@1fe,0/ebus@3/su@0,1 (serial)
ffe2e608 /pci@1fe,0/ebus@3/kb_ps2@0 (serial)
ffe2e938 /pci@1fe,0/NE2000@4 (network)
ffe2ef20 /pci@1fe,0/pci-ata@5 (pci-ide)
ffe2f518 /pci@1fe,0/pci-ata@5/ide0@500 (ide)
ffe2f7b0 /pci@1fe,0/pci-ata@5/ide1@600 (ide)
ffe2fa48 /pci@1fe,0/pci-ata@5/ide1@600/cdrom@0 (block)
ffe300d0 /SUNW,UltraSPARC-IIi (cpu)
ffe307b8 /SUNW,UltraSPARC-IIi/mmu
ok
0 > ffe88838 ihandle>phandle u. ffe29a28 ok
0 > ffe880f8 ihandle>phandle u. ffe307b8 ok
0 > cd /SUNW,UltraSPARC-IIi/mmu ok
0 > .properties
name "mmu"
ok
0 >
So the /memory and /virtual-memory nodes look correct - but the "mmu" property in /chosen is set to point to "/SUNW,UltraSPARC-IIi/mmu" which doesn't actually have any properties set!
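For anyone wanting to sanity-check the raw byte dumps, here is a small Python sketch that decodes a .properties dump into (address, size) pairs. It assumes each entry is a big-endian 64-bit address followed by a 64-bit size, which matches the 16-byte "reg" value shown for /memory; `decode_pairs` is a hypothetical helper, not part of OpenBIOS.

```python
def decode_pairs(dump):
    """Decode space-separated hex bytes into (address, size) tuples.

    Assumes 8-byte big-endian address + 8-byte big-endian size per entry.
    """
    data = bytes(int(tok, 16) for tok in dump.split())
    if len(data) % 16:
        raise ValueError("dump length is not a multiple of 16 bytes")
    return [
        (int.from_bytes(data[i:i + 8], "big"),
         int.from_bytes(data[i + 8:i + 16], "big"))
        for i in range(0, len(data), 16)
    ]


reg = decode_pairs("0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0")
print(reg)   # [(0, 134217728)]
```

Under that assumption the "reg" property describes a single 128 MiB bank of physical memory starting at address 0.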
Does anyone know what this second CPU MMU node is used for? Should the /chosen "mmu" property point to /virtual-memory instead? Or should this node be an exact copy of /virtual-memory?
ATB,
Mark.
Mark Cave-Ayland wrote:
Does anyone know what this second CPU MMU node is used for? Should the /chosen "mmu" property point to /virtual-memory instead? Or should this node be an exact copy of /virtual-memory?
So I moved all of the MMU methods under /virtual-memory and changed the /chosen "mmu" property to point there, and now both Milax OpenSolaris and Solaris 9 get further:
Solaris 9:
finddevice("cdrom:f") = 0xffe2f9e0
getproplen(0xffe2f9e0, "device_type") = 0x00000006
getproplen(0xffe2f9e0, "device_type") = 0x00000006
getprop(0xffe2f9e0, "device_type", 0x00133878, 6) = service getprop: possible argument error (0 1)
6
0x00133878 62 6c 6f 63 6b 00 __ __ __ __ __ __ __ __ __ __ block.
open("cdrom:f") = 0x00000000
getproplen(0xffe1dd98, "stdout") = 0x00000004
getproplen(0xffe1dd98, "stdout") = 0x00000004
getprop(0xffe1dd98, "stdout", 0x001376f8, 4) = service getprop: possible argument error (0 1)
4
0x001376f8 ff e8 83 b0 __ __ __ __ __ __ __ __ __ __ __ __ ....
Cannot open cdrom:f
panic - boot: ufsboot: cannot determine filesystem type of root device.
exit()
EXIT -1 >
- Looks like the CIF open of "cdrom:f" fails which is strange because it works fine from the Forth prompt.
Milax:
interpret 'h# 70000000000 constant kmem64-base
h# 70000252000 constant kmem64-end
h# ffffffffffc00000 constant kmem64-pagemask
h# e0000000064016b6 constant kmem64-template
: kmem64-tte ( addr cnum -- false | tte-data true )
   if ( addr )
      drop false exit then ( false )
   dup kmem64-base kmem64-end within if ( addr )
      kmem64-pagemask and ( addr' )
      kmem64-base - ( addr' )
      kmem64-template +
handle_calls return: Unhandled Exception 0x0000070000242000
PC = 0x000000000101eb3c NPC = 0x000000000101eb40
Stopping execution
- Now this is very interesting. Looks like we are missing a Forth pgmap@ function which according to the OF spec does this:
pgmap@ ( virt -- pme )
The page map entry pme corresponds to the virtual address virt.
I think that this requires some kind of Forth wrapper around one of the OFMEM functions to pull out a TTE?
ATB,
Mark.