QEMU mac99 emulates two of the three PCI buses found on a real PowerMac3,1, but OpenBIOS currently handles only a single PCI bus: it initialises, and puts into the device tree, only the second PCI bus (which is where devices are connected). However, some clients (e.g. MorphOS) may have hardcoded assumptions and, if the first bus is missing from the device tree, erroneously use the address of the first bus to access PCI config registers for devices on the second bus. This fails silently: the requests go to the other, empty emulated bus and return invalid values because the devices they address are not present there.
As a result, devices mapped via MMIO still appear to work but may not be correctly initialised, and some cards are not detected. One such case might be enabling the bus master bit for network cards, which the OS should do but for which OpenBIOS now has a workaround. Once both PCI buses appear in the device tree, those workarounds may no longer be needed.
Until proper support for multiple PCI buses is implemented, add an empty node in the device tree for the first bus on QEMU mac99 to let OSes know about it. This fixes detection of PCI devices (such as USB) under MorphOS and allows it to boot.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
---
 arch/ppc/qemu/init.c | 55 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/arch/ppc/qemu/init.c b/arch/ppc/qemu/init.c
index 45cd77e..b1c2197 100644
--- a/arch/ppc/qemu/init.c
+++ b/arch/ppc/qemu/init.c
@@ -716,6 +716,59 @@ static void kvm_of_init(void)
     fword("finish-device");
 }
 
+static void encode_int_plus(int n, ...)
+{
+    int i;
+    ucell v;
+    va_list ap;
+
+    va_start(ap, n);
+    for (i = 0; i < n; i++) {
+        v = va_arg(ap, ucell);
+        PUSH(v);
+        fword("encode-int");
+        if (i > 0) {
+            fword("encode+");
+        }
+    }
+    va_end(ap);
+}
+
+static void empty_pci_bus_init(void)
+{
+    if (machine_id == ARCH_MAC99) {
+        fword("new-device");
+        push_str("pci");
+        fword("device-name");
+        push_str("pci");
+        fword("device-type");
+        encode_int_plus(2, 0xf0000000, 0x02000000);
+        push_str("reg");
+        fword("property");
+        PUSH(3);
+        fword("encode-int");
+        push_str("#address-cells");
+        fword("property");
+        PUSH(2);
+        fword("encode-int");
+        push_str("#size-cells");
+        fword("property");
+        PUSH(1);
+        fword("encode-int");
+        push_str("#interrupt-cells");
+        fword("property");
+        encode_int_plus(12,
+            0x01000000, 0, 0, 0xf0000000, 0, 0x00800000,
+            0x02000000, 0, 0x90000000, 0x90000000, 0, 0x10000000);
+        push_str("ranges");
+        fword("property");
+        encode_int_plus(2, 0, 0);
+        push_str("bus-range");
+        fword("property");
+        fword("finish-device");
+    }
+}
+
 /*
  * filll ( addr bytes quad -- )
  */
@@ -868,6 +921,8 @@ arch_of_init(void)
     case ARCH_MAC99_U3:
         /* The NewWorld NVRAM is not located in the MacIO device */
         macio_nvram_init("/", 0);
+        /* We only handle 1 PCI bus but MorphOS needs info for both to boot */
+        empty_pci_bus_init();
         ob_pci_init();
         ob_unin_init();
         break;
On 16/10/2021 17:17, BALATON Zoltan wrote:
> [...]
I've run through the CDROM boot tests and unfortunately this version causes the NetBSD kernel to be unable to locate its boot disk on my NetBSD 8.0 test ISO:
$ ./qemu-system-ppc -m 512 -bios openbios-qemu.elf.nostrip -M mac99,via=pmu -cdrom NetBSD-8.0-macppc.iso -boot d -prom-env 'boot-device=cd:,\ofwboot.xcf' -nographic
...
...
atapibus0 at atabus1: 2 targets
cd0 at atapibus0 drive 0: <QEMU DVD-ROM, QM00003, 2.5+> cdrom removable
uhidev1 at uhub0 port 2 configuration 1 interface 0
uhidev1: QEMU (0x627) QEMU USB Mouse (0x01), rev 2.00/0.00, addr 3, iclass 3/1
uhid at uhidev1 not configured
WARNING: 2 errors while detecting hardware; check system log.
boot device: <unknown>
root on md0a dumps on md0b
root file system type: ffs
kern.module.path=/stand/macppc/8.0/modules
/: bad dir ino 10 at offset 0: null entry
exec /sbin/init: error 2
init: trying /sbin/oinit
/: bad dir ino 10 at offset 0: null entry
exec /sbin/oinit: error 2
init: trying /sbin/init.bak
/: bad dir ino 10 at offset 0: null entry
exec /sbin/init.bak: error 2
init: trying /rescue/init
exec /rescue/init: error 2
init path (default /sbin/init):
Removing -bios allows the CDROM to boot to the terminal prompt as normal:
...
...
atapibus0 at atabus1: 2 targets
cd0 at atapibus0 drive 0: <QEMU DVD-ROM, QM00003, 2.5+> cdrom removable
uhidev1 at uhub0 port 2 configuration 1 interface 0
uhidev1: QEMU (0x627) QEMU USB Mouse (0x01), rev 2.00/0.00, addr 3, iclass 3/1
uhid at uhidev1 not configured
WARNING: 2 errors while detecting hardware; check system log.
boot device: <unknown>
root on md0a dumps on md0b
root file system type: ffs
kern.module.path=/stand/macppc/8.0/modules
erase ^H, werase ^W, kill ^U, intr ^C, status ^T
Terminal type? [vt100]
ATB,
Mark.
On Sat, 23 Oct 2021, Mark Cave-Ayland wrote:
> [...]
I could reproduce this but have no idea what is happening. It looks like it cannot read from the CD, so I suspected some problem with DMA, but I could not find any traces that show a difference. I've tried pci, unin, macio and ide but these don't seem to show any change. I'm not sure where to look for further clues; I could not find where the NetBSD sources read the OF device tree or how they use it. Do you have any idea what to look for and where?
Regards, BALATON Zoltan
On 23/10/2021 21:08, BALATON Zoltan wrote:
> [...]
Unfortunately I'm not too familiar with NetBSD sources but clearly the change is affecting something in the way it configures the hardware. Perhaps try asking over on one of the NetBSD mailing lists such as port-macppc?
ATB,
Mark.
On Sun, 24 Oct 2021, Mark Cave-Ayland wrote:
> [...]
I've found in the NetBSD man pages (https://man.netbsd.org/boot.8) that additional debug messages can be enabled with boot -x kernel, which I think is what I need to get more info on what's happening, but I haven't yet figured out how to pass options to the kernel or how to get an interactive boot prompt from the boot loader to add this -x option.
Regards, BALATON Zoltan
On Sun, 24 Oct 2021, BALATON Zoltan wrote:
> [...]
There are some more man pages: https://man.netbsd.org/macppc/boot.8 https://man.netbsd.org/macppc/ofwboot.8
I'll try to get some more detailed logs.
Regards, BALATON Zoltan
On Sun, 24 Oct 2021, BALATON Zoltan wrote:
> [...]
I could not get more verbose messages and ultimately could not figure out why this fails, but this is what I've found. The current NetBSD 9.2 version is not affected and boots with or without the patch, so that's good news. Previous versions mostly don't boot even before the patch (I wonder if these were tested on real hardware and whether they boot there at all). The two versions that booted before this patch (8.0 and 9.0) fail with different problems: 9.0 fails to find openpic, and for 8.0 I have no idea at all.

I still don't understand what these NetBSD versions might need in the added pci node, but I've found that MorphOS does not actually care what's in the node for the first bus or even what it's called. It just seems to look for a node with device-type pci before the actual pci bus with the devices, and probably just discards the first node it finds. So adding just a dummy node with pci device-type _before_ the /pci node makes MorphOS boot, and I hope this won't break any other guests, so this would be an acceptable workaround until proper support for multiple PCI buses is implemented. I've sent a new patch with this; please test and let me know if there are any guests in your test images that still don't like this.
Regards, BALATON Zoltan
On 24/10/2021 21:02, BALATON Zoltan wrote:
> [...]
This means we've gone full circle, no? When you pinged this thread recently, I said I didn't mind merging a patch that added the correct /pci node from a real machine. The latest version of your patch goes back to adding a /dummy-pci node which is moving us away from where we want to be: the aim should be to move towards a real device tree, rather than adding synthetic nodes to work around bugs in various OSs.
The advantage of this approach is that once you get the new /pci node correct then everything will just work, since the device tree is the same both on real hardware and in QEMU.
What was the response from the NetBSD port-macppc mailing list? The NetBSD guys use QEMU a lot as part of their CI/release engineering process and are generally very responsive and knowledgeable: at the very minimum they will be able to point you directly towards the code in question.
ATB,
Mark.
On Tue, 26 Oct 2021, Mark Cave-Ayland wrote:
> [...]
This means we've gone full circle, no? When you pinged this thread recently, I said I didn't mind merging a patch that added the correct /pci node from a real machine. The latest version of your patch goes back to adding a /dummy-pci node which is moving us away from where we want to be: the aim should be to move towards a real device tree, rather than adding synthetic nodes to work around bugs in various OSs.
My aim is to let the most guests boot, and adding the /pci node broke two NetBSD versions while not adding any node breaks MorphOS. This patch allows all three to boot and it's the least likely to break anything else. I was trying to fix the /pci node but I could not find out what NetBSD needs, so I went for solving the actual problem instead, which is to let all three of these boot, and for that this is the simplest solution.
The advantage of this approach is that once you get the new /pci node correct then everything will just work, since the device tree is the same both on real hardware and in QEMU.
That's the theory, but mac99 does not emulate the real machine 100%, so there could be other differences that keep an identical device tree from working. For one, we don't emulate the pci-bridge (QEMU could do it but OpenBIOS cannot deal with that so it's disabled; this already makes the current /pci node different). We also don't really emulate the AGP port and just have a PCI bus instead, so maybe this /pci node I'm trying to add can't be the same either. It would take fixing QEMU to emulate everything faithfully and making OpenBIOS do everything that OpenFirmware does to arrive at an identical state, which is unreasonable. That won't happen in the week left until the freeze, and it hasn't happened in all the years I've been trying to fix this MorphOS problem.
What was the response from the NetBSD port-macppc mailing list? The NetBSD guys use QEMU a lot as part of their CI/release engineering process and are generally very responsive and knowledgeable: at the very minimum they will be able to point you directly towards the code in question.
I don't know. Are you subscribed to that list or know their IRC channel you could ask them faster? Please cc me the answer. I've tried my best to find this out but I don't have more time to spend with this now.
Regards, BALATON Zoltan
On Tue, 26 Oct 2021, BALATON Zoltan wrote:
I don't know. Are you subscribed to that list or know their IRC channel you could ask them faster? Please cc me the answer. I've tried my best to find this out but I don't have more time to spend with this now.
What will be done about this so that QEMU 6.2 won't be released in a state that MorphOS can't boot on mac99 any more? Looks like NetBSD list can't help finding the problem and I'm not able to debug it. (I think this may be an existing bug only exposed by this patch not caused by it based on the response we got so far. It looks like there's some overlap in memory usage for some reason which causes parts to be overwritten and adding another /pci node probably causes more stuff to be overwritten so it breaks but otherwise it probably just overwrites less critical stuff and still boots by chance. This is likely since most versions don't boot either before or after this patch.)
I think we have two options for the next QEMU release given that we only have one week left for bug fixes:
1. Take the original v8 patch adding the pci node with actual pci bus info and accept this breaks NetBSD 8.0 and 9.0 (but 9.2 still works).
or
2. Take the dummy-pci patch that does not break anything but fixes the MorphOS boot. (This may be a hack but it could be cleaned up later and does the job for now.)
What's your decision on this?
Regards, BALATON Zoltan
On 02/11/2021 15:39, BALATON Zoltan wrote:
What will be done about this so that QEMU 6.2 won't be released in a state that MorphOS can't boot on mac99 any more? Looks like NetBSD list can't help finding the problem and I'm not able to debug it. (I think this may be an existing bug only exposed by this patch not caused by it based on the response we got so far. It looks like there's some overlap in memory usage for some reason which causes parts to be overwritten and adding another /pci node probably causes more stuff to be overwritten so it breaks but otherwise it probably just overwrites less critical stuff and still boots by chance. This is likely since most versions don't boot either before or after this patch.)
I think we have two options for the next QEMU release given that we only have one week left for bug fixes:
1. Take the original v8 patch adding the pci node with actual pci bus info and accept this breaks NetBSD 8.0 and 9.0 (but 9.2 still works).
or
2. Take the dummy-pci patch that does not break anything but fixes the MorphOS boot. (This may be a hack but it could be cleaned up later and does the job for now.)
What's your decision on this?
Firstly option 1 is not a viable choice because it would break booting NetBSD 8.0/9.0 for people that have existing disk images. But then option 2 is not suitable in that it moves the DT further away from that of a real Mac: the aim should be to move the emulation closer to real life rather than further from it. I also would not agree with your statement that the patch doesn't break anything, since the fact that regressions caused by earlier versions of this patch have only been caught by my pre-merge tests highlights the problem of testing coverage. Introducing a bogus DT node is a highly visible change to introduce to the guest, and propagating such an entry into OSs and device trees is at the minimum going to cause confusion and at worst cause more regressions.
I had a look at the port-macppc archives and I saw the discussion around possible corruption but there didn't seem to be any further analysis or conclusive outcome. I spent some time this morning burning the NetBSD 8.0 ISO to a CD and booting it on my G4 Mac Mini and it booted into userspace fine. On this basis we know that a DT with multiple pci nodes boots fine with that QEMU version, so any bugs must be either in the extra /pci node contents or how the PCI bus is mapped for the mac99 machine.
For me the obvious answer is to use QEMU to determine why boot fails and add a standard /pci@f0000000 node as per your earlier patch that will then be guaranteed to work across all OSs, even ones that do not have existing test coverage, as we know this is what works on real hardware.
Attached is the dmesg from booting NetBSD 8.0 on my Mac Mini, along with a copy of the DTs for an iMac 99 and G4 AGP which are the closest DTs to the QEMU mac99 machine which I hope will be of help. I do understand how getting patches merged is always frustrating - it is something that happens to all of us - but once the issue with the /pci node has been fixed so that it is the same as a real Mac, compatibility becomes something we can simply forget about.
ATB,
Mark.
On Sat, 6 Nov 2021, Mark Cave-Ayland wrote:
Firstly option 1 is not a viable choice because it would break booting NetBSD 8.0/9.0 for people that have existing disk images. But then option 2 is not
I think this breakage is not caused by this patch but only exposed by it. Therefore it's also not possible to fix it in this patch so don't blame this patch but look for the problem elsewhere and fix it there. This should be no reason to withhold this patch but if it is I've provided an alternative with the dummy-pci that did not break this case but still fixed the problem I'd like to fix.
suitable in that it moves the DT further away from that of a real Mac: the aim should be to move the emulation closer to real life rather than further from it.
OK but then is there an option that is viable and possible to do for QEMU 6.2 release?
I also would not agree with your statement that the patch doesn't break anything, since the fact that regressions caused by earlier versions of this patch have only been caught by my pre-merge tests highlights the problem of testing coverage.
It's very unlikely to break anything but I can't be sure for at least two reasons: first, you did not publish what images you test with, so it's impossible for me to make sure your tests will pass; second, even if they were public I don't have the time to do all this testing. You've volunteered to do that as a maintainer so please do your job and test the dummy-pci patch and only say it will break anything if you can show what. Don't discard it based on a belief that it might break something. That argument is true for any and all patches.
Introducing a bogus DT node is a highly visible change to introduce to the guest, and propagating such an entry into OSs and device trees is at the minimum going to cause confusion and at worst cause more regressions.
Confusion for whom and why should it cause regressions? I don't believe this.
I had a look at the port-macppc archives and I saw the discussion around possible corruption but there didn't seem to be any further analysis or conclusive outcome. I spent some time this morning burning the NetBSD 8.0 ISO to a CD and booting it on my G4 Mac Mini and it booted into userspace fine. On this basis we know that a DT with multiple pci nodes boots fine with that QEMU version, so any bugs must be either in the extra /pci node contents or how the PCI bus is mapped for the mac99 machine.
This test is not sufficient to prove this patch causes the problem. Do another test: take NetBSD 8.1 or 9.1, which currently does not boot on QEMU master, and verify that it boots on your Mac mini. If it does then I think that proves there's a bug elsewhere which just happens not to be serious enough without this patch for some NetBSD versions to fail, but it's nevertheless there and this patch only exposes that. I see no way this patch could cause overwriting memory in itself unless there's some other problem existing without this patch already. That also means no amount of changes to this patch will fix that problem.
For me the obvious answer is to use QEMU to determine why boot fails and add a standard /pci@f0000000 node as per your earlier patch that will then be guaranteed to work across all OSs, even ones that do not have existing test coverage, as we know this is what works on real hardware.
Attached is the dmesg from booting NetBSD 8.0 on my Mac Mini, along with a copy of the DTs for an iMac 99 and G4 AGP which are the closest DTs to the QEMU mac99 machine which I hope will be of help. I do understand how getting patches merged is always frustrating - it is something that happens to all of us - but once the issue with the /pci node has been fixed so that it is the same as a real Mac, compatibility becomes something we can simply forget about.
OK I've sent a v9 which completely recreates the /pci node matching the real hardware, adding the so far missing properties, but it makes no difference for NetBSD so it only proves your theory is wrong. Now you have 10 versions of this patch plus the dummy-pci one to choose from. All of these fix the MorphOS boot problem and you can decide how many other guests you want to break or keep working, but please take one of these and don't leave users who use MorphOS with mac99 with a broken machine in QEMU 6.2.
As a bonus I've also fixed compilation with gcc 10 but only because it was in the code I've contributed before. I'm at a point where it feels like wasting my time trying to contribute so I'm very close to giving up on it.
Regards, BALATON Zoltan
On 06/11/2021 17:26, BALATON Zoltan wrote:
OK I've sent a v9 which completely recreates the /pci node matching the real hardware, adding the so far missing properties, but it makes no difference for NetBSD so it only proves your theory is wrong. Now you have 10 versions of this patch plus the dummy-pci one to choose from. All of these fix the MorphOS boot problem and you can decide how many other guests you want to break or keep working, but please take one of these and don't leave users who use MorphOS with mac99 with a broken machine in QEMU 6.2.
I do understand your frustration as I have been in similar situations myself where a patch I have written has exposed a bug somewhere else, and indeed this tends to happen quite a bit in QEMU development (or at least on the architectures I've worked with). It's not ideal, but unfortunately it happens and a single patch can turn into several patches. However both adding a bogus DT node which is visible to the guest or breaking existing guests are not the right options here for the reasons I've already explained.
My role as maintainer is to help guide the development of OpenBIOS and manage testing/merging as required. It is not my responsibility to perform exhaustive testing of every patch submitted because the submitter either doesn't want to, or doesn't have time to do it. Did you know that a complete set of boot tests for Mac takes over an hour? I've already tested several versions of your patch, and even performed boot tests on real hardware for comparison. As you can see I have already invested a huge amount of time into this including pointing you towards the NetBSD people and testing on real hardware, and I am also willing to get involved in further discussion on port-macppc where I can help.
Finally whilst you may have a different opinion as to the role of a maintainer, the tone of language used is not appropriate: please reduce the accusatory nature of your comments when posting to the mailing list.
As a bonus I've also fixed compilation with gcc 10 but only because it was in the code I've contributed before. I'm at a point where it feels like wasting my time trying to contribute so I'm very close to giving up on it.
Thanks, I'll have a look - although since we are in freeze it's likely this will be deferred until after QEMU 6.2 is released.
ATB,
Mark.
On Sun, 7 Nov 2021, Mark Cave-Ayland wrote:
I do understand your frustration as I have been in similar situations myself where a patch I have written has exposed a bug somewhere else, and indeed this tends to happen quite a bit in QEMU development (or at least on the architectures I've worked with). It's not ideal, but unfortunately it happens and a single patch can turn into several patches. However both adding a bogus DT node which is visible to the guest or breaking existing guests are not the right options here for the reasons I've already explained.
This is beyond the case where getting in a patch also needs fixing some more surrounding things like code style or little improvements to clean up the code I touch. That's frustrating but I'm OK with that to a point where the additional work is within limits and did that in QEMU several times. E.g. I had pegasos2 emulation working a year earlier but did the additional work to clean up VT82c686 emulation so it could be added cleanly which delayed it by a year due to the limited time I had for it. But in this case we are talking about just a few lines patch adding a dummy pci node that would fix MorphOS boot and you haven't shown yet it would break anything but you can't accept this and demand instead that I should implement support for multiple PCI buses in OpenBIOS and also fix QEMU to precisely emulate the hardware or fix bugs totally unrelated to the issue I'm trying to fix with this patch. This is not reasonable.
My role as maintainer is to help guide the development of OpenBIOS and manage testing/merging as required. It is not my responsibility to perform exhaustive testing of every patch submitted because the submitter either doesn't want to, or doesn't have time to do it. Did you know that a complete set of boot tests for Mac takes over an hour? I've already tested several versions of your patch, and even performed boot tests on real hardware for comparison. As you can see I have already invested a huge amount of time into this including pointing you towards the NetBSD people and testing on real hardware, and I am also willing to get involved in further discussion on port-macppc where I can help.
So did I. I've made several versions of this patch (also testing it with the images I have or the ones you've shown had problems), trying to follow your advice while keeping within the limits I'm willing to invest in this, but none of the attempts were good enough for you. And this has been going on for 3 years. As a maintainer you should manage the project so those who want to use it get a working version, and allow those who want to contribute to make progress.
There are a few people I know are using MorphOS on QEMU (maybe 5, not a big number but probably more than those who want to use NetBSD 8.0 and 9.0) and they will be left with their setup broken with the next QEMU release because you're trying to preserve some old NetBSD versions booting, which maybe nobody uses and which only booted by chance so far. But there's also a simple fix to keep both these NetBSD versions working and fix MorphOS too, but that's not acceptable for you either. This makes it impossible to fix the problem and stops all progress, whereas if you took the simple fix with dummy pci that would solve the problem, not make the situation much worse than it is now, and allow a small step of progress by letting MorphOS work. This is not such a big step backwards that it's unacceptable, and it does not prevent cleaning it up later, so I don't know why you can't accept this as a solution. Unless you can show this dummy pci breaks some other guest in your tests, but I doubt it would.
This has escalated from a simple patch to fix a problem to a more complex patch adding the complete PCI node, but that exposed bugs already present and caused problems for other guests. So what would the next step be? I've spent about a month debugging the original problem with MorphOS, including following in a debugger why it could not talk to USB devices, because its developers did not help at all. Then I tried to find a solution to patch OpenBIOS to work around this problem, first from Forth then from C, which took another few weeks.
Then I made countless versions on your requests trying to meet your demands, which also took time. I've even fixed an issue with OpenSUSE that also took hours to get the image and test it, and also tried to find out what causes the problem with NetBSD, but at the end we're back where we were years ago: the dummy pci version is the one which causes the least problems and fixes the issue. So the solution is simple and has been available for years, yet it's impossible for users to get a working version without compiling their own patched version. Is that a good service as a maintainer?
Finally whilst you may have a different opinion as to the role of a maintainer, the tone of language used is not appropriate: please reduce the accusatory nature of your comments when posting to the mailing list.
I'm sorry but I'm really upset about this not being able to get over this problem for years and still don't see you can propose an alternative solution you'd accept just keep saying it's not acceptable for any patches I send trying to fix this. We have a problem booting MorphOS that I've debugged and identified that it could be fixed by adding a node to OpenBIOS one way or other and there are users out there who would be helped by this working in QEMU but I can't convince you that this problem is worth fixing. Why you can't accept a solution that fixes the problem and does not break other guests? Is it better for users to have clean code that does not work or a bit less clean but one that works? (And the OpenBIOS code is not that clean to begin with there are a lot of differences from real OpenFirmware so a bit more should not hurt as long as it works and the result is more guests booting.)
As a bonus I've also fixed compilation with gcc 10 but only because it was in the code I've contributed before. I'm at a point where it feels like I'm wasting my time trying to contribute so I'm very close to giving up on it.
Thanks, I'll have a look - although since we are in freeze it's likely this will be deferred until after QEMU 6.2 is released.
That's OK, however the MorphOS fix is something that can get in during the freeze and should be resolved. Please get one of the patches you're most happy with (or the least unhappy, I don't care) and get this solved at last.
Regards, BALATON Zoltan
On Sun, 7 Nov 2021, BALATON Zoltan wrote:
On Sun, 7 Nov 2021, Mark Cave-Ayland wrote:
On 06/11/2021 17:26, BALATON Zoltan wrote:
On Sat, 6 Nov 2021, Mark Cave-Ayland wrote:
On 02/11/2021 15:39, BALATON Zoltan wrote:
On Tue, 26 Oct 2021, BALATON Zoltan wrote:
On Tue, 26 Oct 2021, Mark Cave-Ayland wrote:
What was the response from the NetBSD port-macppc mailing list? The NetBSD guys use QEMU a lot as part of their CI/release engineering process and are generally very responsive and knowledgeable: at the very minimum they will be able to point you directly towards the code in question.
I don't know. Are you subscribed to that list or know their IRC channel you could ask them faster? Please cc me the answer. I've tried my best to find this out but I don't have more time to spend with this now.
What will be done about this so that QEMU 6.2 won't be released in a state that MorphOS can't boot on mac99 any more? Looks like NetBSD list can't help finding the problem and I'm not able to debug it. (I think this may be an existing bug only exposed by this patch not caused by it based on the response we got so far. It looks like there's some overlap in memory usage for some reason which causes parts to be overwritten and adding another /pci node probably causes more stuff to be overwritten so it breaks but otherwise it probably just overwrites less critical stuff and still boots by chance. This is likely since most versions don't boot either before or after this patch.)
I think we have two options for the next QEMU release given that we only have one week left for bug fixes:
- Take the original v8 patch adding the pci node with actual pci bus info and accept this breaks NetBSD 8.0 and 9.0 (but 9.2 still works).
or
- Take the dummy-pci patch that does not break anything but fixes the MorphOS boot. (This may be a hack but it could be cleaned up later and does the job for now.)
What's your decision on this?
Firstly option 1 is not a viable choice because it would break booting NetBSD 8.0/9.0 for people that have existing disk images. But then option 2 is not
I think this breakage is not caused by this patch but only exposed by it. Therefore it's also not possible to fix it in this patch so don't blame this patch but look for the problem elsewhere and fix it there. This should be no reason to withhold this patch but if it is I've provided an alternative with the dummy-pci that did not break this case but still fixed the problem I'd like to fix.
suitable in that it moves the DT further away from that of a real Mac: the aim should be to move the emulation closer to real life rather than further from it.
OK but then is there an option that is viable and possible to do for QEMU 6.2 release?
I also would not agree with your statement that the patch doesn't break anything, since the fact that regressions caused by earlier versions of this patch have only been caught by my pre-merge tests highlights the problem of testing coverage.
It's very unlikely to break anything but I can't be sure for at least two reasons: you did not publish what images you test with so it's impossible for me to make sure your tests will pass but secondly even if it was public I don't have the time to do all this testing. You've volunteered to do that as a maintainer so please do your job and test the dummy-pci patch and only say it will break anything if you can show what. Don't discard it based on belief it might break something. That argument is true for any and all patches.
Introducing a bogus DT node is a highly visible change to introduce to the guest, and propagating such an entry into OSs and device trees is at the minimum going to cause confusion and at worst cause more regressions.
Confusion for whom and why should it cause regressions? I don't believe this.
I had a look at the port-macppc archives and I saw the discussion around possible corruption but there didn't seem to be any further analysis or conclusive outcome. I spent some time this morning burning the NetBSD 8.0 ISO to a CD and booting it on my G4 Mac Mini and it booted into userspace fine. On this basis we know that a DT with multiple pci nodes boots fine with that QEMU version, so any bugs must be either in the extra /pci node contents or how the PCI bus is mapped for the mac99 machine.
This test is not sufficient to prove this patch causes the problem. Do another test: take NetBSD 8.1 or 9.1 that currently does not boot on QEMU master and verify that it boots on your Mac mini. If it does then I think that proves there's a bug elsewhere which just happens to be not serious enough without this patch for some NetBSD versions to fail, but it's nevertheless there and this patch only exposes it. I see no way this patch could cause overwriting memory in itself unless there's some other problem existing without this patch already. That also means no amount of changes to this patch will fix that problem.
For me the obvious answer is to use QEMU to determine why boot fails and add a standard /pci@f0000000 node as per your earlier patch that will then be guaranteed to work across all OSs, even ones that do not have existing test coverage, as we know this is what works on real hardware.
Attached is the dmesg from booting NetBSD 8.0 on my Mac Mini, along with a copy of the DTs for an iMac 99 and G4 AGP which are the closest DTs to the QEMU mac99 machine which I hope will be of help. I do understand how getting patches merged is always frustrating - it is something that happens to all of us - but once the issue with the /pci node has been fixed so that it is the same as a real Mac, compatibility becomes something we can simply forget about.
OK I've sent a v9 which completely recreates the /pci node matching the real hardware, adding the so far missing properties, but it makes no difference for NetBSD so it only proves your theory is wrong. Now you have 10 versions of this patch plus the dummy-pci one to choose from. All of these fix the MorphOS boot problem and you can decide how many other guests you want to break or keep working, but please take one of these and don't leave users who use MorphOS with mac99 with a broken machine in QEMU 6.2.
I do understand your frustration as I have been in similar situations myself where a patch I have written has exposed a bug somewhere else, and indeed this tends to happen quite a bit in QEMU development (or at least on the architectures I've worked with). It's not ideal, but unfortunately it happens and a single patch can turn into several patches. However both adding a bogus DT node which is visible to the guest or breaking existing guests are not the right options here for the reasons I've already explained.
This is beyond the case where getting in a patch also needs fixing some more surrounding things like code style or little improvements to clean up the code I touch. That's frustrating but I'm OK with that to a point where the additional work is within limits and did that in QEMU several times. E.g. I had pegasos2 emulation working a year earlier but did the additional work to clean up VT82c686 emulation so it could be added cleanly which delayed it by a year due to the limited time I had for it. But in this case we are talking about just a few lines patch adding a dummy pci node that would fix MorphOS boot and you haven't shown yet it would break anything but you can't accept this and demand instead that I should implement support for multiple PCI buses in OpenBIOS and also fix QEMU to precisely emulate the hardware or fix bugs totally unrelated to the issue I'm trying to fix with this patch. This is not reasonable.
My role as maintainer is to help guide the development of OpenBIOS and manage testing/merging as required. It is not my responsibility to perform exhaustive testing of every patch submitted because the submitter either doesn't want to, or doesn't have time to do it. Did you know that a complete set of boot tests for Mac takes over an hour? I've already tested several versions of your patch, and even performed boot tests on real hardware for comparison. As you can see I have already invested a huge amount of time into this including pointing you towards the NetBSD people and testing on real hardware, and I am also willing to get involved in further discussion on port-macppc where I can help.
I'd like to add that you seem to view this as some kind of theoretical question and reject it on philosophical grounds, but my approach is solely practical: there's a problem with booting MorphOS that could be easily fixed, yet it's impossible because of your view on this, and because of that I've spent much more time on this problem than was necessary. It's nothing against you personally and it's not about who's right or wrong. It's just that not fixing this problem makes the life of some people who want to continue using mac99 to boot MorphOS harder for no good reason. This is what I was trying to explain, and sorry if I've offended you. Try to view this from a more practical viewpoint, and if the currently available full pci node or dummy pci node solutions are not acceptable then what practical solution do you have?
Regards, BALATON Zoltan
On Mon, 8 Nov 2021, BALATON Zoltan wrote:
I'd like to add that you seem to view this as some kind of theoretical question and reject it on philosophical grounds, but my approach is solely practical: there's a problem with booting MorphOS that could be easily fixed, yet it's impossible because of your view on this, and because of that I've spent much more time on this problem than was necessary. It's nothing against you personally and it's not about who's right or wrong. It's just that not fixing this problem makes the life of some people who want to continue using mac99 to boot MorphOS harder for no good reason. This is what I was trying to explain, and sorry if I've offended you. Try to view this from a more practical viewpoint, and if the currently available full pci node or dummy pci node solutions are not acceptable then what practical solution do you have?
Please don't ignore this problem, it won't go away by itself. Since you haven't demonstrated so far that the dummy-pci patch breaks anything I assume it either works or you haven't tested it at all. If the latter, then why are you so reluctant to even test it and consider it as a fix until a more complete fix you're happy with is available sometime in the future (that has not happened in three years so it's not something I expect before the next QEMU release)? We still have a chance to fix this for QEMU 6.2 but if you don't do anything about it, it will stop working for those already using MorphOS with mac99. Do you plan to do anything about this? Or at least please give an explanation why it can't be fixed for 6.2 if you decided it won't be fixed.
Regards, BALATON Zoltan
On 13/11/2021 18:56, BALATON Zoltan wrote:
Please don't ignore this problem, it won't go away by itself. Since you haven't demonstrated so far that the dummy-pci patch breaks anything I assume it either works or you haven't tested it at all. If the latter, then why are you so reluctant to even test it and consider it as a fix until a more complete fix you're happy with is available sometime in the future (that has not happened in three years so it's not something I expect before the next QEMU release)? We still have a chance to fix this for QEMU 6.2 but if you don't do anything about it, it will stop working for those already using MorphOS with mac99. Do you plan to do anything about this? Or at least please give an explanation why it can't be fixed for 6.2 if you decided it won't be fixed.
Regards, BALATON Zoltan
I'm not ignoring this problem at all: for the reasons previously mentioned in this thread I am not accepting the dummy-pci patch, and so haven't spent any time testing it. I don't see how repeating myself in this reply once again will make any difference. I also stand by my offer to provide help/advice on creating the real /pci node if you are willing to investigate the NetBSD issue.
Given that MorphOS has never worked with vanilla OpenBIOS then there is no regression here to warrant breaking the soft freeze for QEMU 6.2.
ATB,
Mark.
On Sun, 14 Nov 2021, Mark Cave-Ayland wrote:
I'm not ignoring this problem at all: for the reasons previously mentioned in this thread I am not accepting the dummy-pci patch, and so haven't spent any time testing it. I don't see how repeating myself in this reply once again
It's a pity because I think this would be the easiest and simplest fix that would both solve the problem for now and keep everything else working that worked before, until OpenBIOS gains support for multiple PCI buses or the NetBSD folks provide some help.
will make any difference. I also stand by my offer to provide help/advice on creating the real /pci node if you are willing to investigate the NetBSD issue.
I've tried and provided patches with the full pci node, in several versions with different amounts of info, including the last version that adds everything the real hardware has. I also spent about a weekend on the NetBSD issue, even writing to their mailing list. But since this seems to be a pre-existing problem that is only exposed by the patch, not caused by it, and I don't have much free time, I don't want to spend more time on this only for you to then come up with another guest that has some problems, leaving us back where we are now. Did you test whether the versions of NetBSD that don't boot on QEMU now do boot on real hardware? I think that would prove the issue is there regardless of the pci patch. The patch probably just causes more things to be overwritten, which causes breakage, whereas before only non-essential stuff got broken by chance so it still booted.
Given that MorphOS has never worked with vanilla OpenBIOS then there is no regression here to warrant breaking the soft freeze for QEMU 6.2.
It only never worked because you refused to accept any of the patches I've sent so far; otherwise it would have worked well for 3 years. Also, fixing a bug is not breaking the freeze; the freeze is exactly for fixing known or discovered problems so the release won't have them. The regression is that, due to the serial reset changes, the patched OpenBIOS I've provided until this could be upstreamed (and which is still used by users) has also stopped working now. I could publish another patched version but that would create confusion because it may not work with older QEMU versions, and a lot of users just use whatever version comes with their binary distribution and may not know which patched OpenBIOS version to use, so they just give up if it does not work at first. Therefore I don't plan to provide another patched OpenBIOS. Either this can be fixed upstream now or people will have to stop using mac99 and migrate to pegasos2 instead, which is slightly inconvenient because they may need to replace their boot partition and boot file, but pegasos2 can boot the MorphOS kernel directly with -kernel and no firmware so it may be bearable. But it would still be nicer if people who have working mac99 setups were able to continue using them with QEMU 6.2, and a simple fix is available for that (or one that meets your standards but happens to bring out an issue with some old NetBSD versions), so you could choose from not one but two solutions even if neither of them is perfect.
In my opinion a more perfect solution would be to use real OpenFirmware instead, which would really fix all compatibility problems. If anything I would be willing to put effort into that, otherwise it's just wasted time for me. Why reimplement OpenFirmware now that it's available as open source? We would only need to write the drivers for the mac hardware for it, some of which may be available in the Sun version which is also open source (like escc and sungem), and I've done some work before to pass the device tree from QEMU to OpenFirmware like SLOF does, so maybe it's not that far away to make it work for Mac machines too. That would really fix all problems, including passing through graphics cards, which with OpenBIOS also needs patches currently.
Regards, BALATON Zoltan
Hi!
On Sun, Nov 14, 2021 at 02:17:23PM +0100, BALATON Zoltan wrote:
In my opinion a more perfect solution would be to use real OpenFirmware instead, which would really fix all compatibility problems. If anything I would be willing to put effort into that, otherwise it's just wasted time for me. Why reimplement OpenFirmware now that it's available as open source? We would only need to write the drivers for the mac hardware for it, some of which may be available in the Sun version which is also open source (like escc and sungem)
Fwiw, this will not be compatible to Apple OF at all. There are very many things, small and large and huge, where Apple OF deviates from the 1275 standard. Also all OSes that run on Macs have their own special little (or huge) requirements as well.
Note that the Apple 8530 and GEM are incompatible to the original as well, in some crucial ways. Nothing that cannot be overcome, but don't be surprised if things do not work out of the box.
Good luck and have fun,
Segher
On Sun, 14 Nov 2021, Segher Boessenkool wrote:
On Sun, Nov 14, 2021 at 02:17:23PM +0100, BALATON Zoltan wrote:
In my opinion a more perfect solution would be to use real OpenFirmware instead, which would really fix all compatibility problems. If anything I would be willing to put effort into that, otherwise it's just wasted time for me. Why reimplement OpenFirmware now that it's available as open source? We would only need to write the drivers for the mac hardware for it, some of which may be available in the Sun version which is also open source (like escc and sungem)
Fwiw, this will not be compatible to Apple OF at all. There are very many things, small and large and huge, where Apple OF deviates from the 1275 standard. Also all OSes that run on Macs have their own special little (or huge) requirements as well.
The basic OF things needed to boot these OSes are probably not that much different; after all, OpenBIOS also implements only part of the standard and it can boot some of these OSes. Maybe a larger missing part would be the Apple HFS, partition table and bootinfo support, which would need to be reimplemented in Forth as these are in C in OpenBIOS. But OF should already have a lot of other things OpenBIOS is missing, like those needed to run card Fcode ROMs or guest boot code. So both ways need some work.
Note that the Apple 8530 and GEM are incompatible to the original as well, in some crucial ways. Nothing that cannot be overcome, but don't be surprised if things do not work out of the box.
Which would be more work? Implement everything missing in OpenBIOS compared to OpenFirmware or implement the missing parts in OpenFirmware to reach the current state of OpenBIOS? I don't know, but Apple's OF is likely closer to the open source OLPC OpenFirmware than to OpenBIOS as they share origins so it may be more compatible with that. (For Sun we would only need to compile the published version which should be mostly identical to the original but to reduce the number of versions it might be better to merge them if possible. The third machine that needs OF in QEMU is prep for which Artyom already made OF work. I've considered it for pegasos2 as well but went with VOF at last that became available in the meantime.)
Good luck and have fun,
Was just an idea anyway and only mentioned it to explain why I don't want to put too much effort in OpenBIOS. I'd rather have some fun instead.
Regards, BALATON Zoltan
Hi!
On Sun, Nov 14, 2021 at 04:16:47PM +0100, BALATON Zoltan wrote:
On Sun, 14 Nov 2021, Segher Boessenkool wrote: Which would be more work? Implement everything missing in OpenBIOS compared to OpenFirmware or implement the missing parts in OpenFirmware to reach the current state of OpenBIOS? I don't know, but Apple's OF is likely closer to the open source OLPC OpenFirmware than to OpenBIOS as they share origins so it may be more compatible with that.
They do not share heritage. Apple OF is a fully separate implementation, with all of its own bugs and extensions and peculiarities.
Good luck and have fun,
Was just an idea anyway and only mentioned it to explain why I don't want to put too much effort in OpenBIOS. I'd rather have some fun instead.
I meant that seriously :-) I don't say these things to discourage you from taking any path. Instead, I tell tales of what monsters lurk where, in the hope that your journey will be less frustrating.
Good luck and have fun!
Segher
On Sun, 14 Nov 2021, Mark Cave-Ayland wrote:
I also stand by my offer to provide help/advice on creating the real /pci node if you are willing to investigate the NetBSD issue.
So can you test on real hardware if the problem also exists on versions of NetBSD that aren't booting now on QEMU such as those between 8.1 and 9.1 other than 9.0? That would give a clue if the breakage is related to the PCI patch at all because if it isn't then we should look for the problem elsewhere. I've asked this several times but got no answer yet.
Then if it is a memory corruption problem I think we should compare how memory is allocated on real hardware vs. OpenBIOS but I have no idea how to do that nor any real hardware to test on. Some memory info is printed in the dmesg you've sent but that may not be enough info and more detailed knowledge about the NetBSD kernel may be needed to know what to look for.
If you're willing to help please do so and try to do the above tests to get closer to understanding what the problem actually is.
Regards, BALATON Zoltan
On Mon, 15 Nov 2021, BALATON Zoltan wrote:
On Sun, 14 Nov 2021, Mark Cave-Ayland wrote:
I also stand by my offer to provide help/advice on creating the real /pci node if you are willing to investigate the NetBSD issue.
So can you test on real hardware if the problem also exists on versions of NetBSD that aren't booting now on QEMU such as those between 8.1 and 9.1 other than 9.0? That would give a clue if the breakage is related to the PCI patch at all because if it isn't then we should look for the problem elsewhere. I've asked this several times but got no answer yet.
Then if it is a memory corruption problem I think we should compare how memory is allocated on real hardware vs. OpenBIOS but I have no idea how to do that nor any real hardware to test on. Some memory info is printed in the dmesg you've sent but that may not be enough info and more detailed knowledge about the NetBSD kernel may be needed to know what to look for.
If you're willing to help please do so and try to do the above tests to get closer to understanding what the problem actually is.
Also, I've done the following test before arriving at the dummy-pci version but now I've repeated it again to verify and you can reproduce it if you want:
1. Start from QEMU master, check that NetBSD-8.0-macppc.iso gets to the prompt as you've shown with upstream OpenBIOS.
2. Add -bios openbios-qemu.elf with OpenBIOS patched with the last v9 version of the patch that adds the full /pci node with all properties. This fails with the "bad dir ino 10" error as you said.
3. Then edit openbios/arch/ppc/qemu/init.c and change the name of the added pci node to something other than pci, say dummy-pci or nonpci or whatever, leaving all other properties there. This boots like in 1.
4. Now revert the add pci node patch and apply the dummy-pci patch instead which only adds a node named dummy-pci with type pci that's enough to fix MorphOS. NetBSD also boots with this like in 1. above.
5. Then edit openbios/arch/ppc/qemu/init.c and change it to add a node named pci but without any properties, such as:
    case ARCH_MAC99:
        /* This adds a dummy node of pci device-type before the actual /pci
         * node which is needed for MorphOS to find devices on PCI bus.
         * (Real machine has 3 /pci nodes but we only have one.) */
        fword("new-device");
        push_str("pci");
        fword("device-name");
        fword("finish-device");
        /* fall through */
    case ARCH_MAC99_U3:
        /* The NewWorld NVRAM is not located in the MacIO device */
        macio_nvram_init("/", 0);
        ob_pci_init();
or compared to the dummy-pci patch:
diff --git a/arch/ppc/qemu/init.c b/arch/ppc/qemu/init.c
index ef8b27a..a33623d 100644
--- a/arch/ppc/qemu/init.c
+++ b/arch/ppc/qemu/init.c
@@ -869,10 +869,8 @@ arch_of_init(void)
          * node which is needed for MorphOS to find devices on PCI bus.
          * (Real machine has 3 /pci nodes but we only have one.) */
         fword("new-device");
-        push_str("dummy-pci");
-        fword("device-name");
         push_str("pci");
-        fword("device-type");
+        fword("device-name");
         fword("finish-device");
         /* fall through */
     case ARCH_MAC99_U3:
which results in
    0 > cd /  ok
    0 > ls
    fff55e44 aliases
    fff55ee8 openprom
    fff56090 options
    fff56108 chosen
    fff561b8 builtin
    fff5bdf0 packages
    fff5ecf8 cpus
    fff5edf8 memory@0
    fff5eec0 rom@ff800000
    fff61cf4 pci
    fff61d68 nvram@fff04000
    fff62000 pci@f2000000
    fff6ae74 uni-n@f8000000
     ok
    0 > cd pci  ok
    0 > .properties
    name                      "pci"
     ok
and this again gets the "bad dir ino 10" error with NetBSD 8.0.
After this I've concluded there's no way I can solve this by adding, removing or changing the properties in this patch, which is what you've been asking for all along, and submitted the dummy-pci patch instead as the minimal and only viable alternative I can see to solve this in the short term without breaking NetBSD versions that are booting now. This was three weeks ago and I've been trying to explain this to you but I'm not sure I've managed to get it across. Repeat the above experiment if you don't believe me and say what to do next. We only have about a week to solve this.
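To make the five-step experiment above easier to follow, here is a minimal sketch in C that only tabulates the outcomes reported in this message and checks which attribute of the extra node lines up with the NetBSD 8.0 failure. The struct, field and function names are hypothetical and not part of OpenBIOS. It is also an assumption that the v9 full /pci node in step 2 carries device_type "pci", and "dummy-pci" is used as the renamed node in step 3. The sketch describes a correlation in the reported results; it says nothing about the underlying cause.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One row per step of the experiment; outcomes are those reported above. */
struct variant {
    int step;               /* step number in the list above */
    const char *node_name;  /* name of the extra node, NULL if none added */
    bool has_device_type;   /* extra node has device_type "pci" (step 2: assumed) */
    bool has_other_props;   /* extra node has further properties (reg, ranges, ...) */
    bool netbsd80_boots;    /* reported result for NetBSD-8.0-macppc.iso */
};

static const struct variant variants[] = {
    { 1, NULL,        false, false, true  },  /* upstream OpenBIOS */
    { 2, "pci",       true,  true,  false },  /* v9 full /pci node */
    { 3, "dummy-pci", true,  true,  true  },  /* v9 node, renamed only */
    { 4, "dummy-pci", true,  false, true  },  /* dummy-pci patch */
    { 5, "pci",       false, false, false },  /* node named pci, no properties */
};

typedef bool (*predicate)(const struct variant *);

static bool named_pci(const struct variant *v)
{
    return v->node_name != NULL && strcmp(v->node_name, "pci") == 0;
}

static bool has_device_type(const struct variant *v)
{
    return v->has_device_type;
}

static bool has_other_props(const struct variant *v)
{
    return v->has_other_props;
}

/* True if the predicate holds for exactly the variants that fail to boot. */
static bool matches_failures(predicate p)
{
    for (size_t i = 0; i < sizeof(variants) / sizeof(variants[0]); i++) {
        if (p(&variants[i]) != !variants[i].netbsd80_boots) {
            return false;
        }
    }
    return true;
}

#ifdef EXPERIMENT_DEMO_MAIN
#include <stdio.h>

int main(void)
{
    printf("node named pci:   %d\n", matches_failures(named_pci));
    printf("has device_type:  %d\n", matches_failures(has_device_type));
    printf("has other props:  %d\n", matches_failures(has_other_props));
    return 0;
}
#endif
```

Built with -DEXPERIMENT_DEMO_MAIN this prints one line per attribute. In this table only the node name matches the failing steps, which is consistent with the conclusion that changing the properties does not help.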
Regards, BALATON Zoltan
On Tue, 16 Nov 2021, BALATON Zoltan wrote:
[...]
What's the plan for fixing this for 6.2? We're running out of time and I don't see you making any progress with this.
Regards, BALATON Zoltan
On 20/11/2021 11:34, BALATON Zoltan wrote:
[...]
What's the plan for fixing this for 6.2? We're running out of time and I don't see you making any progress with this.
Nothing has changed from the earlier discussions: the pci node patch in its current form causes at least one regression, and the dummy-pci node patch exposes a bogus DT node to the guest so neither can be merged for now.
This is the last time I am going to repeat myself in this thread.
ATB,
Mark.