[OpenBIOS] Read errors from SILO booting Debian Wheezy/sparc64

Artyom Tarasenko atar4qemu at gmail.com
Sun Apr 28 23:56:51 CEST 2013


On Sun, Apr 28, 2013 at 10:29 PM, Mark Cave-Ayland
<mark.cave-ayland at ilande.co.uk> wrote:
> On 27/04/13 07:06, Artyom Tarasenko wrote:
>
>>> Is this a regression? If so, what was the last know good revision?
>>
>>
>> Doesn't look like a regression. Checked the OpenBIOS image packaged with
>> QEMU
>> (they didn't update it recently) and it fails too.
>> Just haven't tested it before. Usually I boot from a CD/DVD image or
>> load the kernel directly with the -kernel option.
>
>
> Ha okay - I finally figured this one out with a reproducible test case like
> this:
>
>
> 1) First drop the Linux filesystem cache:
> echo 3 > /proc/sys/vm/drop_caches
>
> 2) Start up QEMU - this now blows up in SILO:
> ./qemu-system-sparc64 -hda
> /home/build/src/qemu/image/sparc64/debian-wheezy.qcow2 -boot c -nographic
> -snapshot
>
> 3) Load the debian-wheezy.qcow2 image into the filesystem cache
>
> cat debian-wheezy.qcow2 > /dev/null
>
> 4) Restart QEMU again and the image boots
> ./qemu-system-sparc64 -hda
> /home/build/src/qemu/image/sparc64/debian-wheezy.qcow2 -boot c -nographic
> -snapshot
>
> Once I could reproduce this at will, I recompiled OpenBIOS with
> CONFIG_DEBUG_IDE enabled and got the following output:
>
> Jumping to entry point 0000000000004000 for type 0000000000000005...
> switching to new context: entry point 0x4000 stack 0x00000000ffe86a39
> SIDE - ob_ide_open: opening channel -1539032 unit 0
> IDE DRIVE @ffe88428:
> unit: 0
> present: 1
> type: 1
> media: 32
> model: QEMU HARDDISK
> nr: 0
> cyl: 8322
> head: 16
> sect: 63
> bs: 512
> IDE - ob_ide_block_size: ob_ide_block_size: block size 200
> IDE - ob_ide_max_transfer: max_transfer 1fe00
> IDE - ob_ide_read_blocks: ob_ide_read_blocks ffe86b20 block=0 n=1
> IDE - ob_ide_read_sectors: ob_ide_read_sectors: block=0 sectors=1
> IDE - ob_ide_read_blocks: ob_ide_read_blocks ffe86b20 block=0 n=1
> IDE - ob_ide_read_sectors: ob_ide_read_sectors: block=0 sectors=1
> IDE - ob_ide_read_blocks: ob_ide_read_blocks ffea9e98 block=2 n=2
> IDE - ob_ide_read_sectors: ob_ide_read_sectors: block=2 sectors=2
> IDE - ob_ide_read_blocks: ob_ide_read_blocks 41dc block=5248 n=1
> IDE - ob_ide_read_sectors: ob_ide_read_sectors: block=5248 sectors=1
> IDE - ob_ide_error: ob_ide_error drive<0>: timed out waiting for BUSY clear:
> IDE - ob_ide_error:     cmd=20, stat=d0IDE - ob_ide_error:
> IDE - ob_ide_pio_data_in: bytes=512, stat=d0
> IDE - ob_ide_read_blocks: ob_ide_read_blocks: error
> EXIT
> 0 >
>
> Looking at drivers/ide.c, it was apparent that we were timing out waiting
> for the IDE event to finish:
>
> /*
> * wait for BUSY clear
> */
> if (ob_ide_wait_stat(drive, 0, BUSY_STAT | ERR_STAT, &stat)) {
>         ob_ide_error(drive, stat, "timed out waiting for BUSY clear");
>         cmd->stat = stat;
>         break;
> }
>
> And ob_ide_wait_stat() has a loop that looks like this:
>
> for (i = 0; i < 5000; i++) {
>         stat = ob_ide_pio_readb(drive, IDEREG_STATUS);
>         if (!(stat & BUSY_STAT))
>                 break;
>
>         udelay(1000);
> }
>
> This led me to wonder if there was a problem with udelay() returning too
> quickly and so I had a look at the implementation in arch/sparc64/openbios.c
> which looks like this:
>
>  void udelay(unsigned int usecs)
>  {
>  }
>
> Ah. So the problem is that we are not waiting long enough for the IDE
> command to finish which is why we are getting the error. A quick patch to
> get things working again for me is:
>
> --- a/openbios-devel/arch/sparc64/openbios.c
> +++ b/openbios-devel/arch/sparc64/openbios.c
> @@ -547,6 +547,9 @@ void setup_timers(void)
>
>  void udelay(unsigned int usecs)
>  {
> +    volatile int i;
> +
> +    for (i = 0; i < usecs * 100; i++);
>  }

Ha. Told ya, it's a race in OpenBIOS. :-) .

> But we probably need a better solution for this that works for both SPARC32
> and SPARC64. I've seen some code that waits for the RTC to tick 1s and
> measures the number of loop iterations within that time to find a basic
> timing constant which looks quite appealing as it is simple and will work
> across both SPARC32 and SPARC64 (SPARC32 doesn't have a %tick register).

AFAIK that's what OBP does on sun4m. Obviously worked good back then,
but nowadays it's not so attractive: even bare metal CPUs may have
frequency scaling. In the virtual world the number of instructions per
millisecond is even more volatile. Also some day the TCG optimiser may
just optimise the empty loops away.
If we need a precise sleep, we need timer interrupts, or a paravirtual
sleep device.
Though in this particular case, why not just replace "5000" with some
value large enough? Reading from a hardware register is a good delay
itself.

Artyom

> Blue - any thoughts on how to implement this?


--
Regards,
Artyom Tarasenko

linux/sparc and solaris/sparc under qemu blog:
http://tyom.blogspot.com/search/label/qemu



More information about the OpenBIOS mailing list