[OpenBIOS] Help with libopenbios/ofmem_common.c line 175 "insert in the freelist"

Mark Cave-Ayland mark.cave-ayland at siriusit.co.uk
Tue Aug 9 15:43:08 CEST 2011


On 08/08/11 22:40, Kenneth Salerno wrote:

> Hi,
>
> I finally got around to running QEMU in gdb to debug the AIX V6.1 boot hang and was able to get past where I got stuck previously:
>
> -------------------------------------------------------------------------------
>                                   Welcome to AIX.
>                         boot image timestamp: 00:39 35/2D
> NULL ihandle                 The current time and date: 00:00:00 228784/00/0008
>          processor count: 1;  memory size: 1024MB;  kernel size: 2293829
>                      boot device: cd:\ppc\chrp\bootfile.exe
> Validation failed: the "/rtas" device node does not exist.
> EXIT
>
>
> It used to hang after "boot device: cd:\ppc\chrp\bootfile.exe" with OpenBIOS stuck at line 175 of libopenbios/ofmem_common.c:
>
>    for( pp=&ofmem->mfree; *pp&&  (**pp).size<  d->size ; pp =&(**pp).next ) {
>    }
>
>
> I made the following hack to get it to progress to the RTAS validation:
>
> --- ofmem_common.c.ORIG 2011-08-08 17:04:25.375000000 -0400
> +++ ofmem_common.c      2011-08-08 17:04:45.875000000 -0400
> @@ -172,8 +172,9 @@
>          d->next = ofmem->mfree;
>
>          /* insert in the (sorted) freelist */
> -       for( pp=&ofmem->mfree; *pp&&  (**pp).size<  d->size ; pp =&(**pp).next ) {
> -       }
> +/*     for( pp=&ofmem->mfree; *pp&&  (**pp).size<  d->size ; pp =&(**pp).next ) {
> +       } */
> +       pp=&ofmem->mfree;
> +       pp=&ofmem->mfree;
>
>          d->next = *pp;
>          *pp = d;
>
>
> Before I made the above change, the following is what I saw in gdb and qemu console:
>
>    (qemu) info cpus
>    * CPU #0: nip=0x00000000fff91a84 thread_id=6828
>
>    (gdb)
>    0x00000000fff91a84 in ofmem_free (ptr=0x3fca1774)
>        at ../libopenbios/ofmem_common.c:175
>    175   for( pp=&ofmem->mfree; *pp&&  (**pp).size<  d->size ; pp =&(**pp).next ) {
>
>    (gdb) print /x pp
>    $1 = 0x3fc9e0ac
>
> The value 0x3fc9e0ac can be found in registers GPR03 and GPR09 (the second column of the GPR08 row):
>
>    (qemu) info registers
>    NIP 00000000fff91a84   LR 00000000fff91a58 CTR 00000000fff93784 XER 0000000020000000
>    MSR 0000000000003032 HID0 0000000060000000  HF 0000000000002000 idx 1
>    TB 00000001 7638146663 DECR 951787926
>    GPR00 000000003fca1764 000000003fdf69e0 00000000fffc68e8 000000003fc9e0ac
>    GPR04 00000000fffc0088 000000003fc5bc68 00000000fffc0860 0000000000044200
>    GPR08 0000000000000002 000000003fc9e0ac 0000000000000024 0000000000000810
>    GPR12 00000000000088ac 0000000000000000 00000000fffb6703 00000000fffb7b65
>    GPR16 00000000fffb8331 00000000fffb6706 0000000004000000 00000000fffbf6b8
>    GPR20 00000000fffbf634 00000000fffc68e8 00000000fffbf634 00000000fffb650a
>    GPR24 00000000fffb64f8 00000000fffb6478 00000000fffb6500 00000000fffb6505
>    GPR28 00000000fffb707e 0000000000000027 0000000000000027 000000003fca1774
>    CR 48000084  [ G  L  -  -  -  -  L  G  ]             RES ffffffffffffffff
>    FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>    FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>    FPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>    FPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>    FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>    FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>    FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>    FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>    FPSCR 00000000
>    SRR0 00000000fffaa590  SRR1 0000000000003032    PVR 00000000003c0301 VRSAVE 0000000000000000
>    SPRG0 000000003fe00000 SPRG1 000000003fdf6630  SPRG2 0000000000000000  SPRG3 0000000000000000
>    SPRG4 0000000000000000 SPRG5 0000000000000000  SPRG6 0000000000000000  SPRG7 0000000000000000
>    SDR1 000000003fe00000
>
>
> Here is my QEMU command:
>
>     ./testing/qemu/ppc64-softmmu/qemu-system-ppc64 \
>             -L ./testing/qemu/pc-bios \
>             -m 1024 \
>             -bios ./testing/openbios-devel/obj-ppc64/openbios-qemu.elf \
>             -drive file=images/aix.img,index=0,media=disk,cache=writeback \
>             -cdrom images/ibmaix.iso \
>             -boot d \
>             -nographic \
>             -rtc base=localtime,clock=host \
>             -uuid 17202d0a-45f8-4159-a8e1-78b866f50aa7 \
>             -serial tcp::9979,server,nowait \
>             -monitor tcp::9980,server,nowait \
>             -gdb tcp::1234
>
>    powerpc64-unknown-linux-gnu-gdb testing/openbios-devel/obj-ppc64/openbios-qemu.elf-nostrip
>
> I don't really know what I'm doing so any help explaining this function in ofmem_common.c would be appreciated.

Hi Ken,

I didn't write the original version of this code, however I have played 
around with it enough to get an idea of what it does, so hopefully the 
explanation below will make sense ;)

ofmem_malloc() and ofmem_free() are mapped to malloc() and free() calls 
used within OpenBIOS. They are not used by any client program. If an 
area of memory is requested via the client interface (CIF) then the
allocation is handled separately by the arch/*/ OFMEM code, which handles
memory allocations/mapping across the entire address range.

As per the code in libopenbios/ofmem_common.c, ofmem_malloc() allocates 
memory within the OpenBIOS binary image between 
ofmem_arch_get_malloc_base() and ofmem_arch_get_heap_top(). These limits 
are also specified per architecture within the arch/*/ OFMEM handler.
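
As a rough illustration of allocating from a fixed range, here is a minimal
sketch of a bump allocator working between a base and a top address. It is
not the OpenBIOS code: bump_heap_t, bump_alloc() and the alignment parameter
are hypothetical names made up for this example, standing in for whatever
ofmem_malloc() does with the values returned by
ofmem_arch_get_malloc_base() and ofmem_arch_get_heap_top().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical heap state: a fixed range handed out front to back. */
typedef struct {
    uintptr_t next;  /* first address not yet handed out */
    uintptr_t top;   /* one past the last usable address */
} bump_heap_t;

/* Return `size` bytes aligned to `align` (a power of two), or NULL if
 * the range between next and top cannot satisfy the request. */
static void *bump_alloc(bump_heap_t *h, size_t size, size_t align)
{
    uintptr_t start;

    assert(align != 0 && (align & (align - 1)) == 0);

    /* Round the next free address up to the requested alignment. */
    start = (h->next + (align - 1)) & ~(uintptr_t)(align - 1);

    if (start > h->top || size > h->top - start)
        return NULL;  /* out of malloc memory */

    h->next = start + size;
    return (void *)start;
}
```

Memory handed out this way is never returned to the range itself, which is
why some form of free list is needed if freed chunks are to be re-used.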

In order to facilitate memory re-use, if a chunk of memory is 
ofmem_free()d then it is added to a free memory linked-list starting at 
ofmem->mfree ordered by size. The ofmem_malloc() code at line 105 
consults this list first when trying to allocate memory in order to try 
and ensure that repeated ofmem_malloc() and ofmem_free() calls on the 
same size area of memory don't deplete the available memory store.
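
To make the size-ordered free list concrete, here is a minimal sketch of the
idea. The pointer-to-pointer walk mirrors the loop quoted from line 175, but
desc_t, freelist_insert() and freelist_take() are hypothetical names for
this example only, and the real alloc_desc_t and lookup logic may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the descriptor of a freed chunk. */
typedef struct desc {
    struct desc *next;
    size_t size;
} desc_t;

/* Insert d into the list at *head, keeping ascending size order. */
static void freelist_insert(desc_t **head, desc_t *d)
{
    desc_t **pp;

    assert(d != NULL);

    /* Stop at the first entry that is not smaller than d. */
    for (pp = head; *pp && (*pp)->size < d->size; pp = &(*pp)->next)
        ;

    d->next = *pp;
    *pp = d;
}

/* Unlink and return the smallest entry of at least `size` bytes,
 * or NULL if no entry on the list is large enough. */
static desc_t *freelist_take(desc_t **head, size_t size)
{
    desc_t **pp;
    desc_t *d;

    for (pp = head; *pp && (*pp)->size < size; pp = &(*pp)->next)
        ;

    d = *pp;
    if (d) {
        *pp = d->next;
        d->next = NULL;
    }
    return d;
}
```

Because the list is kept sorted, the first entry that is large enough is
also the smallest one that fits, so the lookup can stop there.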

Finally, ofmem_malloc() must ensure that the returned pointer is aligned
both physically and virtually, and so the memory descriptor for an
entry is stored just before it in memory like this:


                   | <- aligned ptr
    ----------------------------------------
    | alloc_desc_t | region (<size> bytes) |
    ----------------------------------------
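
In code, that layout means the descriptor can be recovered from the pointer
handed to the caller by stepping back by the size of the descriptor. The
sketch below only illustrates the layout: desc_t, region_alloc() and
region_desc() are hypothetical, the host malloc() stands in for the OpenBIOS
heap, and no extra alignment padding is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for alloc_desc_t. */
typedef struct desc {
    struct desc *next;
    size_t size;
} desc_t;

/* Allocate descriptor + region in one chunk and return the region. */
static void *region_alloc(size_t size)
{
    desc_t *d = malloc(sizeof(desc_t) + size);

    if (!d)
        return NULL;
    d->next = NULL;
    d->size = size;

    /* The caller's pointer starts right after the descriptor. */
    return (char *)d + sizeof(desc_t);
}

/* Step back from the caller's pointer to the descriptor before it. */
static desc_t *region_desc(void *ptr)
{
    assert(ptr != NULL);
    return (desc_t *)((char *)ptr - sizeof(desc_t));
}
```

A free routine built on this layout only needs the pointer it is given: it
steps back to the descriptor and reads the size from there. It also means
that anything writing just below a returned pointer will corrupt the
descriptor.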


The code you are having trouble with is the code that tries to add the 
ofmem_free() location back into the freelist starting at ofmem->mfree.

Now if ofmem->mfree is getting clobbered somehow then that would cause
the linked list walk to jump to random locations in memory and hence
cause a crash. The usual tactic in these cases is to add a watchpoint in
gdb on ofmem->mfree so that gdb breaks whenever that memory location is
written to. Assuming you compile OpenBIOS with -O0 -g, you can then obtain
a backtrace to find out where the error is occurring and post it back to
the list if you need further help.
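
Alongside a watchpoint, a small consistency check called from suspect places
can help narrow down when the list goes bad. The sketch below is an
assumption-laden debugging aid, not existing OpenBIOS code: desc_t and
freelist_check() are hypothetical names. It looks for two symptoms, a cycle
in the list (which would make a walk like the one at line 175 spin forever,
and which freeing the same pointer twice could plausibly create) and entries
that are out of size order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for alloc_desc_t. */
typedef struct desc {
    struct desc *next;
    size_t size;
} desc_t;

/* Return 0 if the list looks sane, 1 if it contains a cycle,
 * 2 if it terminates but is not sorted by ascending size. */
static int freelist_check(const desc_t *head)
{
    const desc_t *slow = head;
    const desc_t *fast = head;

    /* Floyd's cycle detection: fast moves two steps per iteration,
     * slow moves one; they can only meet again if there is a loop. */
    while (fast && fast->next) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return 1;
    }

    /* No cycle, so this walk is guaranteed to terminate. */
    for (slow = head; slow && slow->next; slow = slow->next) {
        if (slow->size > slow->next->size)
            return 2;
    }
    return 0;
}
```

The check cannot detect a next pointer that has been overwritten with a
garbage address, since following it may fault; the gdb watchpoint remains
the tool for that case.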


HTH,

Mark.

-- 
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


