[coreboot] [patch]: memcpy/memset inline asm & config_compress disabled when needed

ron minnich rminnich at gmail.com
Mon Feb 22 17:29:11 CET 2010


On Sun, Feb 21, 2010 at 8:48 PM, Bao, Zheng <Zheng.Bao at amd.com> wrote:


>
> Index: src/lib/memcpy.c
> ===================================================================
> --- src/lib/memcpy.c    (revision 5133)
> +++ src/lib/memcpy.c    (working copy)
> @@ -3,10 +3,14 @@
>  {
>        const char *src = vsrc;
>        char *dest = vdest;
> -       int i;
>
> -       for (i = 0; i < (int)bytes; i++)
> -               dest[i] = src[i];
> +       __asm__ __volatile__ (                          \
> +               "cld \n\t"                              \
> +               "rep \n\t"                              \
> +               "movsb"                                 \
> +               :               /* No output */         \
> +               : "S"(src), "D"(dest), "c"(bytes)       \
> +               );
>
>        return vdest;
>  }
> Index: src/lib/memset.c
> ===================================================================
> --- src/lib/memset.c    (revision 5133)
> +++ src/lib/memset.c    (working copy)
> @@ -2,11 +2,15 @@
>
>  void *memset(void *s, int c, size_t n)
>  {
> -       int i;
>        char *ss = (char *) s;
>
> -       for (i = 0; i < (int)n; i++)
> -               ss[i] = c;
> +       __asm__ __volatile__ (                  \
> +               "cld\n\t"                       \
> +               "rep\n\t"                       \
> +               "stosb"                         \
> +               :                               \
> +               : "a"(c), "D"(ss), "c"(n)       \
> +               );
>
>        return s;
>  }

I'm glad this works, but I am afraid I have a concern about it.

I've become opposed to inline assembly on several principles in the
last few years:

- we've seen one problem after another as the manner in which __asm__
is supposed to be used has varied as gcc changes.
  we've also had code that had errors in __asm__ which nobody noticed.

- I'd like to understand why the C code is so much worse. I can't see
a good reason for C to be so bad at this simple task.
For example, with the memset, what if we just created a memzero
function instead, which used a constant '0' instead of a parameter:
does the code improve?

- I now feel a .c file should contain C. If we need a .s, let's create a .s.

So, first, can we please have a look at the .s produced by the C code
and see if we can understand why it is so slow?

Thanks

Ron




More information about the coreboot mailing list