[coreboot] [patch]: memcpy/memset inline asm & config_compress disabled when needed
ron minnich
rminnich at gmail.com
Mon Feb 22 17:29:11 CET 2010
On Sun, Feb 21, 2010 at 8:48 PM, Bao, Zheng <Zheng.Bao at amd.com> wrote:
>
> Index: src/lib/memcpy.c
> ===================================================================
> --- src/lib/memcpy.c (revision 5133)
> +++ src/lib/memcpy.c (working copy)
> @@ -3,10 +3,14 @@
> {
> const char *src = vsrc;
> char *dest = vdest;
> - int i;
>
> - for (i = 0; i < (int)bytes; i++)
> - dest[i] = src[i];
> + __asm__ __volatile__ ( \
> + "cld \n\t" \
> + "rep \n\t" \
> + "movsb" \
> + : /* No output */ \
> + : "S"(src), "D"(dest), "c"(bytes) \
> + );
>
> return vdest;
> }
> Index: src/lib/memset.c
> ===================================================================
> --- src/lib/memset.c (revision 5133)
> +++ src/lib/memset.c (working copy)
> @@ -2,11 +2,15 @@
>
> void *memset(void *s, int c, size_t n)
> {
> - int i;
> char *ss = (char *) s;
>
> - for (i = 0; i < (int)n; i++)
> - ss[i] = c;
> + __asm__ __volatile__ ( \
> + "cld\n\t" \
> + "rep\n\t" \
> + "stosb" \
> + : \
> + : "a"(c), "D"(ss), "c"(n) \
> + );
>
> return s;
> }
I'm glad this works, but I am afraid I have a concern about it.
I've become opposed to inline assembly on several principles in the
last few years:
- we've seen one problem after another as the manner in which __asm__
is supposed to be used has varied as gcc changes.
we've also had code that had errors in __asm__ which nobody noticed.
- I'd like to understand why the C code is so much worse. I can't see
a good reason for C to be so bad at this simple task.
For example, with the memset, what if we just created a memzero
function instead, which used a constant '0' instead of a parameter:
does the code improve?
- I now feel a .c file should contain C. If we need a .s, let's create a .s.
So, first, can we please have a look at the .s produced by the C code
and see if we can understand why it is so slow?
Thanks
Ron
More information about the coreboot
mailing list