On Sun, Apr 17, 2016 at 4:29 PM, Paul Menzel <paulepanter@users.sourceforge.net> wrote:
Dear Kyösti,
It’s now down from around 1,379 ms to 380 ms, which is even faster than the 520 ms before the regression (and the 495 ms before that).
Thanks a lot!
Are the patches just proof of concept, or ready to be reviewed on Gerrit?
Proof of concept, plus some explanation of what might be going on there. As noted, I think we can just compile Lib/amdlib.c with the -O2 flag, so there is no need for the third patch with its per-function align attributes.
Kyösti