Commit 78ad8d1f authored by fbarchard@google.com

Polynomial AVX2 on gcc use vex128 vmovq instead of SSE2 movq to avoid stall.

BUG=265
TEST=unittest polynomial
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2679004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@816 16f28f9a-4ce2-e073-06de-1de4eb20be90
parent a03b8add
@@ -6068,7 +6068,7 @@ void ARGBPolynomialRow_AVX2(const uint8* src_argb,
     "vpermq $0xd8,%%ymm0,%%ymm0 \n"
     "vpackuswb %%xmm0,%%xmm0,%%xmm0 \n"
     "sub $0x2,%2 \n"
-    "movq %%xmm0,"MEMACCESS(1)" \n"
+    "vmovq %%xmm0,"MEMACCESS(1)" \n"
     "lea "MEMLEA(0x8,1)",%1 \n"
     "jg 1b \n"
     "vzeroupper \n"
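The stall named in the commit message is presumably the SSE/AVX transition penalty. On Intel CPUs, a legacy-SSE-encoded instruction such as movq, executed while the upper halves of the ymm registers are in use (here, before vzeroupper runs), can trigger a costly state transition; the VEX-encoded vmovq does not. The following is a minimal sketch of the VEX.128 store in isolation, assuming GCC or Clang on x86-64. StoreLow8_VEX is a hypothetical helper, not libyuv code, and the memcpy fallback is added only so the sketch also runs on machines without AVX.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only; it is not part of libyuv.
 * Copies the low 8 bytes of a 16-byte source buffer to dst via xmm0,
 * using only VEX-encoded instructions when AVX is available. */
void StoreLow8_VEX(uint8_t* dst, const uint8_t* src16) {
  assert(dst != NULL && src16 != NULL);
#if defined(__x86_64__) && defined(__GNUC__)
  if (__builtin_cpu_supports("avx")) {
    __asm__ volatile(
        /* VEX.128 load of 16 bytes into xmm0. */
        "vmovdqu (%1),%%xmm0 \n"
        /* VEX.128 store of the low 64 bits, as in the patched row. */
        "vmovq %%xmm0,(%0) \n"
        :
        : "r"(dst), "r"(src16)
        : "memory", "xmm0");
    return;
  }
#endif
  /* Assumption: plain byte copy when AVX cannot be used. */
  memcpy(dst, src16, 8);
}
```

Both paths write exactly 8 bytes, which matches the two ARGB pixels stored per loop iteration in the diff (the pointer advances by 0x8 and the count drops by 2).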