- 08 Nov, 2016 1 commit
-
-
Frank Barchard authored
BUG=libyuv:654 R=kjellander@chromium.org Review URL: https://codereview.chromium.org/2469353005 .
-
- 07 Nov, 2016 1 commit
-
-
Frank Barchard authored
Improved unittests detect different in arm64 rounding. TEST=util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose --release --gtest_filter=*Half* -a "--libyuv_width=640 --libyuv_height=360" BUG=libyuv:560 R=wangcheng@google.com Review URL: https://codereview.chromium.org/2478313004 .
-
- 01 Nov, 2016 1 commit
-
-
Frank Barchard authored
64 bit version made similar to 32 bit with registers 1 for load and store results, and 2 and 3 as expanded float temporary values. TEST=out/Release/libyuv_unittest --gtest_filter=*Half* BUG=libyuv:560 R=wangcheng@google.com Review URL: https://codereview.chromium.org/2467723002 .
-
- 27 Oct, 2016 1 commit
-
-
Frank Barchard authored
R=fbarchard@google.com BUG=libyuv:634 Performance Gain (vs C vectorized) I422ToRGB565Row_MSA : ~1.5x I422ToRGB565Row_Any_MSA : ~1.5x I422ToARGB4444Row_MSA : ~1.4x I422ToARGB4444Row_Any_MSA : ~1.4x I422ToARGB1555Row_MSA : ~1.4x I422ToARGB1555Row_Any_MSA : ~1.4x Performance Gain (vs C non-vectorized) I422ToRGB565Row_MSA : ~6.8x I422ToRGB565Row_Any_MSA : ~6.8x I422ToARGB4444Row_MSA : ~6.6x I422ToARGB4444Row_Any_MSA : ~6.6x I422ToARGB1555Row_MSA : ~6.6x I422ToARGB1555Row_Any_MSA : ~6.6x Review URL: https://codereview.chromium.org/2445343007 .
-
- 26 Oct, 2016 3 commits
-
-
Frank Barchard authored
R=fbarchard@google.com BUG=libyuv:634 Performance Gain (vs C vectorized) I422AlphaToARGBRow_MSA : ~1.4x I422AlphaToARGBRow_Any_MSA : ~1.4x I422ToRGB24Row_MSA : ~4.8x I422ToRGB24Row_Any_MSA : ~4.8x Performance Gain (vs C non-vectorized) I422AlphaToARGBRow_MSA : ~7.0x I422AlphaToARGBRow_Any_MSA : ~7.0x I422ToRGB24Row_MSA : ~7.9x I422ToRGB24Row_Any_MSA : ~7.7x Review URL: https://codereview.chromium.org/2454433003 .
-
Frank Barchard authored
BUG=libyuv:634 TEST=git cl lint TBR=kjellander@chromium.org Review URL: https://codereview.chromium.org/2453013003 .
-
Frank Barchard authored
BUG=libyuv:643 TEST=gn gen out/Release "--args=is_debug=false target_os=\"ios\" ios_enable_code_signing=false target_cpu=\"arm64\"" && ninja -v -C out/Release libyuv_unittest R=kjellander@chromium.org Review URL: https://codereview.chromium.org/2450853003 .
-
- 25 Oct, 2016 3 commits
-
-
Frank Barchard authored
DEPS roll is needed for mips builds. These additional changes are also needed for that DEPS roll. These can be done separately. TBR=kjellander@chromium.org BUG=libyuv:634 TEST=try bots Review URL: https://codereview.chromium.org/2446043003 .
-
Frank Barchard authored
no functional changes. TBR=kjellander@chromium.org BUG=libyuv:634 Review URL: https://codereview.chromium.org/2446313002 .
-
Frank Barchard authored
Debug builds of x86 gcc/clang can run out of register. Previously NDEBUG or _DEBUG was used to detect a debug build. But those macros are not set by gentoo builds. This CL switches to the compiler predefine __OPTIMIZE__ which is built into clang and gcc. BUG=libyuv:602 TEST=untested R=wangcheng@google.com Review URL: https://codereview.chromium.org/2451503002 .
-
- 24 Oct, 2016 1 commit
-
-
Frank Barchard authored
R=fbarchard@google.com BUG=libyuv:634 Performance Gains :- (vs C vectorized) I422ToARGBRow_MSA : ~1.6x I422ToRGBARow_MSA : ~1.6x I422ToARGBRow_Any_MSA : ~1.58x I422ToRGBARow_Any_MSA : ~1.6x Performance Gains :- (vs C non-vectorized) I422ToARGBRow_MSA : ~7x I422ToRGBARow_MSA : ~7x I422ToARGBRow_Any_MSA : ~6.9x I422ToRGBARow_Any_MSA : ~6.8x Regarding performance measurement, We have created standalone tests which pass in row's data from a 1920x1080 filled buffer to both the C and MSA functions. And such N iterations are executed to get more accurate timings of C vs MSA. Review URL: https://codereview.chromium.org/2430313005 .
-
- 21 Oct, 2016 1 commit
-
-
Frank Barchard authored
void HalfFloat1Row_NEON(const uint16* src, uint16* dst, float, int width) { asm volatile ( "1: \n" MEMACCESS(0) "ld1 {v1.16b}, [%0], #16 \n" // load 8 shorts "subs %w2, %w2, #8 \n" // 8 pixels per loop "uxtl v2.4s, v1.4h \n" // 8 int's "uxtl2 v1.4s, v1.8h \n" "scvtf v2.4s, v2.4s \n" // 8 floats "scvtf v1.4s, v1.4s \n" "fcvtn v4.4h, v2.4s \n" // 8 floatsgit "fcvtn2 v4.8h, v1.4s \n" MEMACCESS(1) "st1 {v4.16b}, [%1], #16 \n" // store 8 shorts "b.gt 1b \n" : "+r"(src), // %0 "+r"(dst), // %1 "+r"(width) // %2 : : "cc", "memory", "v1", "v2", "v4" ); } void HalfFloatRow_NEON(const uint16* src, uint16* dst, float scale, int width) { asm volatile ( "1: \n" MEMACCESS(0) "ld1 {v1.16b}, [%0], #16 \n" // load 8 shorts "subs %w2, %w2, #8 \n" // 8 pixels per loop "uxtl v2.4s, v1.4h \n" // 8 int's "uxtl2 v1.4s, v1.8h \n" "scvtf v2.4s, v2.4s \n" // 8 floats "scvtf v1.4s, v1.4s \n" "fmul v2.4s, v2.4s, %3.s[0] \n" // adjust exponent "fmul v1.4s, v1.4s, %3.s[0] \n" "uqshrn v4.4h, v2.4s, #13 \n" // isolate halffloat "uqshrn2 v4.8h, v1.4s, #13 \n" MEMACCESS(1) "st1 {v4.16b}, [%1], #16 \n" // store 8 shorts "b.gt 1b \n" : "+r"(src), // %0 "+r"(dst), // %1 "+r"(width) // %2 : "w"(scale * 1.9259299444e-34f) // %3 : "cc", "memory", "v1", "v2", "v4" ); } TEST=LibYUVPlanarTest.TestHalfFloatPlane_One BUG=libyuv:560 R=hubbe@chromium.org Review URL: https://codereview.chromium.org/2430313008 .
-
- 20 Oct, 2016 2 commits
-
-
Frank Barchard authored
AVX unpack parameters were reverse ordered causing incorrect results on AVX2 hardware. TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=*Half* BUG=libyuv:560 R=wangcheng@google.com Review URL: https://codereview.chromium.org/2438893002 .
-
Frank Barchard authored
Halffloats have a limited range. It shouldnt normally come up, but if the scale value passed in produces a small value, the half floats will be denormals, which are slow and/or flust to zero. This test ensures they behave the same in C and SIMD and tests the performance of denormals. TEST=TestHalfFloatPlane_denormal BUG=libyuv:560 R=hubbe@chromium.org Review URL: https://codereview.chromium.org/2424233004 .
-
- 19 Oct, 2016 1 commit
-
-
Frank Barchard authored
R=fbarchard@google.com BUG=libyuv:634 Performance gains : (Auto-vectorized C vs MSA SIMD) ARGB4444ToYRow_MSA : ~3.0x ARGB4444ToUVRow_MSA : ~1.8x ARGB4444ToARGBRow_MSA : ~3.4x ARGB4444ToYRow_Any_MSA : ~2.8x ARGB4444ToUVRow_Any_MSA : ~1.7x ARGB4444ToARGBRow_Any_MSA : ~3.2x Review URL: https://codereview.chromium.org/2421843002 .
-
- 18 Oct, 2016 3 commits
-
-
Frank Barchard authored
remove old comment about initialize to zero. remove ifdef and replace with macro defined to zero. BUG=None TEST=try bots R=kjellander@chromium.org Review URL: https://codereview.chromium.org/2425623004 .
-
Henrik Kjellander authored
BUG=chromium:652188 TBR=fbarchard@chromium.org Review URL: https://codereview.chromium.org/2427643003 .
-
Henrik Kjellander authored
As they're being removed from the try server. BUG=chromium:652188 TBR=fbarchard@chromium.org Review URL: https://codereview.chromium.org/2426693003 .
-
- 17 Oct, 2016 3 commits
-
-
Henrik Kjellander authored
BUG=chromium:652188 TBR=ehmaldonado@chromium.org Review URL: https://codereview.chromium.org/2421343002 .
-
Henrik Kjellander authored
After switching bots from GYP to GN, build artifacts are left that fails the next builds. Since it's unfeasible to clean out all bot machines it's better to have an automated system for this, which is what landmines is. By adding a line to tools/get_landmines.py it is possible to clobber each bot that syncs past that "landmine CL". BUG=chromium:652188 TBR=ehmaldonado@chromium.org Review URL: https://codereview.chromium.org/2427633003 .
-
Henrik Kjellander authored
After switching the default bots from GYP to GN, we now only have a few GYP bots left, so rename the trybots accordingly BUG=chromium:652188 TBR=fbarchard@chromium.org Review URL: https://codereview.chromium.org/2425693002 .
-
- 15 Oct, 2016 1 commit
-
-
Frank Barchard authored
R=wangcheng@google.com, hubbe@chromium.org BUG=libyuv:560 Review URL: https://codereview.chromium.org/2421993002 .
-
- 14 Oct, 2016 2 commits
-
-
Frank Barchard authored
R=wangcheng@google.com, hubbe@chromium.org BUG=libyuv:560 Review URL: https://codereview.chromium.org/2418763006 .
-
Frank Barchard authored
BUG=libyuv:572 TEST=try bots R=wangcheng@google.com, magjed@chromium.org Review URL: https://codereview.chromium.org/2416783004 .
-
- 13 Oct, 2016 2 commits
-
-
Frank Barchard authored
Port SSE2 version to AVX2. BUG=libyuv:572 TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=*Extract* R=wangcheng@google.com, magjed@chromium.org Review URL: https://codereview.chromium.org/2420553002 .
-
Frank Barchard authored
This variable was introduced in https://codereview.chromium.org/2293853002 and causes builds to fail, since is not defined in WebRTC. BUG=webrtc:6281 TBR=kjellander@chromium.org Review URL: https://codereview.chromium.org/2418643003 .
-
- 12 Oct, 2016 2 commits
-
-
Frank Barchard authored
R=kjellander@chromium.org BUG=libyuv:649 Review URL: https://codereview.chromium.org/2414763002 .
-
Frank Barchard authored
TBR=kjellander@chromium.org BUG=libyuv:649 TEST=call gn gen out\Release "--args=is_debug=false is_clang=true" Review URL: https://codereview.chromium.org/2414783002 .
-
- 11 Oct, 2016 2 commits
-
-
Frank Barchard authored
YUV 411 is very uncommon format. Remove support. Update documentation to reflect that 411 is deprecated. Simplify tests for YUV to only test with the new side by side YUV but keep old 3 plane test around with a macro for now. BUG=libyuv:645 R=kjellander@chromium.org Review URL: https://codereview.chromium.org/2406123002 .
-
Frank Barchard authored
I420 output can be slow due to multi channel write. Putting the U and V into a single side by side buffer can improve performance. TBR=wangcheng@google.com BUG=None Review URL: https://codereview.chromium.org/2403223003 .
-
- 08 Oct, 2016 2 commits
-
-
Frank Barchard authored
TBR=wangcheng@google.com BUG=libyuv:647 TESTED=LibYUVConvertTest.YUY2ToI422_Opt Review URL: https://codereview.chromium.org/2393393006 .
-
Frank Barchard authored
This function is the first step of YUY2 To I420. Provided primarily for diagnostics. TBR=wangcheng@google.com BUG=libyuv:647 TESTED=LibYUVConvertTest.YUY2ToY_Opt Review URL: https://codereview.chromium.org/2399153004 .
-
- 07 Oct, 2016 1 commit
-
-
Frank Barchard authored
R=fbarchard@google.com BUG=libyuv:634 Performance gains as below, YUY2ToI422, YUY2ToI420 :- YUY2ToYRow_MSA : ~10x YUY2ToUVRow_MSA : ~11x YUY2ToUV422Row_MSA : ~9x YUY2ToYRow_Any_MSA : ~6x YUY2ToUVRow_Any_MSA : ~5x YUY2ToUV422Row_Any_MSA : ~4x UYVYToI422, UYVYToI420 :- UYVYToYRow_MSA : ~10x UYVYToUVRow_MSA : ~11x UYVYToUV422Row_MSA : ~9x UYVYToYRow_Any_MSA : ~6x UYVYToUVRow_Any_MSA : ~5x UYVYToUV422Row_Any_MSA : ~4x Review URL: https://codereview.chromium.org/2397693002 .
-
- 06 Oct, 2016 1 commit
-
-
Frank Barchard authored
YUY2ToI422_Any_Neon previously required 16 pixels and duplicated the last pixel. The replication was not necessary after a previous change to treat YUY2 to 4 byte macro pixels. TBR=harryjin@google.com BUG=libyuv:648 TESTED=util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose --release --gtest_filter=*YUY2ToI422* -a "--libyuv_width=17 --libyuv_height=7 --libyuv_repeat=999 --libyuv_flags=1" Review URL: https://codereview.chromium.org/2399143002 .
-
- 05 Oct, 2016 1 commit
-
-
Frank Barchard authored
This reduces the number of objects when not specifying a build target during compile. This is especially significant for Android where the number of objects decreases from 3322 to 1761. BUG=libyuv:644 R=fbarchard@google.com Review URL: https://codereview.chromium.org/2395743002 .
-
- 04 Oct, 2016 2 commits
-
-
Frank Barchard authored
Optimize max enables O2 for official builds. Normally release builds are O2 but the official build is Os, affecting performance. The GYP file was previously updated to enable optimize max, which enables ltcg and O2. Documentation updated to show GN builds in docs/getting_started.md BUG=libyuv:642 R=kjellander@chromium.org Review URL: https://codereview.chromium.org/2386093003 .
-
Frank Barchard authored
R=fbarchard@google.com BUG=libyuv:634 Performance gains :- I422ToYUY2Row_MSA - ~12x I422ToYUY2Row_Any_MSA - ~7x I422ToUYVYRow_MSA - ~12x I422ToUYVYRow_Any_MSA - ~7x Review URL: https://codereview.chromium.org/2378753004 .
-
- 03 Oct, 2016 1 commit
-
-
Frank Barchard authored
Low level support for 12 bit 420, 422 and 444 YUV video frame conversion. BUG=libyuv:560, chromium:445071 TEST=LibYUVPlanarTest.TestHalfFloatPlane on windows R=hubbe@chromium.org, wangcheng@google.com Review URL: https://codereview.chromium.org/2387713002 .
-
- 30 Sep, 2016 1 commit
-
-
Frank Barchard authored
Low level support for 12 bit 420, 422 and 444 YUV video frame conversion. BUG=libyuv:560, chromium:445071 TEST=untested R=hubbe@chromium.org Review URL: https://codereview.chromium.org/2381493006 .
-
- 29 Sep, 2016 1 commit
-
-
Frank Barchard authored
BUG=libyuv:560,chromium:445071 TEST=untested R=hubbe@chromium.org Review URL: https://codereview.chromium.org/2371293002 .
-