1. 11 Sep, 2017 1 commit
  2. 11 Feb, 2017 1 commit
  3. 02 Feb, 2017 1 commit
  4. 20 Jan, 2017 1 commit
  5. 13 Jan, 2017 1 commit
    • add Intel Code Analyst markers · a7c87e19
      Frank Barchard authored
      add macros to enable/disable code analyst around blocks of code.
      
      Normally these macros should not be used, but if performance
      details are wanted for Intel code, enable them around the code
      and then run the binary through the iaca tool, available on the Intel website.
      
      BUG=libyuv:670
      TEST=~/iaca-lin64/bin/iaca.sh -64 out/Release/libyuv_unittest
      R=wangcheng@google.com
      
      Review-Url: https://codereview.chromium.org/2626193002 .
      a7c87e19
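A minimal sketch of how such enable/disable markers could wrap a loop. The macro names `ANALYST_START` / `ANALYST_END`, the switch `ENABLE_IACA_MARKERS` and the function `AddOneRow` are hypothetical and are not the names this commit adds. The marker bytes follow what is understood to be the iacaMarks.h convention (111 or 222 moved into ebx, then the bytes 0x64, 0x67, 0x90); check them against the header shipped with the tool before relying on them.

```cpp
#include <cstdint>

// Hypothetical switch: define ENABLE_IACA_MARKERS only for analysis builds on x86.
// By default both macros expand to nothing, so normal builds are unaffected.
#if defined(ENABLE_IACA_MARKERS) && defined(__GNUC__) && \
    (defined(__i386__) || defined(__x86_64__))
#define ANALYST_START \
  asm volatile("movl $111, %%ebx\n\t.byte 0x64, 0x67, 0x90" ::: "ebx", "memory")
#define ANALYST_END \
  asm volatile("movl $222, %%ebx\n\t.byte 0x64, 0x67, 0x90" ::: "ebx", "memory")
#else
#define ANALYST_START
#define ANALYST_END
#endif

// Hypothetical row function used only to show marker placement.
void AddOneRow(const uint8_t* src, uint8_t* dst, int width) {
  for (int i = 0; i < width; ++i) {
    ANALYST_START;  // top of the loop body to be analyzed
    dst[i] = static_cast<uint8_t>(src[i] + 1);
  }
  ANALYST_END;  // just after the loop
}
```

With the switch left undefined the function behaves like plain C++; the markers only matter when the resulting binary is handed to the analyzer.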
  6. 08 Nov, 2016 2 commits
  7. 25 Oct, 2016 1 commit
  8. 21 Oct, 2016 1 commit
    • scale by 1 for neon implemented · 451af5e9
      Frank Barchard authored
      void HalfFloat1Row_NEON(const uint16* src, uint16* dst, float, int width) {
        asm volatile (
        "1:                                          \n"
          MEMACCESS(0)
          "ld1        {v1.16b}, [%0], #16            \n"  // load 8 shorts
          "subs       %w2, %w2, #8                   \n"  // 8 pixels per loop
          "uxtl       v2.4s, v1.4h                   \n"  // 8 ints
          "uxtl2      v1.4s, v1.8h                   \n"
          "scvtf      v2.4s, v2.4s                   \n"  // 8 floats
          "scvtf      v1.4s, v1.4s                   \n"
          "fcvtn      v4.4h, v2.4s                   \n"  // 8 half floats
          "fcvtn2     v4.8h, v1.4s                   \n"
         MEMACCESS(1)
          "st1        {v4.16b}, [%1], #16            \n"  // store 8 shorts
          "b.gt       1b                             \n"
        : "+r"(src),    // %0
          "+r"(dst),    // %1
          "+r"(width)   // %2
        :
        : "cc", "memory", "v1", "v2", "v4"
        );
      }
      
      void HalfFloatRow_NEON(const uint16* src, uint16* dst, float scale, int width) {
        asm volatile (
        "1:                                          \n"
          MEMACCESS(0)
          "ld1        {v1.16b}, [%0], #16            \n"  // load 8 shorts
          "subs       %w2, %w2, #8                   \n"  // 8 pixels per loop
          "uxtl       v2.4s, v1.4h                   \n"  // 8 ints
          "uxtl2      v1.4s, v1.8h                   \n"
          "scvtf      v2.4s, v2.4s                   \n"  // 8 floats
          "scvtf      v1.4s, v1.4s                   \n"
          "fmul       v2.4s, v2.4s, %3.s[0]          \n"  // adjust exponent
          "fmul       v1.4s, v1.4s, %3.s[0]          \n"
          "uqshrn     v4.4h, v2.4s, #13              \n"  // isolate halffloat
          "uqshrn2    v4.8h, v1.4s, #13              \n"
         MEMACCESS(1)
          "st1        {v4.16b}, [%1], #16            \n"  // store 8 shorts
          "b.gt       1b                             \n"
        : "+r"(src),    // %0
          "+r"(dst),    // %1
          "+r"(width)   // %2
        : "w"(scale * 1.9259299444e-34f)    // %3
        : "cc", "memory", "v1", "v2", "v4"
        );
      }
      
      TEST=LibYUVPlanarTest.TestHalfFloatPlane_One
      BUG=libyuv:560
      R=hubbe@chromium.org
      
      Review URL: https://codereview.chromium.org/2430313008 .
      451af5e9
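The arithmetic in HalfFloatRow_NEON above can be sketched in scalar code. The constant 1.9259299444e-34f is 2^-112, the difference between the float exponent bias (127) and the half-float bias (15), so after the multiply the float's exponent field already holds the half-float exponent, and a right shift by 13 drops the mantissa bits a half float cannot hold. The function name `HalfFloatFromUint16` is hypothetical; the sketch truncates, and it does not reproduce the saturation that `uqshrn` provides.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical scalar helper, not part of libyuv.
// Assumes v * scale stays within the half-float range (no saturation here).
uint16_t HalfFloatFromUint16(uint16_t v, float scale) {
  // Multiply by scale * 2^-112 to rebias the exponent from float to half.
  float f = static_cast<float>(v) * (scale * 1.9259299444e-34f);
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));  // reinterpret the float as raw bits
  // Drop the low 13 mantissa bits: 23 float mantissa bits down to 10.
  return static_cast<uint16_t>(bits >> 13);
}
```

For example, with a scale of 1.0f the input 1 maps to 0x3C00, which is 1.0 in half-float encoding. The scale-by-1 variant in this commit skips the multiply and uses the hardware `fcvtn` conversion instead.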
  9. 20 Oct, 2016 1 commit
  10. 15 Oct, 2016 1 commit
  11. 13 Oct, 2016 1 commit
  12. 11 Oct, 2016 1 commit
    • Remove I411 support. · d363ea65
      Frank Barchard authored
      YUV 411 is a very uncommon format.  Remove support.
      
      Update documentation to reflect that 411 is deprecated.
      
      Simplify the YUV tests to test only the new side-by-side YUV, but keep the old 3-plane test available behind a macro for now.
      
      BUG=libyuv:645
      R=kjellander@chromium.org
      
      Review URL: https://codereview.chromium.org/2406123002 .
      d363ea65
  13. 03 Oct, 2016 1 commit
  14. 30 Sep, 2016 1 commit
  15. 29 Sep, 2016 1 commit
  16. 14 Jun, 2016 1 commit
    • android_full_debug x86 fix - use +rm for width count · fd3e676e
      Frank Barchard authored
      Workaround for the android full debug build running out of registers.
      5 functions were running out of registers, causing the compiler error
      error: 'asm' operand has impossible constraints
      These functions mostly have 4 pointers, a counter (width) and a temporary
      eax register.  With fpic and debug using stack frames, 2 registers are
      unavailable.  So a total of 8 registers are used.
      Although fpic and stack frames don't apply to assembly, the compiler
      reserves 2 registers.  The optimized version builds, so it's likely
      freeing up the registers once it knows they are not used.
      These functions used to build, so compile options and/or the compiler may
      have been updated; likely fpic was turned on.
      An attribute can be used to disable each, and will avoid using the
      2 GPR registers, but they are still reserved and unavailable in debug
      builds on current compilers (gcc 4.9 and clang 3.8).
      
      R=dhrosa@google.com
      BUG=libyuv:602
      
      Review URL: https://codereview.chromium.org/2066933002 .
      fd3e676e
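The `+rm` constraint named in the title lets the compiler keep the loop counter either in a register or in memory, which relieves register pressure when few registers are free. A minimal sketch under stated assumptions: `SumRow` is a hypothetical function, not one of the 5 functions this commit changes, and the inline asm is only used on x86 with GCC-compatible compilers, with a plain C++ fallback elsewhere.

```cpp
#include <cstdint>

// Hypothetical example, not a libyuv function: sums `width` bytes of a row.
int SumRow(const uint8_t* src, int width) {
  if (width <= 0) return 0;  // the asm loop below always runs at least once
  int sum = 0;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  asm volatile(
      "1:                        \n"
      "movzbl  (%0),%%eax        \n"  // load one byte into the temporary eax
      "lea     1(%0),%0          \n"  // advance the source pointer
      "addl    %%eax,%1          \n"  // accumulate
      "subl    $1,%2             \n"  // counter may be a register or a stack slot
      "jg      1b                \n"
      : "+r"(src),    // %0
        "+r"(sum),    // %1
        "+rm"(width)  // %2
      :
      : "memory", "cc", "eax");
#else
  for (int i = 0; i < width; ++i) sum += src[i];  // portable fallback
#endif
  return sum;
}
```

The explicit `l` suffix on `subl` matters here: when the compiler picks a memory operand for `width`, the instruction would otherwise carry no operand size.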
  17. 26 May, 2016 1 commit
  18. 18 Apr, 2016 1 commit
  19. 18 Feb, 2016 1 commit
  20. 12 Feb, 2016 1 commit
  21. 01 Feb, 2016 1 commit
    • ubsan overflow fix for multiply by 0x01010101 · 9e39c1f2
      Frank Barchard authored
      This is a UBSan error reported by libjingle:
      
      [ RUN      ] WebRtcVideoFrameTest.ConvertToYUY2BufferStride
      [000:000] (videoframe.cc:375): Validate frame passed. format: I420 bpp: 12 size: 1280x720 bytes: 1382400 expected: 1382400 sample[0..3]: 73, 73, 73, 73
      ../../chromium/src/third_party/libyuv/source/row_gcc.cc:2903:25: runtime error: signed integer overflow: 128 * 16843009 cannot be represented in type 'int'
      [8/614] WebRtcVideoFrameTest.ConvertToYUY2BufferStride returned/aborted with exit code 1 (32 ms)
      [9/614] WebRtcVideoFrameTest.ConvertToYUY2BufferInverted (29 ms)
      Note: Google Test filter = WebRtcVideoFrameTest.ConvertToYUY2BufferInverted
      
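      The source is uint8 and the multiply is by 0x01010101 to replicate the byte to 4 bytes.
      The uint8 is promoted to int, so the multiply is signed and overflows for values of 128 and above.
      Changing the constant to 0x01010101u makes the multiply unsigned, which should avoid the overflow.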
      
      R=harryjin@google.com
      TBR=harryjin@google.com
      BUG=libyuv:563
      
      Review URL: https://codereview.chromium.org/1657533005 .
      9e39c1f2
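The fix can be illustrated with a small sketch; the helper name `ReplicateByte` is hypothetical and does not appear in row_gcc.cc. With a signed constant, 128 * 16843009 exceeds INT_MAX, which is the overflow UBSan reports. With the `u` suffix the promoted operand is converted to unsigned and the product fits in 32 bits.

```cpp
#include <cstdint>

// Hypothetical helper: copies one byte into all 4 bytes of a 32-bit word.
uint32_t ReplicateByte(uint8_t v) {
  // v is promoted to int, then converted to unsigned by the 0x01010101u operand,
  // so the multiply is unsigned and cannot trigger signed overflow.
  return v * 0x01010101u;
}
```

For the value from the log above, `ReplicateByte(128)` yields 0x80808080.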
  22. 13 Jan, 2016 1 commit
  23. 12 Jan, 2016 1 commit
  24. 22 Dec, 2015 2 commits
  25. 21 Dec, 2015 1 commit
  26. 17 Dec, 2015 2 commits
  27. 09 Dec, 2015 2 commits
  28. 06 Dec, 2015 1 commit
  29. 04 Dec, 2015 1 commit
  30. 19 Nov, 2015 2 commits
  31. 18 Nov, 2015 2 commits
  32. 17 Nov, 2015 1 commit
  33. 14 Nov, 2015 1 commit
    • port I444ToARGB avx2 code from Visual C to GCC. · 1019e453
      Frank Barchard authored
      SSSE3
      Note: Google Test filter = *I444ToARGB*
      [==========] Running 8 tests from 1 test case.
      [----------] Global test environment set-up.
      [----------] 8 tests from LibYUVConvertTest
      [ RUN      ] LibYUVConvertTest.I444ToARGB_Any
      [       OK ] LibYUVConvertTest.I444ToARGB_Any (435 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_Unaligned
      [       OK ] LibYUVConvertTest.I444ToARGB_Unaligned (418 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_Invert
      [       OK ] LibYUVConvertTest.I444ToARGB_Invert (417 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_Opt
      [       OK ] LibYUVConvertTest.I444ToARGB_Opt (411 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Any
      [       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (419 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
      [       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (432 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
      [       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (435 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
      [       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (421 ms)
      [----------] 8 tests from LibYUVConvertTest (3389 ms total)
      
      AVX2
      Note: Google Test filter = *I444ToARGB*
      [==========] Running 8 tests from 1 test case.
      [----------] Global test environment set-up.
      [----------] 8 tests from LibYUVConvertTest
      [ RUN      ] LibYUVConvertTest.I444ToARGB_Any
      [       OK ] LibYUVConvertTest.I444ToARGB_Any (340 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_Unaligned
      [       OK ] LibYUVConvertTest.I444ToARGB_Unaligned (325 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_Invert
      [       OK ] LibYUVConvertTest.I444ToARGB_Invert (316 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_Opt
      [       OK ] LibYUVConvertTest.I444ToARGB_Opt (316 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Any
      [       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (315 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
      [       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (341 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
      [       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (331 ms)
      [ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
      [       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (329 ms)
      [----------] 8 tests from LibYUVConvertTest (2615 ms total)
      
      TBR=harryjin@google.com
      BUG=libyuv:492
      
      Review URL: https://codereview.chromium.org/1445893002 .
      1019e453
  34. 05 Nov, 2015 1 commit
    • YUV to RGB for x64 use registers instead of memory. · 431cb366
      Frank Barchard authored
      On Arm the YUV to RGB conversions move constants into registers.
      This change does the same for 64-bit Intel builds, where additional
      registers are available.
      The AVX2 version saves 3 instructions because the 2nd argument needs to be a register, so a vmovdqu is avoided.
      
      x64 builds using memory:
      AVX2  I420ToARGB_Opt (3059 ms)
      SSSE3 I420ToARGB_Opt (3959 ms)
      
      Now using registers
      AVX2  I420ToARGB_Opt (2906 ms)
      SSSE3 I420ToARGB_Opt (3928 ms)
      
      TBR=harryjin@google.com
      BUG=libyuv:520
      
      Review URL: https://codereview.chromium.org/1407353010 .
      431cb366