1. 13 Jan, 2017 1 commit
    • add Intel Code Analyst markers · a7c87e19
      Frank Barchard authored
      Add macros to enable/disable Intel code analyzer (IACA) markers around blocks of code.
      
      Normally these macros should not be used, but if performance
      details are wanted for Intel code, enable them around the code
      and then run the iaca tool, available on the Intel website.
      
      BUG=libyuv:670
      TEST=~/iaca-lin64/bin/iaca.sh -64 out/Release/libyuv_unittest
      R=wangcheng@google.com
      
      Review-Url: https://codereview.chromium.org/2626193002 .
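The marker idea described in this commit can be sketched roughly as follows. This is not the code from the CL: the macro names `SKETCH_IACA_START` / `SKETCH_IACA_END`, the switch `SKETCH_ENABLE_IACA`, and the function `sketch_sum_bytes` are hypothetical. The byte sequence follows the convention used by Intel's `iacaMarks.h` as far as I recall it (load 111 or 222 into `ebx`, then the bytes `0x64 0x67 0x90`), so check it against the header shipped with the tool. The markers compile to nothing unless explicitly enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical switch: define SKETCH_ENABLE_IACA only for analysis builds. */
#if defined(SKETCH_ENABLE_IACA) && defined(__x86_64__)
/* Assumed marker layout: movl $id, %ebx followed by bytes 0x64 0x67 0x90. */
#define SKETCH_IACA_MARK(id)                        \
  __asm__ __volatile__("movl $" #id ", %%ebx\n\t"   \
                       ".byte 0x64, 0x67, 0x90\n\t" \
                       :                            \
                       :                            \
                       : "memory", "ebx")
#define SKETCH_IACA_START SKETCH_IACA_MARK(111)
#define SKETCH_IACA_END SKETCH_IACA_MARK(222)
#else
/* Default: markers are no-ops, so normal builds are unaffected. */
#define SKETCH_IACA_START ((void)0)
#define SKETCH_IACA_END ((void)0)
#endif

/* Example kernel: the loop between the markers is what the tool would analyze. */
uint32_t sketch_sum_bytes(const uint8_t* src, size_t n) {
  uint32_t sum = 0;
  assert(src != NULL || n == 0);
  SKETCH_IACA_START;
  for (size_t i = 0; i < n; ++i) {
    sum += src[i];
  }
  SKETCH_IACA_END;
  return sum;
}
```

A binary built with the switch enabled is meant for static analysis by the tool, as in the TEST line above, rather than for normal use.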
  2. 19 Dec, 2016 1 commit
  3. 14 Dec, 2016 1 commit
  4. 07 Dec, 2016 1 commit
  5. 22 Nov, 2016 1 commit
    • Add MSA optimized ARGBToRGB565Row_MSA, ARGBToARGB1555Row_MSA,… · da0c29da
      Frank Barchard authored
      Add MSA optimized ARGBToRGB565Row_MSA, ARGBToARGB1555Row_MSA, ARGBToARGB4444Row_MSA, ARGBToUV444Row_MSA functions
      
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C vectorized)
      ARGBToRGB565Row_MSA       - ~1.6x
      ARGBToRGB565Row_Any_MSA   - ~1.6x
      ARGBToARGB1555Row_MSA     - ~1.3x
      ARGBToARGB1555Row_Any_MSA - ~1.3x
      ARGBToARGB4444Row_MSA     - ~3.8x
      ARGBToARGB4444Row_Any_MSA - ~3.8x
      ARGBToUV444Row_MSA        - ~2.4x
      ARGBToUV444Row_Any_MSA    - ~2.4x
      
      Performance Gain (vs C non-vectorized)
      ARGBToRGB565Row_MSA       - ~2.8x
      ARGBToRGB565Row_Any_MSA   - ~2.8x
      ARGBToARGB1555Row_MSA     - ~2.2x
      ARGBToARGB1555Row_Any_MSA - ~2.2x
      ARGBToARGB4444Row_MSA     - ~6.8x
      ARGBToARGB4444Row_Any_MSA - ~6.6x
      ARGBToUV444Row_MSA        - ~6.7x
      ARGBToUV444Row_Any_MSA    - ~6.7x
      
      Review URL: https://codereview.chromium.org/2520003004 .
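For readers unfamiliar with what these row functions compute, a scalar sketch of the ARGB to RGB565 packing is given below. It is a plain C illustration, not the MSA code from this CL. The function name `sketch_argb_to_rgb565_row` is hypothetical, and the sketch assumes libyuv's ARGB byte order in memory (B, G, R, A) with little-endian 16-bit output.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar sketch: pack each 4-byte ARGB pixel (bytes B, G, R, A) into
 * 16-bit RGB565 by keeping the top 5/6/5 bits of B/G/R. */
void sketch_argb_to_rgb565_row(const uint8_t* src_argb,
                               uint8_t* dst_rgb565,
                               int width) {
  assert(width >= 0);
  for (int x = 0; x < width; ++x) {
    uint8_t b = src_argb[0] >> 3; /* 5 bits of blue */
    uint8_t g = src_argb[1] >> 2; /* 6 bits of green */
    uint8_t r = src_argb[2] >> 3; /* 5 bits of red */
    uint16_t pixel = (uint16_t)(b | (g << 5) | (r << 11));
    dst_rgb565[0] = (uint8_t)(pixel & 0xff); /* low byte first */
    dst_rgb565[1] = (uint8_t)(pixel >> 8);
    src_argb += 4;
    dst_rgb565 += 2;
  }
}
```

A SIMD version performs the same shifts and ORs on many pixels per instruction, which is where gains like those listed above would come from.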
  6. 07 Nov, 2016 1 commit
  7. 01 Nov, 2016 1 commit
  8. 26 Oct, 2016 1 commit
  9. 25 Oct, 2016 1 commit
  10. 21 Oct, 2016 1 commit
    • scale by 1 for neon implemented · 451af5e9
      Frank Barchard authored
      void HalfFloat1Row_NEON(const uint16* src, uint16* dst, float, int width) {
        asm volatile (
        "1:                                          \n"
          MEMACCESS(0)
          "ld1        {v1.16b}, [%0], #16            \n"  // load 8 shorts
          "subs       %w2, %w2, #8                   \n"  // 8 pixels per loop
          "uxtl       v2.4s, v1.4h                   \n"  // 8 ints
          "uxtl2      v1.4s, v1.8h                   \n"
          "scvtf      v2.4s, v2.4s                   \n"  // 8 floats
          "scvtf      v1.4s, v1.4s                   \n"
          "fcvtn      v4.4h, v2.4s                   \n"  // 8 half floats
          "fcvtn2     v4.8h, v1.4s                   \n"
          MEMACCESS(1)
          "st1        {v4.16b}, [%1], #16            \n"  // store 8 shorts
          "b.gt       1b                             \n"
        : "+r"(src),    // %0
          "+r"(dst),    // %1
          "+r"(width)   // %2
        :
        : "cc", "memory", "v1", "v2", "v4"
        );
      }
      
      void HalfFloatRow_NEON(const uint16* src, uint16* dst, float scale, int width) {
        asm volatile (
        "1:                                          \n"
          MEMACCESS(0)
          "ld1        {v1.16b}, [%0], #16            \n"  // load 8 shorts
          "subs       %w2, %w2, #8                   \n"  // 8 pixels per loop
          "uxtl       v2.4s, v1.4h                   \n"  // 8 ints
          "uxtl2      v1.4s, v1.8h                   \n"
          "scvtf      v2.4s, v2.4s                   \n"  // 8 floats
          "scvtf      v1.4s, v1.4s                   \n"
          "fmul       v2.4s, v2.4s, %3.s[0]          \n"  // adjust exponent
          "fmul       v1.4s, v1.4s, %3.s[0]          \n"
          "uqshrn     v4.4h, v2.4s, #13              \n"  // isolate halffloat
          "uqshrn2    v4.8h, v1.4s, #13              \n"
          MEMACCESS(1)
          "st1        {v4.16b}, [%1], #16            \n"  // store 8 shorts
          "b.gt       1b                             \n"
        : "+r"(src),    // %0
          "+r"(dst),    // %1
          "+r"(width)   // %2
        : "w"(scale * 1.9259299444e-34f)    // %3
        : "cc", "memory", "v1", "v2", "v4"
        );
      }
      
      TEST=LibYUVPlanarTest.TestHalfFloatPlane_One
      BUG=libyuv:560
      R=hubbe@chromium.org
      
      Review URL: https://codereview.chromium.org/2430313008 .
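The "adjust exponent" and "isolate halffloat" steps in `HalfFloatRow_NEON` above can be followed in scalar form. The sketch below is an illustration under stated assumptions, not code from this CL, and the name `sketch_half_float_row` is hypothetical. The constant `1.9259299444e-34f` is 2^-112, the difference between the float exponent bias (127) and the half-float bias (15). After the multiply, the half-float bit pattern sits in bits 13 and up of the float, so a right shift by 13 extracts it. Unlike `uqshrn`, this version truncates without saturating, so it assumes the scaled values fit in half-float range.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar sketch of the multiply-and-shift half-float conversion. */
void sketch_half_float_row(const uint16_t* src,
                           uint16_t* dst,
                           float scale,
                           int width) {
  /* 1.9259299444e-34f is 2^-112: rebias the float exponent to half-float. */
  const float mult = scale * 1.9259299444e-34f;
  assert(width >= 0);
  for (int i = 0; i < width; ++i) {
    float value = (float)src[i] * mult;
    uint32_t bits;
    memcpy(&bits, &value, sizeof(bits)); /* reinterpret the float as bits */
    dst[i] = (uint16_t)(bits >> 13);     /* drop 13 low mantissa bits */
  }
}
```

With `scale` equal to 1 the multiply only rebiases the exponent, which is why `HalfFloat1Row_NEON` can skip it and use the direct `fcvtn` conversion instead.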
  11. 20 Oct, 2016 2 commits
  12. 14 Oct, 2016 1 commit
  13. 13 Oct, 2016 1 commit
  14. 12 Oct, 2016 1 commit
  15. 11 Oct, 2016 2 commits
  16. 30 Sep, 2016 1 commit
  17. 28 Sep, 2016 1 commit
  18. 27 Sep, 2016 1 commit
  19. 16 Sep, 2016 2 commits
  20. 14 Sep, 2016 1 commit
  21. 30 Aug, 2016 1 commit
  22. 25 Aug, 2016 1 commit
  23. 24 Aug, 2016 2 commits
  24. 23 Aug, 2016 1 commit
  25. 22 Aug, 2016 1 commit
  26. 08 Aug, 2016 1 commit
  27. 14 Jul, 2016 1 commit
    • Remove DISABLE_X86 from build.gn · e74086bf
      Frank Barchard authored
      Fix for duplicate define
      ../../third_party/libyuv/include/libyuv/scale_row.h:29:9: error: 'LIBYUV_DISABLE_X86' macro redefined [-Werror,-Wmacro-redefined]
              ^
      
      GYP version relies on headers disabling the optimization.
      This CL does the same for BUILD.gn.
      TBR=kjellander@chromium.org
      BUG=libyuv:625
      
      Review URL: https://codereview.chromium.org/2149823003 .
  28. 13 Jul, 2016 2 commits
    • Attribute aligned 32 for YUV conversion structure on Intel · 1aa4ddd2
      Frank Barchard authored
      Fix for unaligned memory exception.
      
      R=braveyao@chromium.org
      BUG=libyuv:616
      
      Review URL: https://codereview.chromium.org/2152553002 .
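A minimal sketch of the kind of alignment attribute this commit refers to is shown below. The struct `SketchYuvConstants`, its fields, and the helper names are hypothetical, and the real structure in the library may be laid out differently. The point is only that aligned 32-byte SIMD loads fault on data that is not 32-byte aligned, so the constants need the attribute on the type or the variable.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER)
#define SKETCH_ALIGNED32 __declspec(align(32))
#else
#define SKETCH_ALIGNED32 __attribute__((aligned(32)))
#endif

/* Hypothetical constants block; every instance starts on a 32-byte boundary. */
struct SKETCH_ALIGNED32 SketchYuvConstants {
  int8_t uv_to_b[32];
  int8_t uv_to_g[32];
  int8_t uv_to_r[32];
};

static const struct SketchYuvConstants kSketchConstants = {{0}, {0}, {0}};

/* Returns nonzero when p is 32-byte aligned. */
int sketch_is_aligned32(const void* p) {
  return ((uintptr_t)p & 31u) == 0;
}

/* Accessor that checks the alignment assumption in debug builds. */
const struct SketchYuvConstants* sketch_get_constants(void) {
  assert(sketch_is_aligned32(&kSketchConstants));
  return &kSketchConstants;
}
```

Because each array is 32 bytes and the struct itself is aligned, every member also starts on a 32-byte boundary.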
    • Test nv21 layout of Android420ToI420 function. · abcb70f1
      Frank Barchard authored
      Android420ToI420 takes pointers to Y, U, V and a pixel stride for U and V.  The pixel stride is expected to be 1 or 2.
      
      [ RUN      ] LibYUVConvertTest.Android420ToI420_1_Any
      [       OK ] LibYUVConvertTest.Android420ToI420_1_Any (253 ms)
      [ RUN      ] LibYUVConvertTest.Android420ToI420_1_Unaligned
      [       OK ] LibYUVConvertTest.Android420ToI420_1_Unaligned (250 ms)
      [ RUN      ] LibYUVConvertTest.Android420ToI420_1_Invert
      [       OK ] LibYUVConvertTest.Android420ToI420_1_Invert (254 ms)
      [ RUN      ] LibYUVConvertTest.Android420ToI420_1_Opt
      [       OK ] LibYUVConvertTest.Android420ToI420_1_Opt (247 ms)
      [ RUN      ] LibYUVConvertTest.Android420ToI420_2_Any
      [       OK ] LibYUVConvertTest.Android420ToI420_2_Any (132 ms)
      [ RUN      ] LibYUVConvertTest.Android420ToI420_2_Unaligned
      [       OK ] LibYUVConvertTest.Android420ToI420_2_Unaligned (122 ms)
      [ RUN      ] LibYUVConvertTest.Android420ToI420_2_Invert
      [       OK ] LibYUVConvertTest.Android420ToI420_2_Invert (124 ms)
      [ RUN      ] LibYUVConvertTest.Android420ToI420_2_Opt
      [       OK ] LibYUVConvertTest.Android420ToI420_2_Opt (119 ms)
      
      TEST=LibYUVConvertTest.Android420ToI420_Opt
      BUG=libyuv:604
      R=braveyao@chromium.org
      
      Review URL: https://codereview.chromium.org/2146733002 .
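The role of the pixel stride can be sketched in scalar C. The helper `sketch_copy_chroma_plane` below is hypothetical and is not the library's implementation. It assumes a pixel stride of 1 means a planar chroma plane and a pixel stride of 2 means interleaved chroma such as NV21, where the V pointer is the start of the VU plane and the U pointer is one byte later.

```c
#include <assert.h>
#include <stdint.h>

/* Copy one chroma plane, reading every src_pixel_stride-th byte per row.
 * Pixel stride 1: planar source. Pixel stride 2: interleaved (NV12/NV21). */
void sketch_copy_chroma_plane(const uint8_t* src,
                              int src_stride,
                              int src_pixel_stride,
                              uint8_t* dst,
                              int dst_stride,
                              int width,
                              int height) {
  assert(src_pixel_stride == 1 || src_pixel_stride == 2);
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      dst[x] = src[x * src_pixel_stride];
    }
    src += src_stride;
    dst += dst_stride;
  }
}
```

For an NV21 buffer, calling this once with the VU plane pointer and once with that pointer plus one yields separate V and U planes, which is the layout the `_2_` tests above exercise.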
  29. 11 Jul, 2016 1 commit
  30. 08 Jul, 2016 1 commit
  31. 06 Jul, 2016 1 commit
  32. 28 Jun, 2016 1 commit
  33. 24 Jun, 2016 3 commits