1. 23 Feb, 2017 1 commit
    • Add MSA optimized Interpolate/MergeUV/Misc functions · 45b176d1
      Manojkumar Bhosale authored
      BUG=libyuv:634
      
      Change-Id: If8d60bd57f01fe95bc2fd26196466574195cc126
      
      Performance Gain (vs C auto-vectorized)
      InterpolateRow_MSA      - ~3.3x
      InterpolateRow_Any_MSA  - ~2.5x
      ARGBSetRow_MSA          - ~1.0x
      ARGBSetRow_Any_MSA      - ~1.0x
      ARGBToRGB24Row_MSA      - ~1.9x
      ARGBToRGB24Row_Any_MSA  - ~1.6x
      MergeUVRow_MSA          - ~1.6x
      MergeUVRow_Any_MSA      - ~1.2x
      
      Performance Gain (vs C non-vectorized)
      InterpolateRow_MSA      - ~11.3x
      InterpolateRow_Any_MSA  - ~ 7.9x
      ARGBSetRow_MSA          - ~ 6.2x
      ARGBSetRow_Any_MSA      - ~ 4.0x
      ARGBToRGB24Row_MSA      - ~ 9.9x
      ARGBToRGB24Row_Any_MSA  - ~ 8.4x
      MergeUVRow_MSA          - ~12.7x
      MergeUVRow_Any_MSA      - ~ 8.0x
      
      Reviewed-on: https://chromium-review.googlesource.com/445817
      Reviewed-by: Frank Barchard <fbarchard@google.com>
      Commit-Queue: Frank Barchard <fbarchard@google.com>
      45b176d1
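For readers unfamiliar with the two headline kernels in this commit, the following is a minimal scalar sketch in C of what a UV merge and a row interpolation compute. The names `merge_uv_row_ref` and `interpolate_row_ref` are hypothetical, and the `+128` rounding in the blend is an assumption; this is an illustration, not the MSA code from the commit.

```c
#include <stdint.h>

/* Interleave separate U and V rows into one UV row (U0 V0 U1 V1 ...),
 * the layout used by NV12-style chroma planes.
 * Hypothetical reference name, not a libyuv symbol. */
static void merge_uv_row_ref(const uint8_t* src_u, const uint8_t* src_v,
                             uint8_t* dst_uv, int width) {
  for (int x = 0; x < width; ++x) {
    dst_uv[2 * x + 0] = src_u[x];
    dst_uv[2 * x + 1] = src_v[x];
  }
}

/* Blend two rows with an 8-bit fraction in [0, 256).
 * fraction == 0 returns row0 unchanged; fraction == 128 is an even mix.
 * The +128 term rounds to nearest (assumed rounding mode). */
static void interpolate_row_ref(const uint8_t* row0, const uint8_t* row1,
                                uint8_t* dst, int width, int fraction) {
  const int f1 = fraction;
  const int f0 = 256 - fraction;
  for (int x = 0; x < width; ++x) {
    dst[x] = (uint8_t)((row0[x] * f0 + row1[x] * f1 + 128) >> 8);
  }
}
```

Both loops touch each byte once with no cross-iteration dependency, which is the shape a 128-bit SIMD unit can process many bytes at a time.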
  2. 22 Feb, 2017 1 commit
  3. 21 Feb, 2017 2 commits
  4. 15 Feb, 2017 1 commit
  5. 14 Feb, 2017 4 commits
  6. 11 Feb, 2017 2 commits
  7. 07 Feb, 2017 1 commit
  8. 06 Feb, 2017 3 commits
  9. 04 Feb, 2017 1 commit
  10. 03 Feb, 2017 1 commit
    • Use DEPS for all dependencies + add PRESUBMIT.py · f49fde79
      Henrik Kjellander authored
      This changes libyuv to use the DEPS file for pulling
      down all dependencies (thus no Chromium checkout is needed any more).
      
      Add tools_libyuv directory to contain libyuv-specific tools
      (needed to avoid name collision with the now DEPSed tools/ directory
      of Chromium, which is needed by the toolchain).
      Add tools_libyuv/autoroller/roll_deps.py script to automatically
      roll all entries in the DEPS file (copied from WebRTC).
      
      third_party/ is now DEPSed as well, including the gtest configuration
      headers that used to live inside the libyuv repo.
      
      Add PRESUBMIT.py with a few simple checks + execution of PyLint and
      Python unit tests. For PyLint a pylintrc file was also added.
      
      Valgrind in tools_libyuv/valgrind was updated to make PRESUBMIT.py pass
      and to remove old, unused tsan suppressions.
      
      Removed util/android/test_runner.py since it's no longer needed.
      
      Buildbot changes in https://chromium-review.googlesource.com/436464 
      are needed for the Memcheck bot to go green.
      
      BUG=libyuv:676
      NOTRY=True
      
      Change-Id: Ib86fea2905a1656bba2933703ce5a59d29d8db6b
      Reviewed-on: https://chromium-review.googlesource.com/436264
      Commit-Queue: Henrik Kjellander <kjellander@chromium.org>
      Reviewed-by: Frank Barchard <fbarchard@google.com>
      f49fde79
  11. 02 Feb, 2017 2 commits
  12. 01 Feb, 2017 2 commits
    • Revert "Roll chromium_revision 941118827f..316b880c55" · 74441e41
      Henrik Kjellander authored
      This reverts commit 03510421.
      Failures on Windows bots are consistent after landing this.
      
      TBR=fbarchard@google.com
      NOTRY=True
      
      Change-Id: Ie249aafde2204297aa2d86ecb1dec6e109685493
      Reviewed-on: https://chromium-review.googlesource.com/435261
      Commit-Queue: Henrik Kjellander <kjellander@chromium.org>
      Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
      74441e41
    • Add MSA optimized ARGB/ABGR/BGRA/RGBA To Y/UV row functions · 54ce8f23
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C auto-vectorized)
      ARGBToYJRow_MSA       - ~3.2x
      ARGBToYJRow_Any_MSA   - ~2.7x
      BGRAToYRow_MSA        - ~3.2x
      BGRAToYRow_Any_MSA    - ~2.7x
      ABGRToYRow_MSA        - ~3.2x
      ABGRToYRow_Any_MSA    - ~2.6x
      RGBAToYRow_MSA        - ~3.1x
      RGBAToYRow_Any_MSA    - ~2.7x
      ARGBToUVJRow_MSA      - ~5.5x
      ARGBToUVJRow_Any_MSA  - ~4.5x
      BGRAToUVRow_MSA       - ~2.1x
      BGRAToUVRow_Any_MSA   - ~2.0x
      ABGRToUVRow_MSA       - ~2.1x
      ABGRToUVRow_Any_MSA   - ~1.9x
      RGBAToUVRow_MSA       - ~2.2x
      RGBAToUVRow_Any_MSA   - ~1.9x
      
      Performance Gain (vs C non-vectorized)
      ARGBToYJRow_MSA       - ~10.9x
      ARGBToYJRow_Any_MSA   -  ~9.2x
      BGRAToYRow_MSA        - ~10.9x
      BGRAToYRow_Any_MSA    -  ~9.3x
      ABGRToYRow_MSA        - ~11.0x
      ABGRToYRow_Any_MSA    -  ~9.3x
      RGBAToYRow_MSA        - ~10.9x
      RGBAToYRow_Any_MSA    -  ~9.1x
      ARGBToUVJRow_MSA      - ~12.4x
      ARGBToUVJRow_Any_MSA  - ~10.5x
      BGRAToUVRow_MSA       -  ~4.7x
      BGRAToUVRow_Any_MSA   -  ~4.4x
      ABGRToUVRow_MSA       -  ~4.7x
      ABGRToUVRow_Any_MSA   -  ~4.5x
      RGBAToUVRow_MSA       -  ~4.8x
      RGBAToUVRow_Any_MSA   -  ~4.4x
      
      Review-Url: https://codereview.chromium.org/2641153003 .
      54ce8f23
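As a rough illustration of what the "To Y" row functions in this commit compute per pixel, here is a scalar sketch in C. The fixed-point coefficients are the common BT.601 limited-range approximation and the B, G, R, A byte order is assumed; whether either matches the exact code in this commit is an assumption, and the function names are hypothetical.

```c
#include <stdint.h>

/* BT.601 limited-range luma in 8-bit fixed point.
 * Coefficients (66, 129, 25) and the 0x1080 bias (16.5 * 256) are assumed. */
static uint8_t rgb_to_y_ref(uint8_t r, uint8_t g, uint8_t b) {
  return (uint8_t)((66 * r + 129 * g + 25 * b + 0x1080) >> 8);
}

/* Convert one row of 4-byte pixels to luma.
 * Assumes each pixel is stored as B, G, R, A in memory. */
static void argb_to_y_row_ref(const uint8_t* src_argb, uint8_t* dst_y,
                              int width) {
  for (int x = 0; x < width; ++x) {
    const uint8_t b = src_argb[4 * x + 0];
    const uint8_t g = src_argb[4 * x + 1];
    const uint8_t r = src_argb[4 * x + 2];
    dst_y[x] = rgb_to_y_ref(r, g, b);
  }
}
```

With these constants black maps to 16 and white to 235, the usual limited-range luma bounds. The ABGR, BGRA and RGBA variants would differ only in which byte offsets feed `r`, `g` and `b`.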
  13. 31 Jan, 2017 1 commit
  14. 30 Jan, 2017 1 commit
  15. 27 Jan, 2017 1 commit
  16. 26 Jan, 2017 2 commits
  17. 24 Jan, 2017 1 commit
  18. 20 Jan, 2017 3 commits
  19. 18 Jan, 2017 1 commit
    • Add MSA optimized NV12/21 To RGB row functions · 09b8c971
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C auto-vectorized)
      NV12ToARGBRow_MSA       - ~1.5x
      NV12ToARGBRow_Any_MSA   - ~1.4x
      NV12ToRGB565Row_MSA     - ~1.4x
      NV12ToRGB565Row_Any_MSA - ~1.4x
      NV21ToARGBRow_MSA       - ~1.5x
      NV21ToARGBRow_Any_MSA   - ~1.5x
      SobelRow_MSA            - ~4.3x
      SobelRow_Any_MSA        - ~3.4x
      SobelToPlaneRow_MSA     - ~8.0x
      SobelToPlaneRow_Any_MSA - ~4.7x
      SobelXYRow_MSA          - ~3.0x
      SobelXYRow_Any_MSA      - ~2.5x
      
      Performance Gain (vs C non-vectorized)
      NV12ToARGBRow_MSA       - ~6.5x
      NV12ToARGBRow_Any_MSA   - ~6.5x
      NV12ToRGB565Row_MSA     - ~6.2x
      NV12ToRGB565Row_Any_MSA - ~6.1x
      NV21ToARGBRow_MSA       - ~6.5x
      NV21ToARGBRow_Any_MSA   - ~6.5x
      SobelRow_MSA            - ~14.5x
      SobelRow_Any_MSA        - ~11.3x
      SobelToPlaneRow_MSA     - ~34.2x
      SobelToPlaneRow_Any_MSA - ~19.4x
      SobelXYRow_MSA          - ~11.1x
      SobelXYRow_Any_MSA      - ~9.1x
      
      Review-Url: https://codereview.chromium.org/2636483002 .
      09b8c971
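The `_Any_` variants in these tables are at or below their fixed-width counterparts. One plausible reason is the extra work needed for widths that are not a multiple of the SIMD block size. The sketch below, in C, shows one common way such a wrapper can be built: run the kernel on the aligned part, then push the remainder through a padded temporary buffer. All names here (`RowFn`, `copy_row_16`, `row_any`) and the 16-byte block size are assumptions for illustration, not libyuv's actual wrapper.

```c
#include <stdint.h>
#include <string.h>

typedef void (*RowFn)(const uint8_t* src, uint8_t* dst, int width);

/* Stand-in for a SIMD kernel that only accepts widths that are a
 * multiple of 16 bytes (hypothetical; a plain copy for simplicity). */
static void copy_row_16(const uint8_t* src, uint8_t* dst, int width) {
  for (int x = 0; x < width; x += 16) {
    memcpy(dst + x, src + x, 16);
  }
}

/* Wrapper that accepts any width. */
static void row_any(RowFn kernel, const uint8_t* src, uint8_t* dst,
                    int width) {
  uint8_t tmp_in[16] = {0};
  uint8_t tmp_out[16];
  const int n = width & ~15; /* largest multiple of 16 <= width */
  const int r = width & 15;  /* leftover bytes */
  if (n > 0) {
    kernel(src, dst, n);
  }
  if (r > 0) {
    /* Copy the tail into a full-size block so the kernel never reads
     * or writes past the caller's buffers. */
    memcpy(tmp_in, src + n, (size_t)r);
    kernel(tmp_in, tmp_out, 16);
    memcpy(dst + n, tmp_out, (size_t)r);
  }
}
```

The two extra `memcpy` calls and the additional kernel invocation on the tail are overhead the fixed-width entry point does not pay, which would be consistent with the slightly lower `_Any_` figures.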
  20. 13 Jan, 2017 3 commits
    • add Intel Code Analyst markers · a7c87e19
      Frank Barchard authored
      Add macros to enable/disable Intel Architecture Code Analyzer (IACA)
      markers around blocks of code.
      
      Normally these macros should not be used, but if performance
      details are wanted for Intel code, enable them around the code
      and then run the binary through the iaca tool, available on the Intel website.
      
      BUG=libyuv:670
      TEST=~/iaca-lin64/bin/iaca.sh -64 out/Release/libyuv_unittest
      R=wangcheng@google.com
      
      Review-Url: https://codereview.chromium.org/2626193002 .
      a7c87e19
    • Add MSA optimized rotate functions (used 16x16 transpose) · 73a6f100
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C vectorized)
      TransposeWx16_MSA        - ~6.0x
      TransposeWx16_Any_MSA    - ~4.7x
      TransposeUVWx16_MSA      - ~6.3x
      TransposeUVWx16_Any_MSA  - ~5.4x
      
      Performance Gain (vs C non-vectorized)
      TransposeWx16_MSA        - ~6.0x
      TransposeWx16_Any_MSA    - ~4.8x
      TransposeUVWx16_MSA      - ~6.3x
      TransposeUVWx16_Any_MSA  - ~5.4x
      
      Review-Url: https://codereview.chromium.org/2617703002 .
      73a6f100
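The transpose kernels named in this commit work on strips of 16 source rows. A scalar sketch in C of the underlying operation may help when reading the numbers: each source column becomes a destination row. `transpose_block_ref` is a hypothetical name, and the row count is left as a parameter so that the same sketch covers 8-row and 16-row strips.

```c
#include <stdint.h>

/* Transpose a strip of `rows` source rows, each `width` bytes wide.
 * Source column x becomes destination row x, which holds `rows` bytes.
 * Strides are in bytes. Hypothetical reference, not a libyuv symbol. */
static void transpose_block_ref(const uint8_t* src, int src_stride,
                                uint8_t* dst, int dst_stride,
                                int width, int rows) {
  for (int x = 0; x < width; ++x) {
    for (int y = 0; y < rows; ++y) {
      dst[x * dst_stride + y] = src[y * src_stride + x];
    }
  }
}
```

The inner loop reads with a stride of `src_stride`, which is awkward for a scalar loop; a SIMD version would typically load whole rows and shuffle them in registers instead.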
    • Add MSA optimized RAW/RGB/ARGB to ARGB/Y/UV row functions · 7c64163f
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C vectorized)
      ARGB1555ToARGBRow_MSA     - 1.85x
      ARGB1555ToARGBRow_Any_MSA - 1.82x
      RGB565ToARGBRow_MSA       - 2.14x
      RGB565ToARGBRow_Any_MSA   - 2.08x
      RGB24ToARGBRow_MSA        - 8.57x
      RGB24ToARGBRow_Any_MSA    - 7.42x
      RAWToARGBRow_MSA          - 8.57x
      RAWToARGBRow_Any_MSA      - 7.42x
      ARGB1555ToYRow_MSA        - 2.60x
      ARGB1555ToYRow_Any_MSA    - 2.47x
      RGB565ToYRow_MSA          - 2.45x
      RGB565ToYRow_Any_MSA      - 2.33x
      RGB24ToYRow_MSA           - 2.23x
      RGB24ToYRow_Any_MSA       - 2.01x
      RAWToYRow_MSA             - 2.25x
      RAWToYRow_Any_MSA         - 2.02x
      ARGB1555ToUVRow_MSA       - 1.40x
      ARGB1555ToUVRow_Any_MSA   - 1.37x
      RGB565ToUVRow_MSA         - 1.68x
      RGB565ToUVRow_Any_MSA     - 1.63x
      RGB24ToUVRow_MSA          - 3.02x
      RGB24ToUVRow_Any_MSA      - 2.87x
      RAWToUVRow_MSA            - 3.04x
      RAWToUVRow_Any_MSA        - 2.85x
      
      Performance Gain (vs C non-vectorized)
      ARGB1555ToARGBRow_MSA     - 4.66x
      ARGB1555ToARGBRow_Any_MSA - 4.45x
      RGB565ToARGBRow_MSA       - 5.58x
      RGB565ToARGBRow_Any_MSA   - 5.34x
      RGB24ToARGBRow_MSA        - 8.57x
      RGB24ToARGBRow_Any_MSA    - 7.42x
      RAWToARGBRow_MSA          - 8.57x
      RAWToARGBRow_Any_MSA      - 7.42x
      ARGB1555ToYRow_MSA        - 6.38x
      ARGB1555ToYRow_Any_MSA    - 5.98x
      RGB565ToYRow_MSA          - 6.42x
      RGB565ToYRow_Any_MSA      - 6.05x
      RGB24ToYRow_MSA           - 7.87x
      RGB24ToYRow_Any_MSA       - 7.01x
      RAWToYRow_MSA             - 7.98x
      RAWToYRow_Any_MSA         - 7.01x
      ARGB1555ToUVRow_MSA       - 5.39x
      ARGB1555ToUVRow_Any_MSA   - 5.06x
      RGB565ToUVRow_MSA         - 6.39x
      RGB565ToUVRow_Any_MSA     - 5.90x
      RGB24ToUVRow_MSA          - 3.04x
      RGB24ToUVRow_Any_MSA      - 2.87x
      RAWToUVRow_MSA            - 3.04x
      RAWToUVRow_Any_MSA        - 2.88x
      
      Review-Url: https://codereview.chromium.org/2600713002 .
      7c64163f
  21. 11 Jan, 2017 2 commits
  22. 21 Dec, 2016 1 commit
    • Add MSA optimized remaining scale row functions · 288bfbef
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C vectorized)
      ScaleRowDown2_MSA            - ~22.3x
      ScaleRowDown2_Any_MSA        - ~19.9x
      ScaleRowDown2Linear_MSA      - ~31.2x
      ScaleRowDown2Linear_Any_MSA  - ~29.4x
      ScaleRowDown2Box_MSA         - ~20.1x
      ScaleRowDown2Box_Any_MSA     - ~19.6x
      ScaleRowDown4_MSA            - ~11.7x
      ScaleRowDown4_Any_MSA        - ~11.2x
      ScaleRowDown4Box_MSA         - ~15.1x
      ScaleRowDown4Box_Any_MSA     - ~15.1x
      ScaleRowDown38_MSA           - ~1x
      ScaleRowDown38_Any_MSA       - ~1x
      ScaleRowDown38_2_Box_MSA     - ~1.7x
      ScaleRowDown38_2_Box_Any_MSA - ~1.7x
      ScaleRowDown38_3_Box_MSA     - ~1.7x
      ScaleRowDown38_3_Box_Any_MSA - ~1.7x
      ScaleAddRow_MSA              - ~1.2x
      ScaleAddRow_Any_MSA          - ~1.15x
      
      Performance Gain (vs C non-vectorized)
      ScaleRowDown2_MSA            - ~22.4x
      ScaleRowDown2_Any_MSA        - ~19.8x
      ScaleRowDown2Linear_MSA      - ~31.6x
      ScaleRowDown2Linear_Any_MSA  - ~29.4x
      ScaleRowDown2Box_MSA         - ~20.1x
      ScaleRowDown2Box_Any_MSA     - ~19.6x
      ScaleRowDown4_MSA            - ~11.7x
      ScaleRowDown4_Any_MSA        - ~11.2x
      ScaleRowDown4Box_MSA         - ~15.1x
      ScaleRowDown4Box_Any_MSA     - ~15.1x
      ScaleRowDown38_MSA           - ~3.2x
      ScaleRowDown38_Any_MSA       - ~3.2x
      ScaleRowDown38_2_Box_MSA     - ~2.4x
      ScaleRowDown38_2_Box_Any_MSA - ~2.3x
      ScaleRowDown38_3_Box_MSA     - ~2.9x
      ScaleRowDown38_3_Box_Any_MSA - ~2.8x
      ScaleAddRow_MSA              - ~8x
      ScaleAddRow_Any_MSA          - ~7.46x
      
      Review-Url: https://codereview.chromium.org/2559683002 .
      288bfbef
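To make the three "Down2" variants in this commit concrete, here is a scalar sketch in C of a 2x horizontal downscale by point sampling, by linear averaging of two pixels, and by box averaging of a 2x2 block. The function names are hypothetical, and both the choice of which pixel the point sampler keeps and the rounding terms are assumptions.

```c
#include <stdint.h>

/* Point sample: keep one pixel of every pair (here the second one;
 * which pixel is kept is a convention and is assumed). */
static void scale_row_down2_ref(const uint8_t* src, uint8_t* dst,
                                int dst_width) {
  for (int x = 0; x < dst_width; ++x) {
    dst[x] = src[2 * x + 1];
  }
}

/* Linear: rounded average of each horizontal pair. */
static void scale_row_down2_linear_ref(const uint8_t* src, uint8_t* dst,
                                       int dst_width) {
  for (int x = 0; x < dst_width; ++x) {
    dst[x] = (uint8_t)((src[2 * x] + src[2 * x + 1] + 1) >> 1);
  }
}

/* Box: rounded average of each 2x2 block taken from two adjacent rows. */
static void scale_row_down2_box_ref(const uint8_t* src, int src_stride,
                                    uint8_t* dst, int dst_width) {
  const uint8_t* s = src;
  const uint8_t* t = src + src_stride;
  for (int x = 0; x < dst_width; ++x) {
    dst[x] = (uint8_t)((s[2 * x] + s[2 * x + 1] +
                        t[2 * x] + t[2 * x + 1] + 2) >> 2);
  }
}
```

All three read contiguous bytes and write one output per pair, so they map naturally onto vector shuffles and pairwise adds.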
  23. 19 Dec, 2016 1 commit
  24. 15 Dec, 2016 2 commits
    • Add MSA optimized ARGB Attenuate/RGB565/Shuffle/Shader/Gray/Sepia row functions · a899dea2
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C vectorized)
      ARGBAttenuateRow_MSA          - ~1.1x
      ARGBAttenuateRow_Any_MSA      - ~1.1x
      ARGBToRGB565DitherRow_MSA     - ~6.4x
      ARGBToRGB565DitherRow_Any_MSA - ~6.2x
      ARGBShuffleRow_MSA            - ~5.1x
      ARGBShuffleRow_Any_MSA        - ~1.9x
      ARGBShadeRow_MSA              - ~1.1x
      ARGBGrayRow_MSA               - ~2.6x
      ARGBSepiaRow_MSA              - ~11.6x
      
      Performance Gain (vs C non-vectorized)
      ARGBAttenuateRow_MSA          - ~2.46x
      ARGBAttenuateRow_Any_MSA      - ~2.45x
      ARGBToRGB565DitherRow_MSA     - ~9.4x
      ARGBToRGB565DitherRow_Any_MSA - ~12.5x
      ARGBShuffleRow_MSA            - ~5.2x
      ARGBShuffleRow_Any_MSA        - ~1.9x
      ARGBShadeRow_MSA              - ~4.3x
      ARGBGrayRow_MSA               - ~10.5x
      ARGBSepiaRow_MSA              - ~12.2x
      
      Review-Url: https://codereview.chromium.org/2559693002 .
      a899dea2
    • Add MSA optimized TransposeWx8_MSA and TransposeUVWx8_MSA functions · 6fa5e4eb
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C vectorized)
      TransposeWx8_MSA          - ~2.7x
      TransposeWx8_Any_MSA      - ~2.1x
      TransposeUVWx8_MSA        - ~2.5x
      TransposeUVWx8_Any_MSA    - ~2.7x
      
      Performance Gain (vs C non-vectorized)
      TransposeWx8_MSA          - ~4.6x
      TransposeWx8_Any_MSA      - ~2.9x
      TransposeUVWx8_MSA        - ~4.4x
      TransposeUVWx8_Any_MSA    - ~3.7x
      
      Review URL: https://codereview.chromium.org/2553403002 .
      6fa5e4eb