1. 21 Feb, 2017 1 commit
    • Add MSA optimized I444/I400/J400/YUY2/UYVY to ARGB row functions · eed66b20
      Manojkumar Bhosale authored
      BUG=libyuv:634
      
      Performance Gain (vs C auto-vectorized)
      I444ToARGBRow_MSA       - ~1.6x
      I444ToARGBRow_Any_MSA   - ~1.6x
      I400ToARGBRow_MSA       - ~5.5x
      I400ToARGBRow_Any_MSA   - ~5.3x
      J400ToARGBRow_MSA       - ~1.0x
      J400ToARGBRow_Any_MSA   - ~1.0x
      YUY2ToARGBRow_MSA       - ~1.6x
      YUY2ToARGBRow_Any_MSA   - ~1.6x
      UYVYToARGBRow_MSA       - ~1.6x
      UYVYToARGBRow_Any_MSA   - ~1.6x
      
      Performance Gain (vs C non-vectorized)
      I444ToARGBRow_MSA       - ~7.3x
      I444ToARGBRow_Any_MSA   - ~7.1x
      I400ToARGBRow_MSA       - ~5.5x
      I400ToARGBRow_Any_MSA   - ~5.2x
      J400ToARGBRow_MSA       - ~6.8x
      J400ToARGBRow_Any_MSA   - ~5.7x
      YUY2ToARGBRow_MSA       - ~7.2x
      YUY2ToARGBRow_Any_MSA   - ~7.0x
      UYVYToARGBRow_MSA       - ~7.1x
      UYVYToARGBRow_Any_MSA   - ~6.9x
      
      Change-Id: Ida80027c36a938a3bcf6f4480626f8eb9495e1be
      Reviewed-on: https://chromium-review.googlesource.com/439246
      Reviewed-by: Frank Barchard <fbarchard@google.com>
      Commit-Queue: Frank Barchard <fbarchard@google.com>
      eed66b20
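The `_Any_` variants listed above accept row widths that are not a multiple of the vector step. The following is a minimal sketch in plain C of how such a wrapper can be built around a fixed-step kernel, using a J400-style grey-to-ARGB copy as the payload. The names `GrayToARGB_Kernel`, `GrayToARGB_Any` and the 16-pixel step `kStep` are hypothetical and chosen for illustration; they are not taken from the libyuv sources, and the scalar loop only stands in for the MSA kernel.

```c
#include <stdint.h>
#include <string.h>

enum { kStep = 16 }; /* pixels per kernel call; an assumed value */

/* Stand-in for a SIMD kernel: width must be a multiple of kStep. */
static void GrayToARGB_Kernel(const uint8_t* src_y, uint8_t* dst_argb,
                              int width) {
  for (int x = 0; x < width; ++x) {
    dst_argb[4 * x + 0] = src_y[x]; /* B */
    dst_argb[4 * x + 1] = src_y[x]; /* G */
    dst_argb[4 * x + 2] = src_y[x]; /* R */
    dst_argb[4 * x + 3] = 255;      /* A */
  }
}

/* "Any" wrapper: whole blocks go straight to the kernel, the remaining
 * pixels are copied into padded temporaries and processed as one block. */
static void GrayToARGB_Any(const uint8_t* src_y, uint8_t* dst_argb,
                           int width) {
  uint8_t tmp_in[kStep];
  uint8_t tmp_out[kStep * 4];
  int rem = width % kStep;
  int full = width - rem;
  if (full > 0) {
    GrayToARGB_Kernel(src_y, dst_argb, full);
  }
  if (rem > 0) {
    memset(tmp_in, 0, sizeof(tmp_in));
    memcpy(tmp_in, src_y + full, (size_t)rem);
    GrayToARGB_Kernel(tmp_in, tmp_out, kStep);
    memcpy(dst_argb + 4 * full, tmp_out, (size_t)rem * 4);
  }
}
```

The extra copies for the remainder are one plausible reason the `_Any_` figures above sit slightly below the fixed-width ones.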
  2. 15 Feb, 2017 1 commit
  3. 14 Feb, 2017 4 commits
  4. 11 Feb, 2017 2 commits
  5. 07 Feb, 2017 1 commit
  6. 06 Feb, 2017 3 commits
  7. 04 Feb, 2017 1 commit
  8. 03 Feb, 2017 1 commit
    • Use DEPS for all dependencies + add PRESUBMIT.py · f49fde79
      Henrik Kjellander authored
      This changes libyuv to use the DEPS file for pulling
      down all dependencies (thus no Chromium checkout is needed any more).
      
      Add tools_libyuv directory to contain libyuv-specific tools
      (needed to avoid name collision with the now DEPSed tools/ directory
      of Chromium, which is needed by the toolchain).
      Add tools_libyuv/autoroller/roll_deps.py script to automatically
      roll all entries in the DEPS file (copied from WebRTC).
      
      third_party/ is now DEPSed as well, including the gtest configuration
      headers that used to live inside the libyuv repo.
      
      Add PRESUBMIT.py with a few simple checks + execution of PyLint and
      Python unit tests. For PyLint a pylintrc file was also added.
      
      Valgrind in tools_libyuv/valgrind was updated to make PRESUBMIT.py pass
      and remove old tsan suppressions (not used).
      
      Removed util/android/test_runner.py since it's no longer needed.
      
      Buildbot changes in https://chromium-review.googlesource.com/436464 
      are needed for the Memcheck bot to go green.
      
      BUG=libyuv:676
      NOTRY=True
      
      Change-Id: Ib86fea2905a1656bba2933703ce5a59d29d8db6b
      Reviewed-on: https://chromium-review.googlesource.com/436264
      Commit-Queue: Henrik Kjellander <kjellander@chromium.org>
      Reviewed-by: Frank Barchard <fbarchard@google.com>
      f49fde79
  9. 02 Feb, 2017 2 commits
  10. 01 Feb, 2017 2 commits
    • Revert "Roll chromium_revision 941118827f..316b880c55" · 74441e41
      Henrik Kjellander authored
      This reverts commit 03510421.
      Failures on Windows bots are consistent after landing this.
      
      TBR=fbarchard@google.com
      NOTRY=True
      
      Change-Id: Ie249aafde2204297aa2d86ecb1dec6e109685493
      Reviewed-on: https://chromium-review.googlesource.com/435261
      Commit-Queue: Henrik Kjellander <kjellander@chromium.org>
      Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
      74441e41
    • Add MSA optimized ARGB/ABGR/BGRA/RGBA To Y/UV row functions · 54ce8f23
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C auto-vectorized)
      ARGBToYJRow_MSA       - ~3.2x
      ARGBToYJRow_Any_MSA   - ~2.7x
      BGRAToYRow_MSA        - ~3.2x
      BGRAToYRow_Any_MSA    - ~2.7x
      ABGRToYRow_MSA        - ~3.2x
      ABGRToYRow_Any_MSA    - ~2.6x
      RGBAToYRow_MSA        - ~3.1x
      RGBAToYRow_Any_MSA    - ~2.7x
      ARGBToUVJRow_MSA      - ~5.5x
      ARGBToUVJRow_Any_MSA  - ~4.5x
      BGRAToUVRow_MSA       - ~2.1x
      BGRAToUVRow_Any_MSA   - ~2.0x
      ABGRToUVRow_MSA       - ~2.1x
      ABGRToUVRow_Any_MSA   - ~1.9x
      RGBAToUVRow_MSA       - ~2.2x
      RGBAToUVRow_Any_MSA   - ~1.9x
      
      Performance Gain (vs C non-vectorized)
      ARGBToYJRow_MSA       - ~10.9x
      ARGBToYJRow_Any_MSA   -  ~9.2x
      BGRAToYRow_MSA        - ~10.9x
      BGRAToYRow_Any_MSA    -  ~9.3x
      ABGRToYRow_MSA        - ~11.0x
      ABGRToYRow_Any_MSA    -  ~9.3x
      RGBAToYRow_MSA        - ~10.9x
      RGBAToYRow_Any_MSA    -  ~9.1x
      ARGBToUVJRow_MSA      - ~12.4x
      ARGBToUVJRow_Any_MSA  - ~10.5x
      BGRAToUVRow_MSA       -  ~4.7x
      BGRAToUVRow_Any_MSA   -  ~4.4x
      ABGRToUVRow_MSA       -  ~4.7x
      ABGRToUVRow_Any_MSA   -  ~4.5x
      RGBAToUVRow_MSA       -  ~4.8x
      RGBAToUVRow_Any_MSA   -  ~4.4x
      
      Review-Url: https://codereview.chromium.org/2641153003 .
      54ce8f23
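For orientation, the scalar work that an ARGB-to-Y row function replaces can be sketched as below. This is a minimal sketch of a full-range (JPEG-style) luma conversion in 7-bit fixed point; the coefficients 38, 75 and 15 are an assumption (0.299, 0.587 and 0.114 scaled by 128), and the function names are hypothetical rather than the library's own.

```c
#include <stdint.h>

/* Full-range luma in 7-bit fixed point. The weights sum to 128, so white
 * maps to 255 and black to 0. Coefficients are assumed, see lead-in. */
static uint8_t RGBToYJ_Sketch(uint8_t r, uint8_t g, uint8_t b) {
  return (uint8_t)((38 * r + 75 * g + 15 * b + 64) >> 7);
}

/* One row of ARGB pixels to one row of luma bytes.
 * ARGB is stored in memory as B, G, R, A. */
static void ARGBToYJRow_Sketch(const uint8_t* src_argb, uint8_t* dst_y,
                               int width) {
  for (int x = 0; x < width; ++x) {
    dst_y[x] = RGBToYJ_Sketch(src_argb[2], src_argb[1], src_argb[0]);
    src_argb += 4;
  }
}
```

The BGRA, ABGR and RGBA variants would differ only in which byte offsets are read for each channel.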
  11. 31 Jan, 2017 1 commit
  12. 30 Jan, 2017 1 commit
  13. 27 Jan, 2017 1 commit
  14. 26 Jan, 2017 2 commits
  15. 24 Jan, 2017 1 commit
  16. 20 Jan, 2017 3 commits
  17. 18 Jan, 2017 1 commit
    • Add MSA optimized NV12/21 To RGB row functions · 09b8c971
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C auto-vectorized)
      NV12ToARGBRow_MSA       - ~1.5x
      NV12ToARGBRow_Any_MSA   - ~1.4x
      NV12ToRGB565Row_MSA     - ~1.4x
      NV12ToRGB565Row_Any_MSA - ~1.4x
      NV21ToARGBRow_MSA       - ~1.5x
      NV21ToARGBRow_Any_MSA   - ~1.5x
      SobelRow_MSA            - ~4.3x
      SobelRow_Any_MSA        - ~3.4x
      SobelToPlaneRow_MSA     - ~8.0x
      SobelToPlaneRow_Any_MSA - ~4.7x
      SobelXYRow_MSA          - ~3.0x
      SobelXYRow_Any_MSA      - ~2.5x
      
      Performance Gain (vs C non-vectorized)
      NV12ToARGBRow_MSA       - ~6.5x
      NV12ToARGBRow_Any_MSA   - ~6.5x
      NV12ToRGB565Row_MSA     - ~6.2x
      NV12ToRGB565Row_Any_MSA - ~6.1x
      NV21ToARGBRow_MSA       - ~6.5x
      NV21ToARGBRow_Any_MSA   - ~6.5x
      SobelRow_MSA            - ~14.5x
      SobelRow_Any_MSA        - ~11.3x
      SobelToPlaneRow_MSA     - ~34.2x
      SobelToPlaneRow_Any_MSA - ~19.4x
      SobelXYRow_MSA          - ~11.1x
      SobelXYRow_Any_MSA      - ~9.1x
      
      Review-Url: https://codereview.chromium.org/2636483002 .
      09b8c971
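NV12 and NV21 share one layout and differ only in chroma byte order, which is why the two families of row functions above are so closely related. A minimal sketch of the sample addressing, assuming a hypothetical helper `FetchNV` and struct `YuvSample` that are not part of libyuv:

```c
#include <stdint.h>

typedef struct {
  uint8_t y, u, v;
} YuvSample;

/* Fetch Y, U and V for pixel x of one row.
 * src_uv points at the interleaved chroma row, where each byte pair
 * covers two horizontally adjacent luma samples.
 * v_first is 0 for NV12 (U then V) and 1 for NV21 (V then U). */
static YuvSample FetchNV(const uint8_t* src_y, const uint8_t* src_uv, int x,
                         int v_first) {
  YuvSample s;
  const uint8_t* pair = src_uv + 2 * (x / 2);
  s.y = src_y[x];
  s.u = pair[v_first ? 1 : 0];
  s.v = pair[v_first ? 0 : 1];
  return s;
}
```

A row converter would call this once per pixel and then apply the YUV-to-RGB matrix; a vector version loads and de-interleaves many pairs at once instead.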
  18. 13 Jan, 2017 3 commits
    • add Intel Code Analyst markers · a7c87e19
      Frank Barchard authored
      Add macros to enable/disable the Intel Architecture Code Analyzer
      (IACA) around blocks of code.
      
      Normally these macros should not be used, but if performance
      details are wanted for Intel code, enable them around the code
      and then run it through the iaca tool, available on the Intel website.
      
      BUG=libyuv:670
      TEST=~/iaca-lin64/bin/iaca.sh -64 out/Release/libyuv_unittest
      R=wangcheng@google.com
      
      Review-Url: https://codereview.chromium.org/2626193002 .
      a7c87e19
    • Add MSA optimized rotate functions (used 16x16 transpose) · 73a6f100
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C vectorized)
      TransposeWx16_MSA        - ~6.0x
      TransposeWx16_Any_MSA    - ~4.7x
      TransposeUVWx16_MSA      - ~6.3x
      TransposeUVWx16_Any_MSA  - ~5.4x
      
      Performance Gain (vs C non-vectorized)
      TransposeWx16_MSA        - ~6.0x
      TransposeWx16_Any_MSA    - ~4.8x
      TransposeUVWx16_MSA      - ~6.3x
      TransposeUVWx16_Any_MSA  - ~5.4x
      
      Review-Url: https://codereview.chromium.org/2617703002 .
      73a6f100
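The operation these rotate helpers accelerate is a plain byte transpose. A minimal scalar sketch, with the block height passed as a parameter instead of fixed at 16; the name `TransposeWxH_Sketch` and its signature are hypothetical:

```c
#include <stdint.h>

/* Transpose a width x height block of bytes: the byte at source row y,
 * column x is written to destination row x, column y. */
static void TransposeWxH_Sketch(const uint8_t* src, int src_stride,
                                uint8_t* dst, int dst_stride, int width,
                                int height) {
  for (int x = 0; x < width; ++x) {
    for (int y = 0; y < height; ++y) {
      dst[x * dst_stride + y] = src[y * src_stride + x];
    }
  }
}
```

A 16-row vector version can load sixteen source rows into registers and shuffle them so that each store writes a full 16-byte destination row, avoiding the strided single-byte accesses of the scalar loop.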
    • Add MSA optimized RAW/RGB/ARGB to ARGB/Y/UV row functions · 7c64163f
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C vectorized, values are speedup factors)
      ARGB1555ToARGBRow_MSA     - 1.85
      ARGB1555ToARGBRow_Any_MSA - 1.82
      RGB565ToARGBRow_MSA       - 2.14
      RGB565ToARGBRow_Any_MSA   - 2.08
      RGB24ToARGBRow_MSA        - 8.57
      RGB24ToARGBRow_Any_MSA    - 7.42
      RAWToARGBRow_MSA          - 8.57
      RAWToARGBRow_Any_MSA      - 7.42
      ARGB1555ToYRow_MSA        - 2.60
      ARGB1555ToYRow_Any_MSA    - 2.47
      RGB565ToYRow_MSA          - 2.45
      RGB565ToYRow_Any_MSA      - 2.33
      RGB24ToYRow_MSA           - 2.23
      RGB24ToYRow_Any_MSA       - 2.01
      RAWToYRow_MSA             - 2.25
      RAWToYRow_Any_MSA         - 2.02
      ARGB1555ToUVRow_MSA       - 1.40
      ARGB1555ToUVRow_Any_MSA   - 1.37
      RGB565ToUVRow_MSA         - 1.68
      RGB565ToUVRow_Any_MSA     - 1.63
      RGB24ToUVRow_MSA          - 3.02
      RGB24ToUVRow_Any_MSA      - 2.87
      RAWToUVRow_MSA            - 3.04
      RAWToUVRow_Any_MSA        - 2.85
      
      Performance Gain (vs C non-vectorized, values are speedup factors)
      ARGB1555ToARGBRow_MSA     - 4.66
      ARGB1555ToARGBRow_Any_MSA - 4.45
      RGB565ToARGBRow_MSA       - 5.58
      RGB565ToARGBRow_Any_MSA   - 5.34
      RGB24ToARGBRow_MSA        - 8.57
      RGB24ToARGBRow_Any_MSA    - 7.42
      RAWToARGBRow_MSA          - 8.57
      RAWToARGBRow_Any_MSA      - 7.42
      ARGB1555ToYRow_MSA        - 6.38
      ARGB1555ToYRow_Any_MSA    - 5.98
      RGB565ToYRow_MSA          - 6.42
      RGB565ToYRow_Any_MSA      - 6.05
      RGB24ToYRow_MSA           - 7.87
      RGB24ToYRow_Any_MSA       - 7.01
      RAWToYRow_MSA             - 7.98
      RAWToYRow_Any_MSA         - 7.01
      ARGB1555ToUVRow_MSA       - 5.39
      ARGB1555ToUVRow_Any_MSA   - 5.06
      RGB565ToUVRow_MSA         - 6.39
      RGB565ToUVRow_Any_MSA     - 5.90
      RGB24ToUVRow_MSA          - 3.04
      RGB24ToUVRow_Any_MSA      - 2.87
      RAWToUVRow_MSA            - 3.04
      RAWToUVRow_Any_MSA        - 2.88
      
      Review-Url: https://codereview.chromium.org/2600713002 .
      7c64163f
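As an illustration of the packed-format unpacking behind these row functions, here is a minimal scalar sketch of RGB565 to ARGB. It assumes little-endian 16-bit pixels and expands each field to 8 bits by replicating its top bits into the low bits; the function name is hypothetical.

```c
#include <stdint.h>

/* Unpack little-endian RGB565 (5 bits red, 6 green, 5 blue) into ARGB,
 * which is stored in memory as B, G, R, A. Alpha is set to opaque. */
static void RGB565ToARGBRow_Sketch(const uint8_t* src_rgb565,
                                   uint8_t* dst_argb, int width) {
  for (int x = 0; x < width; ++x) {
    unsigned p = (unsigned)src_rgb565[0] | ((unsigned)src_rgb565[1] << 8);
    unsigned b = p & 0x1f;
    unsigned g = (p >> 5) & 0x3f;
    unsigned r = (p >> 11) & 0x1f;
    dst_argb[0] = (uint8_t)((b << 3) | (b >> 2));
    dst_argb[1] = (uint8_t)((g << 2) | (g >> 4));
    dst_argb[2] = (uint8_t)((r << 3) | (r >> 2));
    dst_argb[3] = 255;
    src_rgb565 += 2;
    dst_argb += 4;
  }
}
```

Bit replication maps the maximum 5-bit or 6-bit value to exactly 255, so full white survives the round trip.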
  19. 11 Jan, 2017 2 commits
  20. 21 Dec, 2016 1 commit
    • Add MSA optimized remaining scale row functions · 288bfbef
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C vectorized)
      ScaleRowDown2_MSA            - ~22.3x
      ScaleRowDown2_Any_MSA        - ~19.9x
      ScaleRowDown2Linear_MSA      - ~31.2x
      ScaleRowDown2Linear_Any_MSA  - ~29.4x
      ScaleRowDown2Box_MSA         - ~20.1x
      ScaleRowDown2Box_Any_MSA     - ~19.6x
      ScaleRowDown4_MSA            - ~11.7x
      ScaleRowDown4_Any_MSA        - ~11.2x
      ScaleRowDown4Box_MSA         - ~15.1x
      ScaleRowDown4Box_Any_MSA     - ~15.1x
      ScaleRowDown38_MSA           - ~1x
      ScaleRowDown38_Any_MSA       - ~1x
      ScaleRowDown38_2_Box_MSA     - ~1.7x
      ScaleRowDown38_2_Box_Any_MSA - ~1.7x
      ScaleRowDown38_3_Box_MSA     - ~1.7x
      ScaleRowDown38_3_Box_Any_MSA - ~1.7x
      ScaleAddRow_MSA              - ~1.2x
      ScaleAddRow_Any_MSA          - ~1.15x
      
      Performance Gain (vs C non-vectorized)
      ScaleRowDown2_MSA            - ~22.4x
      ScaleRowDown2_Any_MSA        - ~19.8x
      ScaleRowDown2Linear_MSA      - ~31.6x
      ScaleRowDown2Linear_Any_MSA  - ~29.4x
      ScaleRowDown2Box_MSA         - ~20.1x
      ScaleRowDown2Box_Any_MSA     - ~19.6x
      ScaleRowDown4_MSA            - ~11.7x
      ScaleRowDown4_Any_MSA        - ~11.2x
      ScaleRowDown4Box_MSA         - ~15.1x
      ScaleRowDown4Box_Any_MSA     - ~15.1x
      ScaleRowDown38_MSA           - ~3.2x
      ScaleRowDown38_Any_MSA       - ~3.2x
      ScaleRowDown38_2_Box_MSA     - ~2.4x
      ScaleRowDown38_2_Box_Any_MSA - ~2.3x
      ScaleRowDown38_3_Box_MSA     - ~2.9x
      ScaleRowDown38_3_Box_Any_MSA - ~2.8x
      ScaleAddRow_MSA              - ~8x
      ScaleAddRow_Any_MSA          - ~7.46x
      
      Review-Url: https://codereview.chromium.org/2559683002 .
      288bfbef
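The difference between the Linear and Box half-size filters named above can be sketched in scalar C as follows. This is a minimal sketch under the assumption that Linear averages two horizontal neighbours and Box averages a 2x2 block, both with rounding; the `_Sketch` names and signatures are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Halve a row horizontally by averaging each pair of neighbours. */
static void ScaleRowDown2Linear_Sketch(const uint8_t* src, uint8_t* dst,
                                       int dst_width) {
  for (int x = 0; x < dst_width; ++x) {
    dst[x] = (uint8_t)((src[0] + src[1] + 1) >> 1);
    src += 2;
  }
}

/* Halve in both directions by averaging each 2x2 block. src_stride is the
 * byte distance from the first source row to the second. */
static void ScaleRowDown2Box_Sketch(const uint8_t* src, ptrdiff_t src_stride,
                                    uint8_t* dst, int dst_width) {
  const uint8_t* s = src;
  const uint8_t* t = src + src_stride;
  for (int x = 0; x < dst_width; ++x) {
    dst[x] = (uint8_t)((s[0] + s[1] + t[0] + t[1] + 2) >> 2);
    s += 2;
    t += 2;
  }
}
```

Both loops are adds and shifts over contiguous bytes, which maps well onto byte-wise vector instructions.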
  21. 19 Dec, 2016 1 commit
  22. 15 Dec, 2016 2 commits
    • Add MSA optimized ARGB Attenuate/RGB565/Shuffle/Shader/Gray/Sepia row functions · a899dea2
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C vectorized)
      ARGBAttenuateRow_MSA          - ~1.1x
      ARGBAttenuateRow_Any_MSA      - ~1.1x
      ARGBToRGB565DitherRow_MSA     - ~6.4x
      ARGBToRGB565DitherRow_Any_MSA - ~6.2x
      ARGBShuffleRow_MSA            - ~5.1x
      ARGBShuffleRow_Any_MSA        - ~1.9x
      ARGBShadeRow_MSA              - ~1.1x
      ARGBGrayRow_MSA               - ~2.6x
      ARGBSepiaRow_MSA              - ~11.6x
      
      Performance Gain (vs C non-vectorized)
      ARGBAttenuateRow_MSA          - ~2.46x
      ARGBAttenuateRow_Any_MSA      - ~2.45x
      ARGBToRGB565DitherRow_MSA     - ~9.4x
      ARGBToRGB565DitherRow_Any_MSA - ~12.5x
      ARGBShuffleRow_MSA            - ~5.2x
      ARGBShuffleRow_Any_MSA        - ~1.9x
      ARGBShadeRow_MSA              - ~4.3x
      ARGBGrayRow_MSA               - ~10.5x
      ARGBSepiaRow_MSA              - ~12.2x
      
      Review-Url: https://codereview.chromium.org/2559693002 .
      a899dea2
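Attenuation here means premultiplying each colour channel by the pixel's alpha. A minimal scalar sketch follows, using an exact rounded division by 255; this rounding is an assumption made for clarity, and the library's own fixed-point approximation may differ by one in some cases. The function name is hypothetical.

```c
#include <stdint.h>

/* Premultiply B, G and R by alpha for one row of ARGB pixels
 * (memory order B, G, R, A). Alpha itself is copied unchanged. */
static void ARGBAttenuateRow_Sketch(const uint8_t* src_argb,
                                    uint8_t* dst_argb, int width) {
  for (int x = 0; x < width; ++x) {
    unsigned a = src_argb[3];
    dst_argb[0] = (uint8_t)((src_argb[0] * a + 127) / 255);
    dst_argb[1] = (uint8_t)((src_argb[1] * a + 127) / 255);
    dst_argb[2] = (uint8_t)((src_argb[2] * a + 127) / 255);
    dst_argb[3] = (uint8_t)a;
    src_argb += 4;
    dst_argb += 4;
  }
}
```

With this rounding, an alpha of 255 leaves the colour untouched and an alpha of 0 clears it.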
    • Add MSA optimized TransposeWx8_MSA and TransposeUVWx8_MSA functions · 6fa5e4eb
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C vectorized)
      TransposeWx8_MSA          - ~2.7x
      TransposeWx8_Any_MSA      - ~2.1x
      TransposeUVWx8_MSA        - ~2.5x
      TransposeUVWx8_Any_MSA    - ~2.7x
      
      Performance Gain (vs C non-vectorized)
      TransposeWx8_MSA          - ~4.6x
      TransposeWx8_Any_MSA      - ~2.9x
      TransposeUVWx8_MSA        - ~4.4x
      TransposeUVWx8_Any_MSA    - ~3.7x
      
      Review URL: https://codereview.chromium.org/2553403002 .
      6fa5e4eb
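The UV variant differs from a plain transpose in that the source holds interleaved U,V pairs and the result lands in two separate planes. A minimal scalar sketch, with a hypothetical name and signature and the width counted in UV pairs:

```c
#include <stdint.h>

/* Transpose a block of interleaved UV pairs into two planes.
 * width is the number of UV pairs per source row. The pair at source
 * row y, column x goes to row x, column y of dst_u and of dst_v. */
static void TransposeUVWxH_Sketch(const uint8_t* src, int src_stride,
                                  uint8_t* dst_u, int dst_stride_u,
                                  uint8_t* dst_v, int dst_stride_v,
                                  int width, int height) {
  for (int x = 0; x < width; ++x) {
    for (int y = 0; y < height; ++y) {
      dst_u[x * dst_stride_u + y] = src[y * src_stride + 2 * x + 0];
      dst_v[x * dst_stride_v + y] = src[y * src_stride + 2 * x + 1];
    }
  }
}
```

Doing the split and the transpose in one pass saves a separate de-interleave step when rotating interleaved chroma.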
  23. 14 Dec, 2016 1 commit
  24. 07 Dec, 2016 2 commits
    • ConvertFromI420: use halfstride instead of halfwidth · dde8ba70
      Frank Barchard authored
      BUG=libyuv:660
      TEST=try bots
      R=kjellander@chromium.org
      
      Review URL: https://codereview.chromium.org/2554213003 .
      dde8ba70
    • Add MSA optimized ARGB scaling functions · 56b5bbb0
      Manojkumar Bhosale authored
      R=fbarchard@google.com
      BUG=libyuv:634
      
      Performance Gain (vs C vectorized)
      ScaleARGBRowDown2_MSA           - ~2.6x
      ScaleARGBRowDown2Linear_MSA     - ~7.9x
      ScaleARGBRowDown2Box_MSA        - ~3.7x
      ScaleARGBRowDownEven_MSA        - ~1.2x
      ScaleARGBRowDownEvenBox_MSA     - ~3.5x
      
      ScaleARGBRowDown2_Any_MSA       - ~2.6x
      ScaleARGBRowDown2Linear_Any_MSA - ~7.9x
      ScaleARGBRowDown2Box_Any_MSA    - ~3.6x
      ScaleARGBRowDownEven_Any_MSA    - ~1.2x
      ScaleARGBRowDownEvenBox_Any_MSA - ~3.5x
      
      Performance Gain (vs C non-vectorized)
      ScaleARGBRowDown2_MSA           - 2.6x
      ScaleARGBRowDown2Linear_MSA     - 13.5x
      ScaleARGBRowDown2Box_MSA        - 5.8x
      ScaleARGBRowDownEven_MSA        - 1.2x
      ScaleARGBRowDownEvenBox_MSA     - 3.7x
      
      ScaleARGBRowDown2_Any_MSA       - 2.6x
      ScaleARGBRowDown2Linear_Any_MSA - 13.5x
      ScaleARGBRowDown2Box_Any_MSA    - 5.3x
      ScaleARGBRowDownEven_Any_MSA    - 1.2x
      ScaleARGBRowDownEvenBox_Any_MSA - 3.7x
      
      Review URL: https://codereview.chromium.org/2527983002 .
      56b5bbb0
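Of the functions benchmarked above, the DownEven variant is the simplest to picture: it point-samples every N-th ARGB pixel. A minimal scalar sketch, assuming a hypothetical simplified signature (the real row function may take additional arguments such as a stride):

```c
#include <stdint.h>
#include <string.h>

/* Point-sample a row of ARGB pixels: copy every src_stepx-th 4-byte pixel
 * from the source into consecutive destination pixels. */
static void ScaleARGBRowDownEven_Sketch(const uint8_t* src_argb,
                                        int src_stepx, uint8_t* dst_argb,
                                        int dst_width) {
  for (int x = 0; x < dst_width; ++x) {
    memcpy(dst_argb + 4 * x, src_argb + 4 * x * src_stepx, 4);
  }
}
```

Because this is a strided copy with no arithmetic, there is little for a vector unit to speed up, which may explain the modest ~1.2x gain reported for it compared with the Linear and Box filters.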