Commits · ecab5430c217d8a26f643a5884352c1deb235669 · submodule / libyuv

14 Dec, 2017 1 commit

Frank Barchard authored 7 years ago

Bug: libyuv:765
Test: build for mips still passes
Change-Id: I99105ad3951d2210c0793e3b9241c178442fdc37
Reviewed-on: https://chromium-review.googlesource.com/826404Reviewed-by: Weiyong Yao <braveyao@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>

3b81288e

18 Aug, 2017 1 commit

Add MSA optimized ScaleFilterCols, ScaleARGBCols, ScaleARGBFilterCols and ScaleRowDown34 functions · 8cd3e4f3

Frank Barchard authored 7 years ago

TBR=kjellander@chromium.org
R=fbarchard@google.com

Bug:libyuv:634
Change-Id: Ib139b9701fc67e24d27a6886377c0cb8b2773fda
Reviewed-on: https://chromium-review.googlesource.com/620791Reviewed-by: Frank Barchard <fbarchard@google.com>

8cd3e4f3

09 Jun, 2017 1 commit

Remove ARM NaCL macros from source · 6c94ad13

Frank Barchard authored 7 years ago

NaCL has been disabled for awhile, so the code
will still build, but only with C versions.
This change removes the MEMACCESS() macros from
Neon and Neon64 source.

BUG=libyuv:702
TEST=try bots build for arm.
R=kjellander@chromium.org

Change-Id: Id581a5c8ff71e18cc69595e7fee9337f97c44a19
Reviewed-on: https://chromium-review.googlesource.com/528332Reviewed-by: Cheng Wang <wangcheng@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>

6c94ad13

06 Jun, 2017 1 commit

ScaleDown odd functions adjust math so last pixel is half width source. · 44abf701

Frank Barchard authored 7 years ago

existing test passes
out/Release/libyuv_unittest --gtest_filter=*Blend* --libyuv_width=33 --libyuv_height=16

new test added
BUG=libyuv:705
TEST=LibYUVScaleTest.TestScaleOdd

Change-Id: Ica91812aee2e4ed9bcc18df4962b089c2e4ae704
Reviewed-on: https://chromium-review.googlesource.com/524932Reviewed-by: Cheng Wang <wangcheng@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>

44abf701

11 Jan, 2017 1 commit

Libyuv MIPS DSPR2 optimizations. · 000d2fa9

Frank Barchard authored 8 years ago

Optimized functions:

I444ToARGBRow_DSPR2
I422ToARGB4444Row_DSPR2
I422ToARGB1555Row_DSPR2
NV12ToARGBRow_DSPR2
BGRAToUVRow_DSPR2
BGRAToYRow_DSPR2
ABGRToUVRow_DSPR2
ARGBToYRow_DSPR2
ABGRToYRow_DSPR2
RGBAToUVRow_DSPR2
RGBAToYRow_DSPR2
ARGBToUVRow_DSPR2
RGB24ToARGBRow_DSPR2
RAWToARGBRow_DSPR2
RGB565ToARGBRow_DSPR2
ARGB1555ToARGBRow_DSPR2
ARGB4444ToARGBRow_DSPR2
ScaleAddRow_DSPR2

Bug-fixes in functions:

ScaleRowDown2_DSPR2
ScaleRowDown4_DSPR2

BUG=

Review-Url: https://codereview.chromium.org/2626123003 .

000d2fa9

21 Dec, 2016 1 commit

Add MSA optimized remaining scale row functions · 288bfbef

Manojkumar Bhosale authored 8 years ago

R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ScaleRowDown2_MSA            - ~22.3x
ScaleRowDown2_Any_MSA        - ~19.9x
ScaleRowDown2Linear_MSA      - ~31.2x
ScaleRowDown2Linear_Any_MSA  - ~29.4x
ScaleRowDown2Box_MSA         - ~20.1x
ScaleRowDown2Box_Any_MSA     - ~19.6x
ScaleRowDown4_MSA            - ~11.7x
ScaleRowDown4_Any_MSA        - ~11.2x
ScaleRowDown4Box_MSA         - ~15.1x
ScaleRowDown4Box_Any_MSA     - ~15.1x
ScaleRowDown38_MSA           - ~1x
ScaleRowDown38_Any_MSA       - ~1x
ScaleRowDown38_2_Box_MSA     - ~1.7x
ScaleRowDown38_2_Box_Any_MSA - ~1.7x
ScaleRowDown38_3_Box_MSA     - ~1.7x
ScaleRowDown38_3_Box_Any_MSA - ~1.7x
ScaleAddRow_MSA              - ~1.2x
ScaleAddRow_Any_MSA          - ~1.15x

Performance Gain (vs C non-vectorized)
ScaleRowDown2_MSA            - ~22.4x
ScaleRowDown2_Any_MSA        - ~19.8x
ScaleRowDown2Linear_MSA      - ~31.6x
ScaleRowDown2Linear_Any_MSA  - ~29.4x
ScaleRowDown2Box_MSA         - ~20.1x
ScaleRowDown2Box_Any_MSA     - ~19.6x
ScaleRowDown4_MSA            - ~11.7x
ScaleRowDown4_Any_MSA        - ~11.2x
ScaleRowDown4Box_MSA         - ~15.1x
ScaleRowDown4Box_Any_MSA     - ~15.1x
ScaleRowDown38_MSA           - ~3.2x
ScaleRowDown38_Any_MSA       - ~3.2x
ScaleRowDown38_2_Box_MSA     - ~2.4x
ScaleRowDown38_2_Box_Any_MSA - ~2.3x
ScaleRowDown38_3_Box_MSA     - ~2.9x
ScaleRowDown38_3_Box_Any_MSA - ~2.8x
ScaleAddRow_MSA              - ~8x
ScaleAddRow_Any_MSA          - ~7.46x

Review-Url: https://codereview.chromium.org/2559683002 .

288bfbef

07 Dec, 2016 1 commit

Add MSA optimized ARGB scaling functions · 56b5bbb0

Manojkumar Bhosale authored 8 years ago

R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
ScaleARGBRowDown2_MSA           - ~2.6x
ScaleARGBRowDown2Linear_MSA     - ~7.9x
ScaleARGBRowDown2Box_MSA        - ~3.7x
ScaleARGBRowDownEven_MSA        - ~1.2x
ScaleARGBRowDownEvenBox_MSA     - ~3.5x

ScaleARGBRowDown2_Any_MSA       - ~2.6x
ScaleARGBRowDown2Linear_Any_MSA - ~7.9x
ScaleARGBRowDown2Box_Any_MSA    - ~3.6x
ScaleARGBRowDownEven_Any_MSA    - ~1.2x
ScaleARGBRowDownEvenBox_Any_MSA - ~3.5x

Performance Gain (vs C non-vectorized)
ScaleARGBRowDown2_MSA           - 2.6x
ScaleARGBRowDown2Linear_MSA     - 13.5x
ScaleARGBRowDown2Box_MSA        - 5.8x
ScaleARGBRowDownEven_MSA        - 1.2x
ScaleARGBRowDownEvenBox_MSA     - 3.7x

ScaleARGBRowDown2_Any_MSA       - 2.6x
ScaleARGBRowDown2Linear_Any_MSA - 13.5x
ScaleARGBRowDown2Box_Any_MSA    - 5.3x
ScaleARGBRowDownEven_Any_MSA    - 1.2x
ScaleARGBRowDownEvenBox_Any_MSA - 3.7x

Review URL: https://codereview.chromium.org/2527983002 .

56b5bbb0

08 Nov, 2016 1 commit

clang-format libyuv · e62309f2

Frank Barchard authored 8 years ago

BUG=libyuv:654
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2469353005 .

e62309f2

06 Jan, 2016 1 commit

Odd width variation of scale down by 2 for subsampling · fc52d8de

Frank Barchard authored 9 years ago

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:538

Review URL: https://codereview.chromium.org/1558093003 .

fc52d8de

16 Dec, 2015 1 commit

change scale down by 4 to use rounding. · 80ca4514

Frank Barchard authored 9 years ago

TBR=harryjin@google.com
BUG=libyuv:447

Review URL: https://codereview.chromium.org/1525033005 .

80ca4514

15 Dec, 2015 1 commit

use rounding in scaledown by 2 · ae55e418

Frank Barchard authored 9 years ago

When scaling down by 2 the formula should round consistently.
(a+b+c+d+2)/4
The C version did but the SSE2 version was doing 2 averages.
avg(avg(a,b),avg(c,d))
This change uses a sum, then rounds.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:447,libyuv:527

Review URL: https://codereview.chromium.org/1513183004 .

ae55e418

09 Jun, 2015 1 commit

Box filter for YUV use rows with accumulation buffer for better memory behavior.… · 05416e2d

fbarchard@google.com authored 9 years ago

Box filter for YUV use rows with accumulation buffer for better memory behavior. The old code would do columns accumulated into registers, and then store the result once. This was slow from a memory point of view. The new code does a row of source at a time, updating an accumulation buffer every row. The accumulation buffer is small, and should fit cache. Before each accumulation of N rows, the buffer needs to be reset to zero. If the memset is a bottleneck, it would be faster to do the first row without an add, storing to the accumulation buffer, and then add for the remaining rows.
BUG=425
TESTED=out\release\libyuv_unittest --gtest_filter=*ScaleTo1x1*
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/52659004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1428 16f28f9a-4ce2-e073-06de-1de4eb20be90

05416e2d

26 May, 2015 1 commit

odd width support for scale by even scale factor and box scale down by 4. scale… · 7c09264f

fbarchard@google.com authored 9 years ago

odd width support for scale by even scale factor and box scale down by 4. scale down by 4 uses scale down by 2 internally.
BUG=431
TESTED=libyuvTest.ARGBScaleDownBy4_Bilinear

Review URL: https://webrtc-codereview.appspot.com/57399004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1412 16f28f9a-4ce2-e073-06de-1de4eb20be90

7c09264f

22 May, 2015 1 commit

scale down by 2 on argb images support odd widths using _any function. · c38aeec3

fbarchard@google.com authored 9 years ago

BUG=431
TESTED=libyuvTest.ARGBScaleDownBy2_Bilinear

Review URL: https://webrtc-codereview.appspot.com/52569004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1410 16f28f9a-4ce2-e073-06de-1de4eb20be90

c38aeec3

30 Apr, 2015 2 commits

scale to 3/4 bug fix for odd widths. multiply to index into source by scale… · 31806d76

fbarchard@google.com authored 9 years ago

scale to 3/4 bug fix for odd widths. multiply to index into source by scale factor should be 4 / 3 not 3 / 4.
BUG=433
TESTED=set LIBYUV_WIDTH=1276 out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*.Scale*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/49219004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1391 16f28f9a-4ce2-e073-06de-1de4eb20be90

31806d76

AVX2 port of ScaleDownBy4. · 9f4636e2

fbarchard@google.com authored 9 years ago

BUG=314
TESTED=out\release\libyuv_unittest --gtest_filter=*.ScaleDownBy4*

Review URL: https://webrtc-codereview.appspot.com/46159004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1390 16f28f9a-4ce2-e073-06de-1de4eb20be90

9f4636e2

27 Apr, 2015 1 commit

scale to 3/4 or 3/8 with odd width destinations efficiently. previously if… · 4e78b8dc

fbarchard@google.com authored 9 years ago

scale to 3/4 or 3/8 with odd width destinations efficiently. previously if width was not multiple of what the simd loop would do (24), scaling would fall back on slower C code. This change allows SIMD to be used for most of the scaling and C for the remainder, improving efficiency.
BUG=314
TESTED=set LIBYUV_WIDTH=1896 & ScaleDownBy3by4_*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/48249004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1380 16f28f9a-4ce2-e073-06de-1de4eb20be90

4e78b8dc

24 Apr, 2015 1 commit

Allow ScaleRowDown any functions to accept non-power of 2 for destination SIMD multiple. · 1ffb04b4

fbarchard@google.com authored 9 years ago

BUG=none
TESTED=local unittests pass
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/45129004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1379 16f28f9a-4ce2-e073-06de-1de4eb20be90

1ffb04b4

22 Apr, 2015 1 commit

ScaleAddRows_Any_SSE2 functions for handling odd widths. · 2b7f6b7d

fbarchard@google.com authored 9 years ago

BUG=425
TESTED=out\release\libyuv_unittest_old --gtest_filter=*.ScaleDownBy3_*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/45219004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1377 16f28f9a-4ce2-e073-06de-1de4eb20be90

2b7f6b7d

07 Apr, 2015 1 commit

Add ScaleARGBFilterCols_NEON for ARM32/64 · 5f609856

yang.zhang@arm.com authored 9 years ago

ARM32/64 NEON versions of ScaleARGBFilterCols_NEON are implemented.

BUG=319
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com

Change-Id: Ifea62bc25d846bf16cb51d13b408de7bf58dccd4

Review URL: https://webrtc-codereview.appspot.com/46699004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1361 16f28f9a-4ce2-e073-06de-1de4eb20be90

5f609856

03 Apr, 2015 1 commit

Scale down by 4 for odd number of destination pixels using 'any' that handles… · 44b6ba91

fbarchard@google.com authored 9 years ago

Scale down by 4 for odd number of destination pixels using 'any' that handles SIMD for multiple of 8 pixels, and C for the remainder.
BUG=314
TESTED=local test with width odd

Review URL: https://webrtc-codereview.appspot.com/49599004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1355 16f28f9a-4ce2-e073-06de-1de4eb20be90

44b6ba91

31 Mar, 2015 1 commit

Add ScaleARGBCols_NEON for ARM32/64 · f23d6222

yang.zhang@arm.com authored 9 years ago

ARM32/64 NEON versions of ScaleARGBCols_NEON are implemented.

BUG=319
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com

Change-Id: Id9ad97f7aa5d8a34cd55ace9e648cb6ff028efd9

Review URL: https://webrtc-codereview.appspot.com/47689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1351 16f28f9a-4ce2-e073-06de-1de4eb20be90

f23d6222

30 Mar, 2015 1 commit

linear and point sample scale to half size for AVX2. · 72673ac8

fbarchard@google.com authored 9 years ago

BUG=314
TESTED=out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*.ScaleDownBy2*
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/44959004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1349 16f28f9a-4ce2-e073-06de-1de4eb20be90

72673ac8

26 Mar, 2015 1 commit

Scale down by 2 AVX2 port. Processes twice as many pixels as SSE2 and takes… · e6ca9cc2

fbarchard@google.com authored 9 years ago

Scale down by 2 AVX2 port. Processes twice as many pixels as SSE2 and takes advantage of 3 argument instructions to reduce register usage and number of instructions.
BUG=314
TESTED=libyuvTest.ScaleDownBy2_Box
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/42959004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1347 16f28f9a-4ce2-e073-06de-1de4eb20be90

e6ca9cc2

24 Mar, 2015 1 commit

Handle scale down by factor of 2 efficiently by calling SIMD for multiple of 16… · d41fbf40

fbarchard@google.com authored 9 years ago

Handle scale down by factor of 2 efficiently by calling SIMD for multiple of 16 destination pixels, and C for remainder.
BUG=314
TESTED=out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*.ScaleDownBy2*
R=bcornell@google.com

Review URL: https://webrtc-codereview.appspot.com/48689004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1344 16f28f9a-4ce2-e073-06de-1de4eb20be90

d41fbf40

19 Mar, 2015 1 commit

Add ScaleFilterCols_NEON for ARM32/64 · d6d7de57

yang.zhang@arm.com authored 9 years ago

ARM32/64 NEON versions of ScaleFilterCols_NEON are implemented.

BUG=319
TESTED=libyuvTest.* on ARM32/64 with Android
R=fbarchard@google.com

Change-Id: I5b0838769ffb0182155d7cd6bcc520eb81eb5c4e

Review URL: https://webrtc-codereview.appspot.com/41349004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1340 16f28f9a-4ce2-e073-06de-1de4eb20be90

d6d7de57