Frank Barchard authored
RAWToJ400 and RGBToJ400 use 2 step row function for Intel.

RAWToJ400 was 3996 ms, now 3309 ms. 20.7% faster.

Call a row function for each row, based on the ARGBToI400 code,
but implement the row functions as a 2 step conversion.
Adds the row functions RAWToYJ and RGBToYJ, with SSSE3, AVX2
and Any versions.
The smaller row buffer is more cache friendly on large images.
The max cache size can be configured, and is currently:
// Maximum temporary width for wrappers to process at a time, in pixels.
#define MAXTWIDTH 2048
And the row buffer is
  SIMD_ALIGNED(uint8_t row[MAXTWIDTH * 4]);
So 8192 bytes are used for the row buffer, leaving the rest of the
cache for source and destination buffers.

blaze-bin/third_party/libyuv/libyuv_test '--gunit_filter=*R*To?400_Opt' --libyuv_width=3600 --libyuv_height=2500 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1 | sortms

Was
RAWToJ400_Opt (3996 ms)
ARGBToI400_Opt (3964 ms)
RGB24ToJ400_Opt (3960 ms)
ARGBToJ400_Opt (3909 ms)
RGBAToJ400_Opt (3885 ms)

Now
ARGBToJ400_Opt (4091 ms)
ARGBToI400_Opt (3936 ms)
RGBAToJ400_Opt (3428 ms)
RGB24ToJ400_Opt (3324 ms)
RAWToJ400_Opt (3309 ms)

Bug: libyuv:854, b/147753855
Change-Id: Ieb65fbda94e812c737f4c3c74107354b73c4bcd2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2016203
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
3db22ebc
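The 2 step row conversion described in the commit message can be illustrated with a minimal scalar sketch. This is not the libyuv implementation: every `Sketch...` name and `SKETCH_MAX_TWIDTH` are hypothetical, the RAW layout is assumed to be R, G, B bytes in memory, the intermediate is assumed to be B, G, R, A, and the full-range luma weights (77, 150, 29 over 256) are an assumed approximation of the JPEG coefficients. The real row functions are SIMD (SSSE3/AVX2).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the temporary width limit, in pixels. */
#define SKETCH_MAX_TWIDTH 2048

/* Step 1: expand 3-byte RAW pixels (assumed R,G,B order) to 4-byte B,G,R,A. */
static void SketchRawToArgbRow(const uint8_t* src_raw, uint8_t* dst_argb,
                               int width) {
  for (int x = 0; x < width; ++x) {
    dst_argb[0] = src_raw[2]; /* B */
    dst_argb[1] = src_raw[1]; /* G */
    dst_argb[2] = src_raw[0]; /* R */
    dst_argb[3] = 255;        /* A */
    src_raw += 3;
    dst_argb += 4;
  }
}

/* Step 2: full-range luma from B,G,R,A with assumed 8-bit fixed-point weights. */
static void SketchArgbToYjRow(const uint8_t* src_argb, uint8_t* dst_y,
                              int width) {
  for (int x = 0; x < width; ++x) {
    int b = src_argb[0], g = src_argb[1], r = src_argb[2];
    dst_y[x] = (uint8_t)((77 * r + 150 * g + 29 * b + 128) >> 8);
    src_argb += 4;
  }
}

/* One row, converted in chunks so the temporary buffer stays at 8192 bytes. */
static void SketchRawToYjRow(const uint8_t* src_raw, uint8_t* dst_y,
                             int width) {
  uint8_t row[SKETCH_MAX_TWIDTH * 4];
  while (width > 0) {
    int twidth = width > SKETCH_MAX_TWIDTH ? SKETCH_MAX_TWIDTH : width;
    SketchRawToArgbRow(src_raw, row, twidth);
    SketchArgbToYjRow(row, dst_y, twidth);
    src_raw += twidth * 3;
    dst_y += twidth;
    width -= twidth;
  }
}

/* Whole image: call the 2 step row function once per row. */
int SketchRawToJ400(const uint8_t* src_raw, int src_stride, uint8_t* dst_y,
                    int dst_stride, int width, int height) {
  if (!src_raw || !dst_y || width <= 0 || height <= 0) {
    return -1;
  }
  for (int y = 0; y < height; ++y) {
    SketchRawToYjRow(src_raw, dst_y, width);
    src_raw += src_stride;
    dst_y += dst_stride;
  }
  return 0;
}
```

Because the intermediate buffer is reused for every chunk of every row, its footprint stays fixed regardless of image size, which is the cache behaviour the commit message credits for the speedup.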