    Gaussian reorder for benefit of A73 · f0a9d6d2
    Frank Barchard authored
    Roughly: instead of 4 loads and 8 multiplies, use 1 load and 2 multiplies,
    4 times over.  The original code, like the code clang and gcc generate
    from the C version, did all the loads, then all the math, then the store.
    The new code does a load, then the math, then the next load, etc.
    This schedules better on current ARM64 CPUs.
    The number of registers is also reduced by reusing the same registers.
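
The reordering described above can be illustrated with a scalar C sketch. This is not the NEON code from this change: the function names are hypothetical, the 1-4-6-4-1 kernel is an assumption, normalization is omitted, and a compiler is free to reschedule scalar C, so the load and multiply counts in the message are not reproduced here. It only shows the two orderings, and that the interleaved form needs a single temporary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 5-tap kernel 1-4-6-4-1 (illustrative only).
 * src must hold at least width + 4 elements. */

/* Schedule A: all the loads, then all the math, then the store. */
void gauss_row_loads_first(const uint32_t* src, uint32_t* dst, int width) {
  assert(src != NULL && dst != NULL);
  for (int i = 0; i < width; ++i) {
    /* Five live temporaries are loaded up front. */
    uint32_t a = src[i];
    uint32_t b = src[i + 1];
    uint32_t c = src[i + 2];
    uint32_t d = src[i + 3];
    uint32_t e = src[i + 4];
    dst[i] = a + e + (b + d) * 4 + c * 6;
  }
}

/* Schedule B: load, math, next load, math, ... reusing one temporary. */
void gauss_row_interleaved(const uint32_t* src, uint32_t* dst, int width) {
  assert(src != NULL && dst != NULL);
  for (int i = 0; i < width; ++i) {
    uint32_t t = src[i];   /* load */
    uint32_t sum = t;      /* math */
    t = src[i + 1];        /* next load reuses t */
    sum += t * 4;
    t = src[i + 2];
    sum += t * 6;
    t = src[i + 3];
    sum += t * 4;
    t = src[i + 4];
    sum += t;
    dst[i] = sum;          /* store */
  }
}
```

Both functions compute the same sums; any speed difference would come only from instruction ordering and register pressure on the target core.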
    
    HiSilicon ARM A73:
    
    Now
    TestGaussRow_Opt (890 ms)
    TestGaussCol_Opt (571 ms)
    
    Was
    TestGaussRow_Opt (1061 ms)
    TestGaussCol_Opt (595 ms)
    
    Qualcomm 821 (Pixel):
    
    Now
    TestGaussRow_Opt (571 ms)
    TestGaussCol_Opt (474 ms)
    
    Was
    TestGaussRow_Opt (751 ms)
    TestGaussCol_Opt (520 ms)
    
    TBR=kjellander@chromium.org
    BUG=libyuv:719
    TEST=LibYUVPlanarTest.TestGaussRow_Opt
    
    Reviewed-on: https://chromium-review.googlesource.com/627478
    Reviewed-by: Cheng Wang <wangcheng@google.com>
    Reviewed-by: Frank Barchard <fbarchard@google.com>
    Change-Id: I5ec81191d460801f0d4a89f0384f89925ff036de
    Reviewed-on: https://chromium-review.googlesource.com/634448
    Commit-Queue: Frank Barchard <fbarchard@google.com>
Name
compare.cc
compare_common.cc
compare_gcc.cc
compare_neon.cc
compare_neon64.cc
compare_win.cc
convert.cc
convert_argb.cc
convert_from.cc
convert_from_argb.cc
convert_jpeg.cc
convert_to_argb.cc
convert_to_i420.cc
cpu_id.cc
mjpeg_decoder.cc
mjpeg_validate.cc
planar_functions.cc
rotate.cc
rotate_any.cc
rotate_argb.cc
rotate_common.cc
rotate_dspr2.cc
rotate_gcc.cc
rotate_msa.cc
rotate_neon.cc
rotate_neon64.cc
rotate_win.cc
row_any.cc
row_common.cc
row_dspr2.cc
row_gcc.cc
row_msa.cc
row_neon.cc
row_neon64.cc
row_win.cc
scale.cc
scale_any.cc
scale_argb.cc
scale_common.cc
scale_dspr2.cc
scale_gcc.cc
scale_msa.cc
scale_neon.cc
scale_neon64.cc
scale_win.cc
video_common.cc