    Merge pull request #15257 from pmur:resize · a011035e
    Paul Murphy authored
    * resize: HResizeLinear reduce duplicate work
    
    There appears to be a 2x unroll of HResizeLinear against k;
    however, k is only incremented by 1 during the unroll. This
    results in k - 1 duplicate passes when k > 1.
    
    Likewise, the final pass may not respect the work done by the vector
    loop. Start it with the offset returned by the vector op if
    implemented. Note, no vector ops are implemented today.
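
    The two fixes above can be sketched as follows. This is a simplified,
    hypothetical model and not the actual OpenCV source: the function name
    hresizeLinearSketch, its signature, and the dx0 parameter (standing in
    for the offset returned by the vector op) are assumptions made for
    illustration only.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model of a horizontal linear resize pass.
// Not OpenCV's API: every name here is illustrative.
// Output pixel dx blends src[sx] and src[sx + 1], where sx = xofs[dx],
// using the weights alpha[2 * dx] and alpha[2 * dx + 1].
// dx0 models the offset already handled by a vector op (0 if none).
// Returns the number of row passes executed.
int hresizeLinearSketch(const std::vector<const float*>& src,
                        const std::vector<float*>& dst,
                        const std::vector<int>& xofs,
                        const std::vector<float>& alpha,
                        int dx0 = 0)
{
    assert(src.size() == dst.size());
    assert(alpha.size() == 2 * xofs.size());
    const int count = static_cast<int>(src.size());
    const int dwidth = static_cast<int>(xofs.size());
    int passes = 0;
    int k = 0;

    // 2x unroll over rows. Advancing k by 2 (not 1) is what avoids
    // reprocessing the second row of each pair on the next iteration.
    for (; k + 1 < count; k += 2) {
        const float* S0 = src[k];
        const float* S1 = src[k + 1];
        float* D0 = dst[k];
        float* D1 = dst[k + 1];
        for (int dx = dx0; dx < dwidth; ++dx) {
            const int sx = xofs[dx];
            const float a0 = alpha[2 * dx];
            const float a1 = alpha[2 * dx + 1];
            D0[dx] = S0[sx] * a0 + S0[sx + 1] * a1;
            D1[dx] = S1[sx] * a0 + S1[sx + 1] * a1;
        }
        passes += 2;
    }

    // Final pass for an odd row count. It also starts at dx0 so that
    // work already done by the vector op is not repeated.
    for (; k < count; ++k) {
        const float* S = src[k];
        float* D = dst[k];
        for (int dx = dx0; dx < dwidth; ++dx) {
            const int sx = xofs[dx];
            D[dx] = S[sx] * alpha[2 * dx] + S[sx + 1] * alpha[2 * dx + 1];
        }
        passes += 1;
    }
    return passes;
}
```

    With three input rows, this sketch performs exactly three row passes:
    one unrolled pair plus one tail row.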
    
    The improvement is most noticeable on a linear downscale. A set of
    performance tests is added to characterize this. The performance
    improvement is 10-50% depending on the scaling.
    
    * imgproc: vectorize HResizeLinear
    
    Performance is mostly gated by the gather operations
    for x inputs.
    
    Likewise, provide a 2x unroll against k; this halves the
    number of alpha gathers for larger k.
    
    While not a 4x improvement, it still performs substantially
    better on P9, at 1.4x. On P8 the improvement over baseline is
    1.05-1.10x due to the reduced VSX instruction set.
    
    For float types, this results in a more modest
    1.2x improvement.
    
    * Update U8 processing for non-bitexact linear resize
    
    * core: hal: vsx: improve v_load_expand_q
    
    With a little help, we can do this quickly without GPRs on
    all VSX-enabled targets.
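
    For readers unfamiliar with the operation, here is a scalar sketch of
    what a quad-expanding load computes in the unsigned case: four
    consecutive 8-bit values are read and each is widened to 32 bits. The
    helper name loadExpandQSketch is hypothetical, and this does not
    reflect the VSX implementation in the commit.

```cpp
#include <array>
#include <cstdint>

// Scalar model (illustrative only) of a "load 4 bytes, expand each to
// 32 bits" operation. The real intrinsic returns a vector register; a
// std::array stands in for it here.
std::array<std::uint32_t, 4> loadExpandQSketch(const std::uint8_t* ptr)
{
    std::array<std::uint32_t, 4> out{};
    for (int i = 0; i < 4; ++i)
        out[i] = ptr[i];  // zero-extend each byte to 32 bits
    return out;
}
```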
    
    * resize: Fix cn == 3 step per feedback
    
    Per feedback, ensure we don't overrun. This was caught via the
    failure observed in Test_TensorFlow.inception_accuracy.
Name
opencl
perf_accumulate.cpp
perf_bilateral.cpp
perf_blur.cpp
perf_canny.cpp
perf_contours.cpp
perf_corners.cpp
perf_cvt_color.cpp
perf_distanceTransform.cpp
perf_filter2d.cpp
perf_floodfill.cpp
perf_goodFeaturesToTrack.cpp
perf_histogram.cpp
perf_houghcircles.cpp
perf_houghlines.cpp
perf_integral.cpp
perf_main.cpp
perf_matchTemplate.cpp
perf_moments.cpp
perf_morph.cpp
perf_phasecorr.cpp
perf_precomp.hpp
perf_pyramids.cpp
perf_remap.cpp
perf_resize.cpp
perf_sepfilters.cpp
perf_spatialgradient.cpp
perf_threshold.cpp
perf_warp.cpp