    further improvements in split & merge; started using non-temporal store instructions (#12063) · 43820d89
    Vadim Pisarevsky authored
    * 1. changed static const __m128/__m256 to const __m128/__m256 to avoid weird instructions and calls inserted by the compiler.
    2. added universal intrinsics that wrap MOVNTPS and similar non-temporal ("no cache") store instructions. v_store_interleave() and v_store() got corresponding flags/overloaded variants
    3. rewrote split & merge to use the "no cache" store instructions. This resulted in a dramatic performance improvement when processing big arrays
    
    * hopefully, fixed some test failures where 4-channel v_store_interleave() is used
    
    * added missing implementation of the new universal intrinsics (v_store_aligned_nocache() etc.)
    
    * fixed silly typo in the new intrinsics in intrin_vsx.hpp
    
    * still trying to fix VSX compiler errors
    
    * still trying to fix VSX compiler errors
    
    * still trying to fix VSX compiler errors
    
    * still trying to fix VSX compiler errors
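
The "no cache" stores from items 2 and 3 of the message above can be illustrated with a minimal sketch. This is not OpenCV's implementation: `split2_nocache` is a hypothetical helper written with raw SSE intrinsics rather than the universal intrinsics, and it assumes a 2-channel interleaved float input. It falls back to a scalar loop when SSE is unavailable or the destination planes are not 16-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>  // _mm_loadu_ps, _mm_shuffle_ps, _mm_stream_ps, _mm_sfence
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Hypothetical helper (not part of OpenCV): de-interleave
// src = {a0, b0, a1, b1, ...} into planes a and b of n elements each.
void split2_nocache(const float* src, float* a, float* b, std::size_t n)
{
    assert(n == 0 || (src && a && b));
    std::size_t i = 0;
#if SKETCH_HAS_SSE
    // MOVNTPS (_mm_stream_ps) requires 16-byte aligned destinations.
    const bool aligned = ((reinterpret_cast<std::uintptr_t>(a) |
                           reinterpret_cast<std::uintptr_t>(b)) & 15) == 0;
    if (aligned) {
        for (; i + 4 <= n; i += 4) {
            __m128 v0 = _mm_loadu_ps(src + 2 * i);      // a0 b0 a1 b1
            __m128 v1 = _mm_loadu_ps(src + 2 * i + 4);  // a2 b2 a3 b3
            __m128 va = _mm_shuffle_ps(v0, v1, _MM_SHUFFLE(2, 0, 2, 0)); // a0 a1 a2 a3
            __m128 vb = _mm_shuffle_ps(v0, v1, _MM_SHUFFLE(3, 1, 3, 1)); // b0 b1 b2 b3
            // Non-temporal stores write around the cache hierarchy.
            _mm_stream_ps(a + i, va);
            _mm_stream_ps(b + i, vb);
        }
        // Make the streamed data globally visible before returning.
        _mm_sfence();
    }
#endif
    // Scalar tail, and full fallback for unaligned or non-SSE builds.
    for (; i < n; ++i) {
        a[i] = src[2 * i];
        b[i] = src[2 * i + 1];
    }
}
```

Streaming stores tend to help only when the output is large and not read again soon, which matches the "big arrays" case the message mentions; for small buffers that are reused immediately, ordinary stores are usually the better choice.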
Name
.github
3rdparty
apps
cmake
data
doc
include
modules
platforms
samples
.gitattributes
.gitignore
CMakeLists.txt
CONTRIBUTING.md
LICENSE
README.md