    further improvements in split & merge; started using non-temporal store instructions (#12063) · 43820d89
    Vadim Pisarevsky authored
    * 1. changed static const __m128/256 to const __m128/256 to avoid weird instructions and calls inserted by the compiler.
    2. added universal intrinsics that wrap MOVNTPS and similar non-temporal ("no cache") store instructions. v_store_interleave() and v_store() got the corresponding flags/overloaded variants.
    3. rewrote split & merge to use the "no cache" store instructions. This resulted in a dramatic performance improvement when processing big arrays.
    
    * hopefully, fixed some test failures where 4-channel v_store_interleave() is used
    
    * added missing implementation of the new universal intrinsics (v_store_aligned_nocache() etc.)
    
    * fixed silly typo in the new intrinsics in intrin_vsx.hpp
    
    * still trying to fix VSX compiler errors
    
    * still trying to fix VSX compiler errors
    
    * still trying to fix VSX compiler errors
    
    * still trying to fix VSX compiler errors
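
Item 3 of the message above describes rewriting split with "no cache" stores. The following is a minimal sketch of that idea, not the code from this commit: it uses raw SSE intrinsics (`_mm_stream_ps`, which maps to MOVNTPS) instead of the universal intrinsics such as v_store_aligned_nocache(), and the function name `split2_nocache`, the 2-channel float layout, and the scalar fallback are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#endif

// Hypothetical helper: splits interleaved 2-channel float data
// (a0 b0 a1 b1 ...) into two planes of n elements each.
void split2_nocache(const float* src, float* dst0, float* dst1, std::size_t n)
{
    assert(n == 0 || (src && dst0 && dst1));
    std::size_t i = 0;
#ifdef SKETCH_HAVE_SSE2
    // MOVNTPS requires 16-byte aligned destinations; otherwise use the scalar loop.
    const std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(dst0) |
                                reinterpret_cast<std::uintptr_t>(dst1);
    if ((bits & 15) == 0)
    {
        for (; i + 4 <= n; i += 4)
        {
            // Plain local values, not function-level static const vectors (cf. item 1).
            const __m128 v0 = _mm_loadu_ps(src + 2 * i);      // a0 b0 a1 b1
            const __m128 v1 = _mm_loadu_ps(src + 2 * i + 4);  // a2 b2 a3 b3
            const __m128 a = _mm_shuffle_ps(v0, v1, _MM_SHUFFLE(2, 0, 2, 0)); // a0 a1 a2 a3
            const __m128 b = _mm_shuffle_ps(v0, v1, _MM_SHUFFLE(3, 1, 3, 1)); // b0 b1 b2 b3
            // Non-temporal stores: write the results without pulling the
            // destination lines into the cache.
            _mm_stream_ps(dst0 + i, a);
            _mm_stream_ps(dst1 + i, b);
        }
        // Make the streamed data visible before any later ordinary stores.
        _mm_sfence();
    }
#endif
    // Scalar tail (and full fallback when SSE2 or alignment is unavailable).
    for (; i < n; ++i)
    {
        dst0[i] = src[2 * i];
        dst1[i] = src[2 * i + 1];
    }
}
```

Streaming stores tend to pay off only when the output is large and is not read again soon, which matches the "big arrays" case mentioned in the message; for small buffers that stay in cache, ordinary stores are usually the safer choice.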
Name
core
core.hpp