    Merge pull request #9753 from tomoaki0705:universalMatmul · 3cbe60cc
    Tomoaki Teshima authored
    * add accuracy test and performance check for matmul
      * add performance tests for transform and dotProduct
      * add test Core_TransformLargeTest for 8u version of transform
    
    * remove raw SSE2/NEON implementation from matmul.cpp
      * use universal intrinsics instead of raw intrinsics
      * remove unused templated function
      * add v_matmuladd, which multiplies a 3x3 matrix and adds a 3x1 vector
      * add v_rotate_left/right to the universal intrinsics
      * suppress intrinsics on some functions and platforms
      * add pure software implementation of the new universal intrinsics
      * add tests for the new universal intrinsics
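
For reference, a minimal scalar sketch of what the new intrinsics are expected to compute. It does not use OpenCV: `Vec4`, `matmuladd_ref`, `rotate_left_ref` and `rotate_right_ref` are hypothetical names introduced only for illustration. It also assumes that the matrix is passed as three column vectors in 4-lane registers and that rotated-out lanes are filled with zeros. This is a model of the semantics under those assumptions, not the code added to matmul.cpp.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a 4-lane float register (not an OpenCV type).
using Vec4 = std::array<float, 4>;

// Scalar model of "multiply and add": m0, m1, m2 are treated as the matrix
// columns (assumption) and a as the vector to add. Only the first three
// lanes of v take part in the product.
inline Vec4 matmuladd_ref(const Vec4& v, const Vec4& m0, const Vec4& m1,
                          const Vec4& m2, const Vec4& a)
{
    Vec4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = v[0] * m0[i] + v[1] * m1[i] + v[2] * m2[i] + a[i];
    return r;
}

// Lane rotation model (assumption): lanes move toward higher indices by N
// and the vacated low lanes become zero.
template <std::size_t N>
inline Vec4 rotate_left_ref(const Vec4& x)
{
    Vec4 r{};  // value-initialized, so all lanes start at zero
    for (std::size_t i = N; i < 4; ++i)
        r[i] = x[i - N];
    return r;
}

// Mirror case: lanes move toward lower indices by N and the vacated high
// lanes become zero.
template <std::size_t N>
inline Vec4 rotate_right_ref(const Vec4& x)
{
    Vec4 r{};
    for (std::size_t i = 0; i + N < 4; ++i)
        r[i] = x[i + N];
    return r;
}
```

A scalar model like this can serve as the expected value when checking a SIMD path lane by lane, which is roughly the role the pure software implementation plays for platforms without SSE2 or NEON.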
    
    * core: prevent memory access after the end of buffer
    
    * fix perf tests
matmul.cpp 121 KB