1. 14 Dec, 2017 1 commit
    • core: remove raw SSE2/NEON implementation from convert.cpp (#9831) · ca1a0a11
      Tomoaki Teshima authored
      * remove raw SSE2/NEON implementation from convert.cpp
        * remove raw implementation from Cvt_SIMD
        * remove raw implementation from cvtScale_SIMD
        * remove raw implementation from cvtScaleAbs_SIMD
        * remove duplicated implementation cvt_<float, short>
        * remove duplicated implementation cvtScale_<short, short, float>
        * add "from double" version of Cvt_SIMD
        * modify the condition of test ConvertScaleAbs
      
      * Update convert.cpp
      
      fixed crash in cvtScaleAbs(8s=>8u)
      
      * fixed compile error on Win32
      
      * fixed several test failures because of accuracy loss in cvtScale(int=>int)
      
      * fixed NEON implementation of v_cvt_f64(int=>double) intrinsic
      
      * another attempt to fix test failures
      
      * keep trying to fix the test failures and the just-introduced compile warnings
      
      * fixed one remaining test (subtractScalar)
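For reference, the operation touched by the cvtScaleAbs(8s=>8u) fix can be written as a plain scalar loop. This is only a minimal sketch of the documented semantics of scale, absolute value, then saturation to 8 bits; it is not the SIMD code from the commit, and the names `saturate_to_u8` and `scale_abs_8s_to_8u` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: round to nearest integer and clamp into [0, 255].
static inline std::uint8_t saturate_to_u8(float v)
{
    long r = std::lrint(v);  // uses the current rounding mode (nearest by default)
    if (r < 0) return 0;
    if (r > 255) return 255;
    return static_cast<std::uint8_t>(r);
}

// Scalar reference: dst[i] = saturate(|src[i] * alpha + beta|).
// Note that |-128| = 128 does not fit in int8_t, which is why the
// absolute value is taken in float before narrowing to 8 bits.
void scale_abs_8s_to_8u(const std::int8_t* src, std::uint8_t* dst,
                        std::size_t n, float alpha, float beta)
{
    assert((src != nullptr && dst != nullptr) || n == 0);
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = saturate_to_u8(std::fabs(src[i] * alpha + beta));
}
```

A vectorized version would be expected to produce the same values as this loop, so a loop of this kind can serve as the oracle in an accuracy test.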
  2. 28 Nov, 2017 1 commit
    • ocl: avoid unnecessary loading/initializing OpenCL subsystem · 0ed3209b
      Alexander Alekhin authored
      Applies when the application makes no OpenCL/UMat method calls.
      
      The OpenCL subsystem is initialized when:
      - haveOpenCL() is called from the application
      - useOpenCL() is called from the application
      - the OpenCL allocator is accessed: a UMat is created (an empty UMat is ignored) or UMat <-> Mat conversions are called
      
      Don't call OpenCL functions if OPENCV_OPENCL_RUNTIME=disabled
      (independent of the OpenCL linkage type)
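The behaviour described in this commit message is a form of lazy initialization. The sketch below shows the general pattern only, under stated assumptions: the class `LazyOpenCLRuntime`, its members, and the always-successful `probeDriver()` are hypothetical stand-ins, not OpenCV code. The environment value is passed in by the caller so the example stays self-contained, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical lazy wrapper: nothing is probed until the first query.
class LazyOpenCLRuntime
{
public:
    // runtime_env: value of OPENCV_OPENCL_RUNTIME, or nullptr if unset.
    explicit LazyOpenCLRuntime(const char* runtime_env) : env_(runtime_env) {}

    // First call triggers initialization; later calls reuse the result.
    bool haveOpenCL()
    {
        ensureInitialized();
        return available_;
    }

    int probeCalls() const { return probe_calls_; }

private:
    void ensureInitialized()
    {
        if (initialized_) return;
        initialized_ = true;
        // When the runtime is disabled, never touch the driver at all.
        if (env_ != nullptr && std::strcmp(env_, "disabled") == 0)
        {
            available_ = false;
            return;
        }
        available_ = probeDriver();
    }

    // Placeholder for the expensive step (loading and querying a driver).
    bool probeDriver()
    {
        assert(initialized_);  // only reachable from ensureInitialized()
        ++probe_calls_;
        return true;
    }

    const char* env_;
    bool initialized_ = false;
    bool available_ = false;
    int probe_calls_ = 0;
};
```

A caller could construct it as `LazyOpenCLRuntime rt(std::getenv("OPENCV_OPENCL_RUNTIME"));` (with `<cstdlib>` included); an application that never calls `haveOpenCL()` then never pays for the probe.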
  3. 23 Aug, 2017 1 commit
    • ICV2017u3 package update; · a57718e1
      Pavel Vlasov authored
      - Optimization set change: IPP integrations now provide code for SSE4.2, AVX2 and AVX512 (SKX) CPUs only. For hardware below SSE4.2, IPP code is disabled.
      - Performance regression fixes for IPP code paths;
      - cv::boxFilter integration improvement;
      - cv::filter2D integration improvement;
  4. 17 Jul, 2017 1 commit
  5. 04 Jul, 2017 1 commit
  6. 12 Jun, 2017 1 commit
  7. 06 Jun, 2017 1 commit
  8. 23 May, 2017 1 commit
  9. 25 Apr, 2017 1 commit
    • Update for IPP for OpenCV 2017u2 integration; · 11c2ffaf
      Pavel Vlasov authored
      Updated integrations for:
      cv::split
      cv::merge
      cv::insertChannel
      cv::extractChannel
      cv::Mat::convertTo - now with scaled conversions support
      cv::LUT - disabled due to performance issues
      cv::Mat::copyTo
      cv::Mat::setTo
      cv::flip
      cv::copyMakeBorder - currently disabled
      cv::polarToCart
      cv::pow - IPP pow function was removed due to performance issues
      cv::hal::magnitude32f/64f - disabled for <= SSE4.2 due to poor performance
      cv::countNonZero
      cv::minMaxIdx
      cv::norm
      cv::Canny - new integration. Disabled for threaded;
      cv::cornerHarris
      cv::boxFilter
      cv::bilateralFilter
      cv::integral
  10. 20 Apr, 2017 1 commit
  11. 19 Apr, 2017 1 commit
  12. 06 Apr, 2017 1 commit
  13. 21 Feb, 2017 1 commit
  14. 16 Dec, 2016 2 commits
  15. 14 Dec, 2016 1 commit
  16. 09 Dec, 2016 1 commit
  17. 06 Dec, 2016 1 commit
  18. 02 Dec, 2016 1 commit
  19. 29 Nov, 2016 3 commits
  20. 29 Sep, 2016 1 commit
  21. 23 Sep, 2016 1 commit
    • check FP16 build condition correctly · c7cb116d
      Tomoaki Teshima authored
        * use __GNUC_MINOR__ in correct place to check the version of GCC
        * check processor support of FP16 at run time
        * check compiler support of FP16 and pass correct compiler option
        * rely on ENABLE_AVX on gcc since AVX is generated when -mf16c is passed
        * guard correctly using ifdef for various configurations
        * use v_float16x4 correctly by including the right header file
  22. 04 Sep, 2016 1 commit
    • use universal intrinsic for FP16 · 903789f7
      Tomoaki Teshima authored
        * use v_float16x4 (universal intrinsic) instead of raw SSE/NEON implementation
        * define v_load_f16/v_store_f16 since v_load can't be distinguished when a short pointer is passed
        * brush up implementation for old compilers (guard correctly)
        * add test for v_load_f16 and round trip conversion of v_float16x4
        * fix conversion error
  23. 24 Aug, 2016 1 commit
  24. 19 Aug, 2016 1 commit
  25. 09 Aug, 2016 1 commit
  26. 03 Aug, 2016 1 commit
    • brush up convertFp16 · 87ca607f
      Tomoaki Teshima authored
        * raise an error when a wrong bit depth is passed
        * raise a build error when a wrong depth is specified for cvtScaleHalf_
        * remove unnecessary safety check in cvtScaleHalf_
        * use intrinsic instead of direct pointer access
        * update the explanation
  27. 29 Jul, 2016 1 commit
  28. 20 Jul, 2016 1 commit
  29. 08 Jul, 2016 1 commit
  30. 08 Jun, 2016 1 commit
  31. 07 Jun, 2016 2 commits
  32. 06 Jun, 2016 1 commit
  33. 05 Jun, 2016 1 commit
  34. 21 May, 2016 1 commit
    • add feature to convert FP32(float) to FP16(half) · b2ad7cd9
      Tomoaki Teshima authored
        * check compiler support
        * check HW support before executing
        * add test doing round trip conversion from / to FP32
        * handle arrays correctly when the size is not a multiple of 4
        * add declaration to prevent warning
        * make it possible to enable fp16 on 32bit ARM
        * make the conversion possible on unsupported HW, too
        * add test using both HW and SW implementation
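To illustrate what a software fallback for this conversion involves, here is a minimal, portable sketch of FP32 to FP16 conversion with round-to-nearest-even, the reverse conversion, and an array loop that processes groups of four with a scalar tail. It is not the code from the commit: the function names are hypothetical, the inner group-of-four loop is only a stand-in for a 4-lane hardware instruction, and NaN payloads are not preserved.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Convert one float to IEEE 754 binary16 bits (round to nearest, ties to even).
std::uint16_t float_to_half(float f)
{
    std::uint32_t x;
    std::memcpy(&x, &f, sizeof x);
    const std::uint32_t sign = (x >> 16) & 0x8000u;
    const std::uint32_t mag = x & 0x7FFFFFFFu;

    if (mag >= 0x7F800000u)  // Inf or NaN (NaN collapses to one quiet NaN)
        return static_cast<std::uint16_t>(sign | 0x7C00u | (mag > 0x7F800000u ? 0x0200u : 0u));
    if (mag >= 0x47800000u)  // magnitude >= 65536: overflows to Inf
        return static_cast<std::uint16_t>(sign | 0x7C00u);
    if (mag >= 0x38800000u)  // result is a normal half (or rounds up to Inf)
    {
        std::uint32_t r = mag - 0x38000000u;  // rebias exponent from 127 to 15
        r += 0x0FFFu + ((r >> 13) & 1u);      // round to nearest even
        return static_cast<std::uint16_t>(sign | (r >> 13));
    }
    const std::uint32_t exp = mag >> 23;
    if (exp < 102u)          // below 2^-25: rounds to zero
        return static_cast<std::uint16_t>(sign);
    // Result is a subnormal half: shift the 24-bit significand into place.
    const std::uint32_t man = (mag & 0x007FFFFFu) | 0x00800000u;
    const std::uint32_t shift = 126u - exp;   // in [14, 24]
    std::uint32_t h = man >> shift;
    const std::uint32_t rem = man & ((1u << shift) - 1u);
    const std::uint32_t half = 1u << (shift - 1u);
    if (rem > half || (rem == half && (h & 1u))) ++h;
    return static_cast<std::uint16_t>(sign | h);
}

// Convert IEEE 754 binary16 bits back to float (always exact).
float half_to_float(std::uint16_t h)
{
    const std::uint32_t sign = (h & 0x8000u) << 16;
    const std::uint32_t exp = (h >> 10) & 0x1Fu;
    const std::uint32_t man = h & 0x03FFu;
    if (exp == 0u && man != 0u)  // subnormal half: value is man * 2^-24
    {
        const float v = std::ldexp(static_cast<float>(man), -24);
        return sign ? -v : v;
    }
    std::uint32_t bits = sign;                                      // signed zero
    if (exp == 0x1Fu) bits |= 0x7F800000u | (man << 13);            // Inf / NaN
    else if (exp != 0u) bits |= ((exp + 112u) << 23) | (man << 13); // normal
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Array conversion: full groups of 4 first, then a scalar tail so that
// sizes that are not a multiple of 4 are still handled.
void convert_fp32_to_fp16(const float* src, std::uint16_t* dst, std::size_t n)
{
    assert((src != nullptr && dst != nullptr) || n == 0);
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)           // stand-in for one 4-lane vector op
        for (std::size_t k = 0; k < 4; ++k)
            dst[i + k] = float_to_half(src[i + k]);
    for (; i < n; ++i)                   // remaining 0..3 elements
        dst[i] = float_to_half(src[i]);
}
```

Because every half value is exactly representable as a float, converting half to float and back should reproduce the original bits for all non-NaN inputs, which makes an exhaustive round-trip check over the 65536 bit patterns a cheap test.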
  35. 25 Dec, 2015 1 commit
  36. 03 Dec, 2015 1 commit
    • HAL: improvements · b4bcdd10
      Maksim Shabunin authored
      - added new functions from core module: split, merge, add, sub, mul, div, ...
      - added function replacement mechanism
      - added example of HAL replacement library
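The commit message does not describe how the function replacement mechanism works. One common shape for such a mechanism is sketched below, purely as an illustration: a replacement returns a status code, and the caller falls back to the built-in code when the replacement declines. All names here are hypothetical, and a run-time function pointer is used only to keep the example self-contained; a real library might bind the replacement at build time instead.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical status codes for this sketch.
enum HalStatus { HAL_OK = 0, HAL_NOT_IMPLEMENTED = 1 };

// Hypothetical signature that a replacement library would implement.
typedef int (*AddF32Fn)(const float* a, const float* b, float* dst, std::size_t n);

// Counts how often the built-in path ran (for demonstration only).
static int g_builtin_calls = 0;

// Default stub: always declines, so the built-in code is used.
static int add_f32_stub(const float*, const float*, float*, std::size_t)
{
    return HAL_NOT_IMPLEMENTED;
}

// Example replacement that only accepts sizes that are a multiple of 4.
static int add_f32_custom(const float* a, const float* b, float* dst, std::size_t n)
{
    if (n % 4 != 0) return HAL_NOT_IMPLEMENTED;  // decline unsupported input
    for (std::size_t i = 0; i < n; ++i) dst[i] = a[i] + b[i];
    return HAL_OK;
}

// Built-in reference implementation.
static void add_f32_builtin(const float* a, const float* b, float* dst, std::size_t n)
{
    ++g_builtin_calls;
    for (std::size_t i = 0; i < n; ++i) dst[i] = a[i] + b[i];
}

// Dispatcher: try the replacement first, fall back if it declines.
void add_f32(const float* a, const float* b, float* dst, std::size_t n,
             AddF32Fn hal = add_f32_stub)
{
    assert(hal != nullptr);
    if (hal(a, b, dst, n) == HAL_OK) return;
    add_f32_builtin(a, b, dst, n);
}
```

The point of the status code is that a replacement does not have to cover every case: it can accelerate the inputs it supports and leave the rest to the default path.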