1. 10 Oct, 2017 1 commit
    • KCF speedup (#1374) · 41995b76
      Vladislav Sovrasov authored
      * kcf: use float data type rather than double.
      
      In practice, float is precise enough and gives better performance.
      With this patch, one of my benchmarks gets about a 20% performance gain.
      Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
      
      * Offload transpose matrix multiplication to OpenCL.
      
      The matrix multiplication in updateProjectMatrix is one of the
      hotspots. The matrix shape is unusual: m is very short but n is
      very large. Neither the clBLAS GEMM implementation nor the
      in-trunk one is efficient for this shape, so I implemented a
      standalone transpose matrix multiplication kernel here. It gives
      about a 10% performance gain on an Intel desktop platform and
      about 20% on a Braswell platform. At the same time, CPU
      utilization is lower.
      Signed-off-by: Zhigang Gong <zhigang.gong@intel.com>
      
      * Add verification code for kcf ocl transpose mm kernel.
      Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
      
      * tracking: show FPS in tracker sample
      
      * tracking: fix MSVC warnings in KCF
      
      * tracking: move OCL kernel initialization to constructor in KCF
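
The transpose matrix multiplication mentioned for updateProjectMatrix can be illustrated with a plain CPU reference, of the kind one might compare an OpenCL kernel's output against. This is only a sketch under assumptions: the commit message does not state the exact product, so the form C = alpha * A^T * A (A being n x m, row-major, with n large and m small) is assumed here, and the function name `transposeMultiply` is hypothetical. It is not the kernel from this commit.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference: C = alpha * A^T * A.
// A is n x m in row-major order; the result C is m x m, row-major.
// The assumed shape is "tall and skinny": n is large, m is small.
std::vector<float> transposeMultiply(const std::vector<float>& a,
                                     std::size_t n, std::size_t m,
                                     float alpha = 1.0f)
{
    std::vector<float> c(m * m, 0.0f);

    // Stream over the long dimension once; each row of A contributes
    // a small m x m outer product to the accumulator.
    for (std::size_t row = 0; row < n; ++row) {
        const float* r = a.data() + row * m;
        for (std::size_t i = 0; i < m; ++i) {
            for (std::size_t j = 0; j < m; ++j) {
                c[i * m + j] += r[i] * r[j];
            }
        }
    }

    // Apply the scale factor once at the end.
    for (float& v : c) {
        v *= alpha;
    }
    return c;
}
```

Because the accumulator is only m x m, it stays small while the long dimension is read sequentially, which is one reason a dedicated routine can suit this shape better than a general GEMM. Float accumulation over a very large n loses some precision, so a comparison against such a reference would normally use a tolerance rather than exact equality.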