    Merge pull request #8104 from insoow:master · 2922738b
    insoow authored
    Gemm kernels for Intel GPU (#8104)
    
    * Fix an issue with Kernel object reset release when consecutive Kernel::run calls
    
    Kernel::run launches OCL GPU kernels and sets an event callback function
    that decreases the ref count of the UMat, or removes the UMat, when the
    launched workloads complete. However, some OCL kernels require multiple
    calls to Kernel::run with changed kernel parameters (e.g., input
    and output buffer offsets) to produce the final computation result.
    In that case, the current implementation performs unnecessary
    synchronization and cleanupMat.
    
    This fix requires the user to specify whether more work will follow.
    If no computation remains, Kernel::run resets the kernel object.
    Signed-off-by: Woo, Insoo <insoo.woo@intel.com>
    
    * GEMM kernel optimization for Intel GEN
    
    The optimized kernels use the cl_intel_subgroups extension for better
    performance.
    
    Note: These optimized kernels will be included in ISAAC via code
    generation, under the MIT license.
    Signed-off-by: Woo, Insoo <insoo.woo@intel.com>
    
    * Fix API compatibility error
    
    This patch fixes an OCV API compatibility error. The error was reported
    because of the interface change to Kernel::run. To resolve the issue,
    an overloaded Kernel::run function is added. It takes a flag indicating
    whether more work remains to be done with the kernel object without
    releasing the resources related to it.
    Signed-off-by: Woo, Insoo <insoo.woo@intel.com>
    
    * Renaming intel_gpu_gemm.cpp to intel_gpu_gemm.inl.hpp
    Signed-off-by: Woo, Insoo <insoo.woo@intel.com>
    
    * Revert "Fix API compatibility error"
    
    This reverts commit 2ef427db91b6c4aec170f691c5d2e6c47d6520d7.
    
    Conflicts:
    	modules/core/src/intel_gpu_gemm.inl.hpp
    
    * Revert "Fix an issue with Kernel object reset release when consecutive Kernel::run calls"
    
    This reverts commit cc7f9f54695dc293598addce9b9d7e345225bede.
    
    * Fix the case of uninitialized D
    
    When C is null and beta is non-zero, D is used without initialization.
    This resolves the issue.
    Signed-off-by: Woo, Insoo <insoo.woo@intel.com>
    
    * fix potential output error due to 0 * nan
    Signed-off-by: Woo, Insoo <insoo.woo@intel.com>
    
    * whitespace fix, eliminate non-ASCII symbols
    
    * fix build warning
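
The two GEMM fixes noted above (uninitialized D when C is null, and the 0 * nan output error) can be illustrated with a small CPU reference. This is only a sketch, assuming the conventional GEMM form D = alpha * A * B + beta * C with row-major storage; the function name `gemmReference` and its signature are hypothetical and are not taken from the OpenCV or Intel kernel sources, and the actual fix in the PR may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference GEMM sketch: D = alpha * A * B + beta * C (row-major).
// A is M x K, B is K x N, C (optional) and D are M x N.
// Hypothetical helper for illustration; not the OpenCV implementation.
std::vector<float> gemmReference(const std::vector<float>& A,
                                 const std::vector<float>& B,
                                 const std::vector<float>* C,  // may be nullptr
                                 float alpha, float beta,
                                 std::size_t M, std::size_t N, std::size_t K)
{
    assert(A.size() == M * K && B.size() == K * N);
    assert(C == nullptr || C->size() == M * N);

    // Start from a zero-initialized D so that a null C combined with a
    // non-zero beta never reads uninitialized memory.
    std::vector<float> D(M * N, 0.0f);

    for (std::size_t i = 0; i < M; ++i)
    {
        for (std::size_t j = 0; j < N; ++j)
        {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k)
                acc += A[i * K + k] * B[k * N + j];

            float d = alpha * acc;
            // Skip the beta term when it cannot contribute: under IEEE 754,
            // 0 * NaN is NaN, so multiplying anyway would corrupt the output.
            if (C != nullptr && beta != 0.0f)
                d += beta * (*C)[i * N + j];
            D[i * N + j] = d;
        }
    }
    return D;
}
```

With beta equal to zero, a C buffer full of NaN values leaves the result untouched, and a null C with non-zero beta is treated as an all-zero matrix.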
Name
calib3d
core
cudaarithm
cudabgsegm
cudacodec
cudafeatures2d
cudafilters
cudaimgproc
cudalegacy
cudaobjdetect
cudaoptflow
cudastereo
cudawarping
cudev
features2d
flann
highgui
imgcodecs
imgproc
java
ml
objdetect
photo
python
shape
stitching
superres
ts
video
videoio
videostab
viz
world
CMakeLists.txt