    Merge pull request #10468 from fenrus75:avx512-2 · a75840d1
    Arjan van de Ven authored
    * Add a 512-bit codepath to the AVX512 fastConv function
    
    This patch adds a 512-bit-wide codepath to the fastConv() function for
    AVX512 use.
    The basic idea is to process the first N * 16 elements of the vector
    with AVX512, and then run the rest of the vector using the traditional
    AVX2 codepath.
    
    * dnn: use unaligned AVX512 load (OpenCV aligns data on 32-byte boundary)
    
    * dnn: change "vecsize" condition for AVX512
    
    * dnn: fix indentation
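
The split described in the commit message (first N * 16 elements on the wide path, the rest on the narrower path) can be sketched as follows. This is a minimal, portable illustration and not the actual fastConv() code: `dotSplit` is a hypothetical helper, the 16-lane and 8-lane registers are emulated with plain arrays, and the scalar tail is an assumption added so that the sketch accepts any length. Real code would presumably use unaligned-load and fused multiply-add intrinsics such as `_mm512_loadu_ps` / `_mm512_fmadd_ps` for the first loop and their `_mm256_` counterparts for the second.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of OpenCV): dot product of two float
// vectors, split into a 16-lane main part and an 8-lane remainder.
float dotSplit(const std::vector<float>& a, const std::vector<float>& b)
{
    assert(a.size() == b.size());
    const std::size_t n   = a.size();
    const std::size_t n16 = n - n % 16;  // end of the "first N * 16 elements"
    const std::size_t n8  = n - n % 8;   // end of the 8-wide remainder

    float acc16[16] = {0};  // stands in for one 512-bit accumulator
    float acc8[8]   = {0};  // stands in for one 256-bit accumulator

    std::size_t i = 0;

    // Wide path: 16 floats per iteration (AVX512 in the real code).
    for (; i < n16; i += 16)
        for (std::size_t k = 0; k < 16; ++k)
            acc16[k] += a[i + k] * b[i + k];

    // Narrow path: 8 floats per iteration (AVX2 in the real code).
    for (; i < n8; i += 8)
        for (std::size_t k = 0; k < 8; ++k)
            acc8[k] += a[i + k] * b[i + k];

    // Horizontal reduction of both accumulators.
    float sum = 0.0f;
    for (std::size_t k = 0; k < 16; ++k) sum += acc16[k];
    for (std::size_t k = 0; k < 8; ++k)  sum += acc8[k];

    // Scalar tail (assumption of this sketch) for lengths not divisible by 8.
    for (; i < n; ++i)
        sum += a[i] * b[i];

    return sum;
}
```

With a length of 27, for example, the first loop covers elements 0 to 15, the second covers 16 to 23, and the scalar tail handles the last three.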
Name
calib3d
core
cudaarithm
cudabgsegm
cudacodec
cudafeatures2d
cudafilters
cudaimgproc
cudalegacy
cudaobjdetect
cudaoptflow
cudastereo
cudawarping
cudev
dnn
features2d
flann
highgui
imgcodecs
imgproc
java
js
ml
objdetect
photo
python
shape
stitching
superres
ts
video
videoio
videostab
viz
world
CMakeLists.txt