    Merge pull request #10468 from fenrus75:avx512-2 · a75840d1
    Arjan van de Ven authored
    * Add a 512-bit codepath to the AVX512 fastConv function
    
    This patch adds a 512-bit-wide codepath to the fastConv() function for
    AVX512 use.
    The basic idea is to process the first N * 16 elements of the vector
    with AVX512, and then process the rest of the vector using the
    traditional AVX2 codepath.
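
    The split described above can be sketched roughly as follows. This is a minimal illustration and not the actual fastConv() code: `dotSplit` is a hypothetical helper that computes a plain dot product, and it assumes float data, so one 512-bit register holds 16 elements and one 256-bit register holds 8. The intrinsic paths are only compiled when the compiler enables AVX512F or AVX2; otherwise the scalar loop handles everything.

```cpp
#include <cstddef>
#if defined(__AVX512F__) || defined(__AVX2__)
#include <immintrin.h>
#endif

// Hypothetical helper (not part of OpenCV): dot product of two float arrays.
// The first N * 16 elements use 512-bit registers when available, the
// remaining blocks of 8 use 256-bit registers, and a scalar loop finishes
// whatever is left (or does all the work on builds without AVX).
float dotSplit(const float* a, const float* b, std::size_t n)
{
    float sum = 0.f;
    std::size_t i = 0;

#if defined(__AVX512F__)
    // 512-bit path: 16 floats per iteration, unaligned loads.
    __m512 acc512 = _mm512_setzero_ps();
    for (; i + 16 <= n; i += 16)
        acc512 = _mm512_fmadd_ps(_mm512_loadu_ps(a + i),
                                 _mm512_loadu_ps(b + i), acc512);
    sum += _mm512_reduce_add_ps(acc512);
#endif

#if defined(__AVX2__)
    // 256-bit path: continues from where the 512-bit loop stopped.
    __m256 acc256 = _mm256_setzero_ps();
    for (; i + 8 <= n; i += 8)
        acc256 = _mm256_add_ps(acc256,
                               _mm256_mul_ps(_mm256_loadu_ps(a + i),
                                             _mm256_loadu_ps(b + i)));
    float lanes[8];
    _mm256_storeu_ps(lanes, acc256);
    for (int k = 0; k < 8; ++k)
        sum += lanes[k];
#endif

    // Scalar tail: leftover elements, or the whole array without AVX.
    for (; i < n; ++i)
        sum += a[i] * b[i];

    return sum;
}
```

    With 24 input elements, for example, an AVX512 build would handle the first 16 in the 512-bit loop and the last 8 in the 256-bit loop. The unaligned loads in the sketch mirror the follow-up item below about 32-byte alignment, since a 512-bit aligned load would need 64-byte alignment.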
    
    * dnn: use unaligned AVX512 load (OpenCV aligns data on a 32-byte boundary)
    
    * dnn: change "vecsize" condition for AVX512
    
    * dnn: fix indentation
Name
.github
3rdparty
apps
cmake
data
doc
include
modules
platforms
samples
.gitattributes
.gitignore
.tgitconfig
CMakeLists.txt
CONTRIBUTING.md
LICENSE
README.md