-
Arjan van de Ven authored
* Add a 512 bit codepath to the AVX512 fastConv function this patch adds a 512 wide codepath to the fastConv() function for AVX512 use. The basic idea is to process the first N * 16 elements of the vector with avx512, and then run the rest of the vector using the traditional AVX2 codepath. * dnn: use unaligned AVX512 load (OpenCV aligns data on 32-byte boundary) * dnn: change "vecsize" condition for AVX512 * dnn: fix indentation
a75840d1
Name |
Last commit
|
Last update |
---|---|---|
.github | ||
3rdparty | ||
apps | ||
cmake | ||
data | ||
doc | ||
include | ||
modules | ||
platforms | ||
samples | ||
.gitattributes | ||
.gitignore | ||
.tgitconfig | ||
CMakeLists.txt | ||
CONTRIBUTING.md | ||
LICENSE | ||
README.md |