modules/dnn/src · a75840d19c6ee1f04ec05e039db1869a4aeadaee · submodule / opencv

Merge pull request #10468 from fenrus75:avx512-2 · a75840d1

Arjan van de Ven authored Jan 31, 2018

* Add a 512 bit codepath to the AVX512 fastConv function

This patch adds a 512-bit-wide codepath to the fastConv() function for
AVX512 use.
The basic idea is to process the first N * 16 elements of the vector
with AVX512, and then run the rest of the vector through the traditional
AVX2 codepath.

* dnn: use unaligned AVX512 load (OpenCV aligns data on 32-byte boundary)

* dnn: change "vecsize" condition for AVX512

* dnn: fix indentation
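
A minimal sketch of the split described in the first commit above, written as a plain dot product rather than the actual fastConv() kernel. The function name `dotSplit` is hypothetical, and the preprocessor guards are an assumption made so that the example still compiles (falling back to the scalar loop) on builds without AVX512 or AVX2. It uses unaligned loads throughout, in the spirit of the second commit, since the buffers are not assumed to be 64-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__AVX2__) || defined(__AVX512F__)
#include <immintrin.h>
#endif

// Hypothetical helper, not the OpenCV implementation: dot product of two
// float arrays, split into a 16-wide head, an 8-wide middle and a scalar tail.
float dotSplit(const float* a, const float* b, std::size_t n)
{
    assert(a != nullptr && b != nullptr);
    std::size_t i = 0;
    float sum = 0.f;

#if defined(__AVX512F__)
    // Head: the first (n / 16) * 16 elements, 16 floats per iteration.
    // Unaligned loads, because 64-byte alignment is not assumed here.
    __m512 acc16 = _mm512_setzero_ps();
    for (; i + 16 <= n; i += 16)
        acc16 = _mm512_fmadd_ps(_mm512_loadu_ps(a + i),
                                _mm512_loadu_ps(b + i), acc16);
    float lanes16[16];
    _mm512_storeu_ps(lanes16, acc16);
    for (int k = 0; k < 16; ++k)
        sum += lanes16[k];
#endif

#if defined(__AVX2__)
    // Middle: whatever the 512-bit loop left over, 8 floats per iteration.
    __m256 acc8 = _mm256_setzero_ps();
    for (; i + 8 <= n; i += 8)
        acc8 = _mm256_add_ps(acc8, _mm256_mul_ps(_mm256_loadu_ps(a + i),
                                                 _mm256_loadu_ps(b + i)));
    float lanes8[8];
    _mm256_storeu_ps(lanes8, acc8);
    for (int k = 0; k < 8; ++k)
        sum += lanes8[k];
#endif

    // Tail: remaining elements (or everything, when no SIMD path is compiled in).
    for (; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}
```

With n = 45, for instance, an AVX512 build would handle elements 0..31 in the 512-bit loop, elements 32..39 in the 256-bit loop, and the last five in the scalar tail.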


Name
caffe
darknet
layers
ocl4dnn
opencl
tensorflow
torch
dnn.cpp
halide_scheduler.cpp
halide_scheduler.hpp
init.cpp
nms.cpp
nms.inl.hpp
op_halide.cpp
op_halide.hpp
precomp.hpp