Arjan van de Ven authored
This patch adds an AVX512-optimized fastConv, along with the hookups needed to call it from convolution_layer. The AVX512 fastConv is code-identical to the AVX2 version at the C level, but is measurably faster because AVX512 has more registers available for caching results.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2938860b
Name | Last commit | Last update |
---|---|---|
caffe | ||
darknet | ||
layers | ||
ocl4dnn | ||
opencl | ||
tensorflow | ||
torch | ||
dnn.cpp | ||
halide_scheduler.cpp | ||
halide_scheduler.hpp | ||
init.cpp | ||
nms.cpp | ||
nms.inl.hpp | ||
op_halide.cpp | ||
op_halide.hpp | ||
precomp.hpp |