-
Arjan van de Ven authored
This patch adds AVX512 optimized fastConv as well as the hookups needed to get these called in the convolution_layer. AVX512 fastConv is code-identical on a C level to the AVX2 one, but is measurably faster due to AVX512 having more registers available to cache results in. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2938860b
Name |
Last commit
|
Last update |
---|---|---|
.github | ||
3rdparty | ||
apps | ||
cmake | ||
data | ||
doc | ||
include | ||
modules | ||
platforms | ||
samples | ||
.gitattributes | ||
.gitignore | ||
.tgitconfig | ||
CMakeLists.txt | ||
CONTRIBUTING.md | ||
LICENSE | ||
README.md |