Vadim Pisarevsky authored
* some further optimizations and cleanups in dnn:
  + got rid of dnn::gemm; it's not perf critical anymore (perhaps)
  + embedded col2im functionality into convolution_layer.cpp, since it's not used anywhere else
  + parallel max pooling. even better performance can be achieved if we knew that max indices are not needed (and they are not needed in most networks)
  + somewhat optimized deconvolution layer: optimized bias addition (merged it with col2im), optimized col2im slightly.
  + hopefully fixed incorrect memory access in fully-connected layer; restored aligned memory reads (they should work fine now)
* hopefully fixed regressions in ENet performance
* fixed some typos in deconvolution; added SIMD optimization for the max pooling layer
* fixed warnings in SIMD-less build configuration
b593cae0
Name | Last commit | Last update |
---|---|---|
caffe | ||
layers | ||
opencl | ||
tensorflow | ||
torch | ||
dnn.cpp | ||
halide_scheduler.cpp | ||
halide_scheduler.hpp | ||
init.cpp | ||
op_halide.cpp | ||
op_halide.hpp | ||
precomp.hpp |