some further optimizations and cleanups in dnn (#1237)
* some further optimizations and cleanups in dnn:
  + got rid of dnn::gemm; it's not perf critical anymore (perhaps)
  + embedded col2im functionality into convolution_layer.cpp, since it's not used anywhere else
  + parallel max pooling. even better performance can be achieved if we knew that max indices are not needed (and they are not needed in most networks)
  + somewhat optimized deconvolution layer: optimized bias addition (merged it with col2im), optimized col2im slightly.
  + hopefully fixed incorrect memory access in fully-connected layer; restored aligned memory reads (they should work fine now)
* hopefully fixed regressions in ENet performance
* fixed some typos in deconvolution; added SIMD optimization for the max pooling layer
* fixed warnings in SIMD-less build configuration
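
To illustrate the remark about max indices, here is a minimal sketch of single-channel 2D max pooling in which the index output is optional. It is not the code from this commit: `maxPool2D`, its parameters, and the no-padding layout are assumptions made for illustration, and the loop is serial and scalar where the commit uses parallel and SIMD code.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not OpenCV's API): max pooling over one channel stored
// row-major, without padding. Pass indices == nullptr when the positions of the
// maxima are not needed.
std::vector<float> maxPool2D(const std::vector<float>& src, int height, int width,
                             int kernel, int stride,
                             std::vector<int>* indices = nullptr)
{
    assert(kernel > 0 && stride > 0 && height >= kernel && width >= kernel);
    assert(src.size() == static_cast<std::size_t>(height) * width);

    const int outH = (height - kernel) / stride + 1;
    const int outW = (width - kernel) / stride + 1;
    std::vector<float> dst(static_cast<std::size_t>(outH) * outW);
    if (indices)
        indices->assign(dst.size(), -1);

    // Every output row depends only on the input, so this outer loop is the
    // natural place to split the work across threads.
    for (int oy = 0; oy < outH; ++oy) {
        for (int ox = 0; ox < outW; ++ox) {
            float best = -std::numeric_limits<float>::infinity();
            int bestIdx = -1;
            for (int ky = 0; ky < kernel; ++ky) {
                for (int kx = 0; kx < kernel; ++kx) {
                    const int idx = (oy * stride + ky) * width + (ox * stride + kx);
                    if (src[idx] > best) {
                        best = src[idx];
                        bestIdx = idx;
                    }
                }
            }
            dst[oy * outW + ox] = best;
            // Only the store is skipped here; a tuned version would also drop
            // the bestIdx tracking and reduce the window with plain max ops.
            if (indices)
                (*indices)[oy * outW + ox] = bestIdx;
        }
    }
    return dst;
}
```

Calling `maxPool2D(src, h, w, 2, 2)` without the last argument returns the same pooled values as the call that also fills `indices`; only the extra output differs.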