• Fenglei's avatar
    nvgpu reduction optimization (#1455) · 6679c233
    Fenglei authored
    * add cuda reduce
    
    * clang format
    
    * fix bugs
    
    * fix bug
    
    * add 1d reduce
    
    * clang format
    
    * fix bugs
    
    * unroll loop
    
    * remove debug info
    
    * revert tests
    
    * unroll 1D reduce op
    
    * add comments
    
    * using cudnn for nd to scalar reduction
    
    * remove cuda 1d reduction since cudnn version is faster
    
    * remove 1D kernel
    
    * fix variable name
    
    * resolve Chris's comments
    
    * non_reduce_in_strides to non_reduce_strides
    6679c233
Name
Last commit
Last update
.ci Loading commit data...
cmake Loading commit data...
contrib/docker Loading commit data...
doc Loading commit data...
licenses Loading commit data...
maint Loading commit data...
python Loading commit data...
src Loading commit data...
test Loading commit data...
.clang-format Loading commit data...
.gitignore Loading commit data...
.gitmodules Loading commit data...
.travis.yml Loading commit data...
CMakeLists.txt Loading commit data...
CONTRIB.md Loading commit data...
INSTALL.md Loading commit data...
LICENSE Loading commit data...
README.md Loading commit data...
VERSION.in Loading commit data...
changes.md Loading commit data...