• nvgpu reduction optimization (#1455) · 6679c233
    Fenglei authored
    * add cuda reduce
    
    * clang format
    
    * fix bugs
    
    * fix bug
    
    * add 1d reduce
    
    * clang format
    
    * fix bugs
    
    * unroll loop
    
    * remove debug info
    
    * revert tests
    
    * unroll 1D reduce op
    
    * add comments
    
    * using cudnn for nd to scalar reduction
    
    * remove cuda 1d reduction since cudnn version is faster
    
    * remove 1D kernel
    
    * fix variable name
    
    * resolve Chris's comments
    
    * non_reduce_in_strides to non_reduce_strides
Name
ngraph
resource
tools
CMakeLists.txt