Fenglei authored
* add cuda reduce
* clang format
* fix bugs
* fix bug
* add 1d reduce
* clang format
* fix bugs
* unroll loop
* remove debug info
* revert tests
* unroll 1D reduce op
* add comments
* using cudnn for nd to scalar reduction
* remove cuda 1d reduction since cudnn version is faster
* remove 1D kernel
* fix variable name
* resolve Chris's comments
* non_reduce_in_strides to non_reduce_strides
6679c233
Name | Last commit | Last update |
---|---|---|
.ci | ||
cmake | ||
contrib/docker | ||
doc | ||
licenses | ||
maint | ||
python | ||
src | ||
test | ||
.clang-format | ||
.gitignore | ||
.gitmodules | ||
.travis.yml | ||
CMakeLists.txt | ||
CONTRIB.md | ||
INSTALL.md | ||
LICENSE | ||
README.md | ||
VERSION.in | ||
changes.md | ||