Chris Sullivan authored
* Add GPUKernelArgs for storing kernel arguments.
* Formatting.
* Resolve tensor addresses when extracting arg list via GPUKernelArgs.
* Updated arg list resolution so that placeholder arguments can be added anywhere in the argument list.
* const ref. args and changed add_args to use add_arg. Also expanded type_names map.
* GPUKernelArgs bug fix for return values.
* add_placeholders expects pointers for later resolution.
* Formatting.
* Add comments to GPUKernelArgs.
* Changed GPUKernelArgs interface to use a runtime variable number of arguments.
* Removed/updated comment.
* Address review comments: Remove combined address resolution and argument list retrieval. Remove unnecessary extra type entries in type_map.
* Add space between pragma once and includes.
* Broadcast optimization (#1322)
* Implement GPUKernelArgs with op::Broadcast.
* Removed excess type insertion in kernel signature for broadcast impl.
* Support new auto kernel signature generation for op::Broadcast. Add boolean to helpers to determine if parameters are registers or arrays.
* Removed commented code.
* Update broadcast impl. for new GPUKernelArgs interface.
* Updated based on interface change to GPUKernelArgs.
* Formatting.
* CUDNNHostParameters now implement GPUHostParameters. (#1324)
* Formatting.
8476dea0
Name | Last commit | Last update |
---|---|---|
.ci/travis/ubuntu | ||
cmake | ||
contrib/docker | ||
doc | ||
licenses | ||
maint | ||
python | ||
src | ||
test | ||
.clang-format | ||
.gitignore | ||
.gitmodules | ||
.travis.yml | ||
CMakeLists.txt | ||
CONTRIB.md | ||
INSTALL.md | ||
LICENSE | ||
README.md | ||
VERSION.in | ||
changes.md | ||