Chris Sullivan authored
* straightforward gpu.cos implementation following previous patterns prior to refactor
* Generalized unary elementwise gpu op impl. New unary elementwise ops can be added to the type annotations in gpu_cuda_kernel_ops.hpp. Next step is to refactor the llvm interface in gpu_emitters.hpp for similar generality.
* Added gpu_emitter.hpp:EmitUnaryElementwise. Function adds cuda kernel based on ngraph::op::op_type::description. This can service all unary elementwise ops run on the gpu.
* The following elementwise unary ops now use the EmitUnaryElementwise emitter:
  * GPU.abs
  * GPU.acos
  * GPU.asin
  * GPU.atan
  * GPU.ceiling
  * GPU.cos
  * GPU.cosh
  * GPU.exp
  * GPU.floor
  * GPU.log
  * GPU.not
  * GPU.sign
  * GPU.sin
  * GPU.sinh
  * GPU.tan
  * GPU.tanh

  Unary elementwise ops Sign and Not need extra consideration.
* tanh test changed to test::all_close for fp comparison (also done for tan in commit 65fa7c6de34c8277fe2a4801644f6bb64574f4ff).
* GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.
* code format update
* changed cuda builder interface names to unary/binary/arbitrary, added impl. note to gpu_cuda_kernel_ops, cleaned code format
* updated ngraph-cpp reference
* Fixing incorrect github conflict resolution.
* Added GPU emitter for op::Result. For now it simply copies the output tensor. All but 3 tests now pass. The remaining failing tests are:
  * GPU.dot_0_0
  * GPU.dot_matrix_2x0_0x2
  * GPU.dot_2x0_0
* Removed call to handle memory aliasing in gpu_external_function.
* fix gpu emitter bug that will return in the middle of function
* Merge pull request #609 from NervanaSystems/tfl/fix_return_bug: fix gpu emitter bug that will return in the middle of function
* GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.
529362b5
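The commit message describes one emitter that builds a CUDA kernel from an op's description string, so that every unary elementwise op shares one code path. The sketch below illustrates that idea only; it is not the code in gpu_emitter.hpp or gpu_cuda_kernel_ops.hpp. The names `unary_op_table` and `make_unary_kernel_source`, the `cuda_` kernel prefix, the float-only signature, and the four table entries are all assumptions made for the example.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical lookup from an op description to a CUDA math function.
// Illustrative only: these are not the real annotations in
// gpu_cuda_kernel_ops.hpp.
inline const std::map<std::string, std::string>& unary_op_table()
{
    static const std::map<std::string, std::string> table = {
        {"Abs", "fabsf"}, {"Cos", "cosf"}, {"Exp", "expf"}, {"Tanh", "tanhf"}};
    return table;
}

// Build CUDA C source for a float unary elementwise kernel named after
// the op. Throws std::invalid_argument for ops missing from the table.
inline std::string make_unary_kernel_source(const std::string& description)
{
    auto it = unary_op_table().find(description);
    if (it == unary_op_table().end())
    {
        throw std::invalid_argument("no unary kernel for op: " + description);
    }
    const std::string& fn = it->second;
    return "extern \"C\" __global__ void cuda_" + description +
           "(const float* in, float* out, size_t n)\n"
           "{\n"
           "    size_t tid = blockIdx.x * blockDim.x + threadIdx.x;\n"
           "    if (tid < n)\n"
           "    {\n"
           "        out[tid] = " + fn + "(in[tid]);\n"
           "    }\n"
           "}\n";
}
```

With a table like this, supporting a new op is a one-line table entry rather than a new emitter. Ops with no single matching math function, such as Sign and Not in the list above, would need a dedicated expression instead of a table lookup.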
Name | Last commit | Last update |
---|---|---|
cmake | ||
contrib/docker | ||
doc | ||
maint | ||
src | ||
test | ||
third-party | ||
.clang-format | ||
.gitignore | ||
CMakeLists.txt | ||
INSTALL | ||
LICENSE | ||
README.md | ||
changes.md |