- 14 Mar, 2018 3 commits
Chris Sullivan authored
* Added op::Relu and op::Not and enabled corresponding tests.
* Removed softmax for now.
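The two ops added in the commit above have simple elementwise reference semantics. The following is a minimal host-side sketch of those semantics (illustrative only, not the nGraph emitter code; the function names are my own):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Relu is max(x, 0) applied elementwise.
std::vector<float> relu(const std::vector<float>& in)
{
    std::vector<float> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   [](float x) { return std::max(x, 0.0f); });
    return out;
}

// Not flips each boolean element (stored here as one byte per element).
std::vector<uint8_t> logical_not(const std::vector<uint8_t>& in)
{
    std::vector<uint8_t> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   [](uint8_t x) { return static_cast<uint8_t>(x == 0); });
    return out;
}
```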
Fenglei authored
* add onehot op
* refactor broadcast and onehot op
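For reference, one-hot turns each index in the input into a row of the output that is all zeros except for a single one at that index. A minimal sketch of that semantics (row-major layout and the `one_hot` name are my own assumptions, not the nGraph kernel):

```cpp
#include <cstddef>
#include <vector>

// one_hot: index i in the input selects column i of row n in the output.
// Output shape is (indices.size() x depth), flattened row-major.
std::vector<float> one_hot(const std::vector<std::size_t>& indices,
                           std::size_t depth)
{
    std::vector<float> out(indices.size() * depth, 0.0f);
    for (std::size_t n = 0; n < indices.size(); ++n)
    {
        if (indices[n] < depth) // an out-of-range index leaves a zero row
        {
            out[n * depth + indices[n]] = 1.0f;
        }
    }
    return out;
}
```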
Robert Kimball authored
* Add cpio file read/write class and unit tests.
* Add reserializer.
* Add unit test for serializing constants to a cpio file.
* Fix bug in serializer when a function has no parameters.
- 13 Mar, 2018 2 commits
Chris Sullivan authored
* GPU elementwise emitters now respect input and output tensor types. This enables the use of binary comparison ops and op::Convert.
* Removed comments.
* All kernels now have a type signature even if the i/o tensors are of equivalent type, so that kernels for specific tensor types are unique. NGMX-391 #close
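The idea behind giving every kernel a full type signature can be illustrated with a small name-mangling helper: if the cached kernel name encodes every input/output tensor type, kernels compiled for different element types can never collide, even when all i/o types within one kernel are identical. This is a hypothetical sketch (the function name and naming scheme are mine, not nGraph's):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Build a unique kernel name from the op name plus every i/o tensor type,
// e.g. "add" over three float tensors becomes "add_float_float_float".
std::string kernel_signature(const std::string& op,
                             const std::vector<std::string>& io_types)
{
    std::ostringstream name;
    name << op;
    for (const auto& t : io_types)
    {
        name << "_" << t;
    }
    return name.str();
}
```

A comparison op such as greater-than naturally gets a mixed signature (float inputs, boolean-typed output), which is exactly the case uniform naming would have broken.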
Pruthvi authored
* Added pattern matcher for bprop sigmoid, mkldnn emitter code for sigmoid bprop, a fusion pass unit test for sigmoid bprop, and a style fix.
* Added test case for bprop sigmoid.
* Fixed sigmoid bprop test case failure.
* Fixed bprop unit test values for sigmoid.
* Style fix.
* Fix typo.
* Addressed PR comments: added layout assignment pass to ensure delta and input have the same layout for SigmoidBprop.
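The math behind the fused sigmoid backward pass: for s(x) = 1 / (1 + exp(-x)), the derivative is ds/dx = s(x) * (1 - s(x)), so backprop scales the incoming delta by that factor. A plain scalar sketch of the reference computation (not the mkldnn emitter code):

```cpp
#include <cmath>
#include <vector>

float sigmoid(float x)
{
    return 1.0f / (1.0f + std::exp(-x));
}

// Elementwise backprop: out[i] = delta[i] * s(x[i]) * (1 - s(x[i])).
std::vector<float> sigmoid_bprop(const std::vector<float>& x,
                                 const std::vector<float>& delta)
{
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        float s = sigmoid(x[i]);
        out[i] = delta[i] * s * (1.0f - s);
    }
    return out;
}
```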
- 11 Mar, 2018 3 commits
Robert Kimball authored
* fix detailed timing flag
* more detailed info
Robert Kimball authored
Jayaram Bobba authored
- 09 Mar, 2018 5 commits
Chris Sullivan authored
* Refactored unary elementwise ops into a single interface that is adaptable to elementwise ops with an arbitrary number of inputs.
* Renamed EmitUnaryElementwise -> EmitElementwise. Implemented the first binary elementwise op (Power).
* Refactored some of the boilerplate code for emitting cuda kernels to nvrtc out of the emit functions and into the CudaFunctionPool static singleton. CodeWriter now saves cuda kernels to ./gpu_codegen.
* Added ops Divide, Subtract & Sign to the GPU transformer. Subtract and Sign both use custom device helper functions, which have math kernels defined for the op in gpu_cuda_kernel_ops.hpp and which are built by a new get_device_helper function.
Louis Feng authored
fenglei.tian authored
Nick Korovaiko authored
Pruthvi authored
* Added sigmoid fusion pass and mkldnn emitter code for sigmoid.
* Corrected sigmoid expected values; added layout assignment for the sigmoid op.
* Added asserts in cpu fusion for sigmoid; style fix.
* Remove debug prints.
* NGMX-371 #comment Addressed PR comments: i) added sigmoid unit test case with 3D input; ii) support in cpu_emitter for sigmoid to handle all input shapes.
* NGMX-371 #comment Use shape_size() to calculate the 1d input size.
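The shape_size() trick mentioned above works because an elementwise op like sigmoid does not care about rank: the number of elements is simply the product of the dimensions, so any input can be treated as a flat 1-d buffer of that length. A sketch of the computation:

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Number of elements in a tensor = product of its dimensions.
// An empty shape (a scalar) yields 1.
std::size_t shape_size(const std::vector<std::size_t>& shape)
{
    return std::accumulate(shape.begin(), shape.end(),
                           static_cast<std::size_t>(1),
                           std::multiplies<std::size_t>());
}
```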
- 08 Mar, 2018 5 commits
fenglei.tian authored
fenglei.tian authored
Nick Korovaiko authored
* remove broadcast from matmulbias
* fix comments
* working gemm-based broadcast
* fix clang warning
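The trick behind a "gemm-based broadcast": multiplying a column of ones (M x 1) by a bias row (1 x N) produces an M x N matrix whose every row is the bias, so the broadcast can ride along an existing GEMM call instead of needing a separate copy op. Plain loops stand in for the BLAS call in this sketch (the function name is mine, not the nGraph code):

```cpp
#include <cstddef>
#include <vector>

// Broadcast a length-N bias row to an M x N matrix via the rank-1 product
// ones(M x 1) * bias(1 x N). Output is flattened row-major.
std::vector<float> broadcast_bias(const std::vector<float>& bias,
                                  std::size_t rows)
{
    std::size_t cols = bias.size();
    std::vector<float> out(rows * cols);
    for (std::size_t i = 0; i < rows; ++i)
    {
        for (std::size_t j = 0; j < cols; ++j)
        {
            out[i * cols + j] = 1.0f * bias[j]; // ones(i) * bias(j)
        }
    }
    return out;
}
```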
Chris Sullivan authored
* straightforward gpu.cos implementation following previous patterns prior to refactor
* Generalized unary elementwise gpu op impl. New unary elementwise ops can be added to the type annotations in gpu_cuda_kernel_ops.hpp. Next step is to refactor the llvm interface in gpu_emitters.hpp for similar generality.
* Added gpu_emitter.hpp:EmitUnaryElementwise. The function adds a cuda kernel based on ngraph::op::op_type::description. This can service all unary elementwise ops run on the gpu.
* The following elementwise unary ops now use the EmitUnaryElementwise emitter: GPU.abs, GPU.acos, GPU.asin, GPU.atan, GPU.ceiling, GPU.cos, GPU.cosh, GPU.exp, GPU.floor, GPU.log, GPU.not, GPU.sign, GPU.sin, GPU.sinh, GPU.tan, GPU.tanh. Unary elementwise ops Sign and Not need extra consideration.
* tanh test changed to test::all_close for fp comparison (also done for tan in commit 65fa7c6de34c8277fe2a4801644f6bb64574f4ff).
* GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.
* code format update
* changed cuda builder interface names to unary/binary/arbitrary, added impl. note to gpu_cuda_kernel_ops, cleaned code format
* updated ngraph-cpp reference
* Fixing incorrect github conflict resolution.
* Added GPU emitter for op::Result. For now it simply copies the output tensor. All but 3 tests now pass. The remaining failing tests are GPU.dot_0_0, GPU.dot_matrix_2x0_0x2, and GPU.dot_2x0_0.
* Removed call to handle memory aliasing in gpu_external_function.
* fix gpu emitter bug that would return in the middle of a function
* Merge pull request #609 from NervanaSystems/tfl/fix_return_bug: fix gpu emitter bug that would return in the middle of a function.
Chris Sullivan authored
* Added GPU emitter for op::Result. For now it simply copies the output tensor. All but 3 tests now pass. The remaining failing tests are GPU.dot_0_0, GPU.dot_matrix_2x0_0x2, and GPU.dot_2x0_0.
* Removed call to handle memory aliasing in gpu_external_function.
* fix gpu emitter bug that would return in the middle of a function
* Merge pull request #609 from NervanaSystems/tfl/fix_return_bug: fix gpu emitter bug that would return in the middle of a function.
* GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.
- 07 Mar, 2018 7 commits
Pruthvi authored
* Added support for the optimized bn mkldnn implementation in the cpu emitter; modified bn unit test to support the new implementation; added layout assignment for the bn op; style fix. (cherry picked from commit 7747a40806d62c126059d5c873adcd2e61a0adb0)
* Modified value initialization in cpu_fusion to be explicitly float. (cherry picked from commit 03499d380073d0197ab8cbd154eb03f63b042a48)
* Fix compilation issue.
* Addressed PR comments: added exception if the gamma and beta layout is not equal to memory::format::x; throw exception if the bn op is not an mkldnn op.
* Fix compilation issue.
* Added support to handle multiple o/ps in fprop bn fusion.
* Removed layout pass for bn; fixed autodiff bug in bn; added "Add" to the dispatcher in the cpu-layout pass.
* Style fix.
* Fix bprop batchnorm test with get_output_elements.
* Style fix.
Scott Cyphers authored
Louis Feng authored
Louis Feng authored
Louis Feng authored
Louis Feng authored
Louis Feng authored
- 06 Mar, 2018 5 commits
Jai Menon authored
* CPU: Padded Convolution fusion
* CPU: Non-reshaped fusion pattern for zero-padded convolutions
* CPU: Refactor consistency checks
* CPU: Rewrite hoisted reshape expression and add tests
* CPU: Merge leftovers
Nick Korovaiko authored
* generalize matmulbias; fixes; disable logging
* fix unit-test failures
Nick Korovaiko authored
* the first stab at op::Result; format fixes; disabling logging
* op::Result, 2nd attempt; purge stale code; disable logging; fix copyright header
* initial cleanup
* cleanup2
* remove dead code
* result.cpp, fix comments
* fix comment
Louis Feng authored
Fenglei authored
* add gpu broadcast
* add broadcast kernel
* fix bug in cuMemcpyDtoD usage in gpu_external_function.cpp
- 05 Mar, 2018 1 commit
Robert Kimball authored
* cleanup
* cleanup
* fix all headers to be standalone as far as includes go
* include cleanup
* cleanup includes
* cleanup
* include tester
* wip
* cleanup
* cleanup
* cleanup
- 02 Mar, 2018 6 commits
adstraw authored
add softmax op and documentation
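For reference, softmax normalizes a vector of scores into a probability distribution: softmax(x)_i = exp(x_i) / sum_j exp(x_j), conventionally computed with the maximum subtracted first so that large inputs do not overflow exp. A minimal sketch of that semantics (illustrative, not the nGraph implementation):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax over a 1-d vector: subtracting the max leaves
// the result unchanged (the factor exp(-m) cancels) but keeps exp() finite.
std::vector<float> softmax(const std::vector<float>& x)
{
    float m = *std::max_element(x.begin(), x.end());
    std::vector<float> out(x.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        out[i] = std::exp(x[i] - m);
        sum += out[i];
    }
    for (float& v : out)
    {
        v /= sum;
    }
    return out;
}
```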
Nick Korovaiko authored
* one output, multiple outputs: initial clean-up
* test clean-up; current version; tests pass
* clean up
* fix format
* add dbeta, dgamma asserts
* revert some files
* 0644 on node.cpp
* 0644 on mkldnn_utils.cpp
* 0644 on more files
* add support for serialization + test case
* fix merge errors
Sang Ik Lee authored
* Add aliased Constants to aliased_output test.
* add support for const as outputs
Robert Kimball authored
* named uniquely
* rename get_name methods
* add unit test for function name
* add node name unit test
* make unique name const
Robert Kimball authored
fenglei.tian authored
- 01 Mar, 2018 2 commits
Robert Kimball authored
* add unit test for resource deallocation; fix leak
* cleanup
Fenglei authored
* fix bug and enable some tests
* eliminate duplicated code, change some parameter names
- 28 Feb, 2018 1 commit
Robert Kimball authored
This fixes the previously broken compilation of ngraph-mxnet, which directly utilized serializer.hpp.