- 22 Mar, 2018 7 commits
-
-
Nick Korovaiko authored
* make matmulbias callback aware that addition is commutative
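
Since addition is commutative, a fusion callback looking for `Dot + bias` has to accept the operands in either order. A minimal sketch of that idea follows; the `Node` struct and the `match_matmul_bias` helper are hypothetical stand-ins for illustration, not nGraph's actual pattern-matcher API.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified graph node; not nGraph's real Node class.
struct Node
{
    std::string op;                          // e.g. "Add", "Dot", "Broadcast"
    std::vector<std::shared_ptr<Node>> args; // input nodes
};

using NodePtr = std::shared_ptr<Node>;

// Returns true if `n` is Add(Dot, x) or Add(x, Dot) and fills in `dot` and `bias`.
// Checking both operand positions is what makes the match order-independent.
bool match_matmul_bias(const NodePtr& n, NodePtr& dot, NodePtr& bias)
{
    if (!n || n->op != "Add" || n->args.size() != 2)
    {
        return false;
    }
    const NodePtr& lhs = n->args[0];
    const NodePtr& rhs = n->args[1];
    if (lhs && lhs->op == "Dot")
    {
        dot = lhs;
        bias = rhs;
        return true;
    }
    if (rhs && rhs->op == "Dot")
    {
        dot = rhs;
        bias = lhs;
        return true;
    }
    return false;
}
```

With this shape, `Add(Dot, Broadcast)` and `Add(Broadcast, Dot)` both yield the same `dot`/`bias` pair, so the callback body does not need to care about operand order.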
-
Adam Procter authored
-
Nick Korovaiko authored
* make sure deserializer doesn't add op::Result twice
-
Pruthvi authored
* Added new ctor for bn which supports inference
  - added mkldnn emitter code for bn inference
* Added test case for bn inference
  - added support for layout propagation for bn inference
* added sanity checks for gamma, beta, mean, variance shape in bn
* added serializer support for bn inference
-
Fenglei authored
* general dot for gpu
-
Chris Sullivan authored
* Current cudnn implementations use only a single dimension for the ngraph tensor data (width). In this case the tensor format should be set to CUDNN_TENSOR_NCHW so that adjacent memory accesses are coalesced (stride=1 for width).
* Added some kernel emitter helpers that are reused often.
* Renamed EmitElementwise -> emit_elementwise to match emit<T>.
* op::Sum now handles the trivial case of dim(input_tensor) = dim(output_tensor) by performing a memcpy, as no axes are reduced.
* Added a general case for Nd descriptors, which is used when the tensor has more than 4 dimensions. Currently a naive reduce is performed; in the future a coordinate transformation could be performed to improve the memory layout for the reduction.
* Switched to codegen::CodeWriter::block_begin/end. It appears that CodeWriter::block_begin/end is not frequently used for emitters (in cpu and gpu transformers) because a block comment is often desired. To this end I added prefix/suffix default parameters to CodeWriter::block_begin/end so that this functionality is captured.
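
The prefix/suffix idea for `block_begin`/`block_end` can be sketched roughly as below. `MiniWriter` is a hypothetical stand-in written for illustration, not the actual `codegen::CodeWriter`; in particular, placing the prefix and suffix on the same line as the brace is an assumption.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for a code writer; not the real codegen::CodeWriter.
class MiniWriter
{
public:
    // Opens a brace block. `prefix` (assumed placement: after the brace)
    // lets the caller attach a comment without a separate write.
    void block_begin(const std::string& prefix = "")
    {
        line(prefix.empty() ? std::string("{") : "{ " + prefix);
        ++m_indent;
    }

    // Closes a brace block, optionally followed by `suffix`.
    void block_end(const std::string& suffix = "")
    {
        --m_indent;
        line(suffix.empty() ? std::string("}") : "} " + suffix);
    }

    // Writes one line at the current indentation (4 spaces per level).
    void line(const std::string& text)
    {
        m_out << std::string(4 * m_indent, ' ') << text << "\n";
    }

    std::string str() const { return m_out.str(); }

private:
    std::ostringstream m_out;
    int m_indent = 0;
};

// Example use: emit a commented block around a single statement.
std::string emit_example()
{
    MiniWriter w;
    w.block_begin("// begin elementwise kernel");
    w.line("out[i] = in[i];");
    w.block_end("// end elementwise kernel");
    return w.str();
}
```

Default arguments keep existing call sites such as `block_begin()` unchanged, while emitters that want a block comment can pass one in place.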
-
Chris Sullivan authored
* Added backprop op for relu and enabled tests.
-
- 21 Mar, 2018 4 commits
-
-
Jayaram Bobba authored
-
Yixing Lao authored
Adjust CallFrame argument order to match Function
-
Robert Kimball authored
* rename directories to be consistent
* rename reference namespace to match directory
-
Jaikrishnan Menon authored
-
- 20 Mar, 2018 6 commits
-
-
Sandeep authored
* topological-sort based node clustering
* cmake builds
* Argon manager renamed to NNP along with placement
* nnp dir cmake changes
* tests pass
* more renames
* some more renames
* resolve redefinition
* revert to ARGON_API
* more PR comments and remove nnp-fusion tests as redundant
* update path
* fix format
-
Adam Procter authored
-
Nick Korovaiko authored
* global tracing
* fix compiler errors
* nan/inf validation
* 0644 on mkldnn_utils.cpp
* address Bob's feedback
* 0755 -> 0644
* remove format changes to python dir
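
A nan/inf validation pass usually amounts to scanning a tensor's buffer for non-finite values. The sketch below shows one possible form; the helper name `first_non_finite` and the flat `std::vector<float>` buffer are assumptions for illustration, not the code from this commit.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index of the first NaN or infinite value,
// or -1 if every element is finite.
long first_non_finite(const std::vector<float>& data)
{
    for (std::size_t i = 0; i < data.size(); ++i)
    {
        // std::isfinite is false for NaN, +inf and -inf.
        if (!std::isfinite(data[i]))
        {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

Returning the offending index rather than a plain boolean makes it easier to report which element of which op's output went bad.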
-
Nick Korovaiko authored
* add visualize option to nbench
* check for dot, amend help msg
-
Nick Korovaiko authored
* fix a segfault while printing shapes for multi-output ops
-
Nick Korovaiko authored
-
- 19 Mar, 2018 3 commits
-
-
Nick Korovaiko authored
-
Yixing Lao authored
-
Robert Kimball authored
-
- 18 Mar, 2018 1 commit
-
-
Nick Korovaiko authored
Contains multiple fixes to GetOutputElement, BatchNorm, autodiff, fprop_cache to integrate multi-output batchnorm and fprop_cache
-
- 17 Mar, 2018 1 commit
-
-
Jayaram Bobba authored
-
- 16 Mar, 2018 2 commits
- 15 Mar, 2018 5 commits
-
-
Robert Kimball authored
-
Jai Menon authored
-
Jayaram Bobba authored
-
Louis Feng authored
-
Robert Kimball authored
* add compile benchmark
* add help when error
-
- 14 Mar, 2018 5 commits
-
-
Nick Korovaiko authored
* rough draft but needs to use get_n to get the right input
* v2 fully working but hacky
* remove hacks; switch back build_users() to users()
* rollback hacks to node.cpp
* perms, remove prints, format
-
Chris Sullivan authored
* Added op::Relu and op::Not and enabled corresponding tests.
* Removed softmax for now.
-
Fenglei authored
* add onehot op
* refactor broadcast and onehot op
-
Chris Sullivan authored
* Added corresponding cudaFree to the cudaMalloc for the cuda pool_base_ptr memory buffer.
* Check for temporary buffer allocation prior to freeing. Add null check on cudaFree.
-
Robert Kimball authored
* Add cpio file read/write class and unit tests
* Add reserializer
* Add unit test for serializing constants to cpio file.
* Fix bug in serializer if function has no parameters.
-
- 13 Mar, 2018 6 commits
-
-
Jayaram Bobba authored
-
Robert Kimball authored
-
Jayaram Bobba authored
-
Chris Sullivan authored
* GPU elementwise emitters now respect input and output tensor types. This enables the use of binary comparison ops and op::Convert.
* Removed comments.
* All kernels now have a type signature, even if the i/o tensors are of equivalent type, so that kernels for specific tensor types are unique.
NGMX-391 #close
-
Pruthvi authored
* Fix bn constructor
  - assert if gamma or beta don't have rank 1
  - remove redundant checks
* added guards to check that the input and delta shapes passed to the mkldnn bn fprop and bprop ops have a rank of 4
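
The rank checks described here can be illustrated with a small validation helper. This is a sketch under stated assumptions: `Shape` is a simplified stand-in for a tensor shape, `batchnorm_shape_error` is a hypothetical name, and the channel-count comparison assumes an NCHW layout with channels on axis 1, which the commit message does not state.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for a tensor shape (one extent per dimension).
using Shape = std::vector<std::size_t>;

// Hypothetical helper: returns an empty string if the shapes are acceptable
// for batch norm, otherwise a short description of the first problem found.
std::string batchnorm_shape_error(const Shape& input, const Shape& gamma, const Shape& beta)
{
    if (input.size() != 4)
    {
        return "input must have rank 4";
    }
    if (gamma.size() != 1 || beta.size() != 1)
    {
        return "gamma and beta must have rank 1";
    }
    // Assumption: channels live on axis 1 (NCHW layout).
    if (gamma[0] != input[1] || beta[0] != input[1])
    {
        return "gamma and beta length must equal the channel count";
    }
    return "";
}
```

A constructor could assert on the returned string being empty, which keeps the checks in one place for both the fprop and bprop paths.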
-
Chris Sullivan authored
* Updated namespace use in cpp files.
-