- 14 Mar, 2018 4 commits
-
Fenglei authored
* Add onehot op
* Refactor broadcast and onehot op
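For intuition, a one-hot op expands an index tensor into a 0/1 tensor with one extra axis of size depth. A minimal reference sketch (names hypothetical, not the op's actual implementation):

    #include <cstddef>
    #include <vector>

    // Map each index to a row with a single 1.0 at that position.
    // Assumes every index is < depth.
    std::vector<float> one_hot(const std::vector<std::size_t>& indices, std::size_t depth)
    {
        std::vector<float> out(indices.size() * depth, 0.0f);
        for (std::size_t i = 0; i < indices.size(); ++i)
        {
            out[i * depth + indices[i]] = 1.0f;
        }
        return out;
    }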
-
Chris Sullivan authored
* Added a corresponding cudaFree for the cudaMalloc of the CUDA pool_base_ptr memory buffer.
* Check for temporary buffer allocation prior to freeing; add a null check on cudaFree.
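A minimal sketch of the guarded allocate/free pairing described above (class and member names hypothetical): the pool buffer is freed only if it was actually allocated.

    #include <cuda_runtime.h>
    #include <cstddef>

    class GPUMemoryPool
    {
    public:
        void allocate(std::size_t bytes)
        {
            if (bytes > 0)
            {
                cudaMalloc(&m_pool_base_ptr, bytes);
            }
        }

        ~GPUMemoryPool()
        {
            // Check for a temporary buffer allocation prior to freeing;
            // cudaFree(nullptr) is a no-op, but the explicit null check
            // documents the intent.
            if (m_pool_base_ptr != nullptr)
            {
                cudaFree(m_pool_base_ptr);
                m_pool_base_ptr = nullptr;
            }
        }

    private:
        void* m_pool_base_ptr = nullptr;
    };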
-
Robert Kimball authored
* Add cpio file read/write class and unit tests
* Add reserializer
* Add unit test for serializing constants to a cpio file
* Fix bug in the serializer when a function has no parameters
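For context, cpio's portable ASCII ("odc") variant is a simple headers-plus-data stream, which makes it a convenient container for serialized constants. A rough sketch of writing one entry (function names hypothetical; field layout per the odc format as I understand it, not this commit's actual code):

    #include <cstddef>
    #include <cstdio>
    #include <string>

    // odc headers store numeric fields as fixed-width octal text.
    static void write_octal(std::FILE* f, unsigned long value, int width)
    {
        std::fprintf(f, "%0*lo", width, value);
    }

    void write_cpio_entry(std::FILE* f, const std::string& name,
                          const char* data, std::size_t size)
    {
        std::fputs("070707", f);            // c_magic for the odc format
        write_octal(f, 0, 6);               // c_dev
        write_octal(f, 0, 6);               // c_ino
        write_octal(f, 0100644, 6);         // c_mode: regular file
        write_octal(f, 0, 6);               // c_uid
        write_octal(f, 0, 6);               // c_gid
        write_octal(f, 1, 6);               // c_nlink
        write_octal(f, 0, 6);               // c_rdev
        write_octal(f, 0, 11);              // c_mtime
        write_octal(f, name.size() + 1, 6); // c_namesize, includes NUL
        write_octal(f, size, 11);           // c_filesize
        std::fwrite(name.c_str(), 1, name.size() + 1, f);
        std::fwrite(data, 1, size, f);
    }

    // A complete archive ends with an empty entry named "TRAILER!!!".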
-
Jayaram Bobba authored
Jbobba/mkldnn v0.13
-
- 13 Mar, 2018 13 commits
-
Jayaram Bobba authored
-
Jayaram Bobba authored
-
Jayaram Bobba authored
-
Robert Kimball authored
-
Jayaram Bobba authored
-
Chris Sullivan authored
* GPU elementwise emitters now respect input and output tensor types. This enables the use of binary comparison ops and op::Convert.
* Removed comments.
* All kernels now carry a type signature, even when the input and output tensors have the same type, so that kernels for specific tensor types are unique. NGMX-391 #close
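A sketch of how per-type-unique kernel names might be formed (helper name hypothetical): the element types of all inputs and outputs are mangled into the kernel name, so tensors of different types never share a compiled kernel.

    #include <string>
    #include <vector>

    // e.g. kernel_name("add", {"float", "float", "float"})
    //      -> "add_float_float_float"
    std::string kernel_name(const std::string& op,
                            const std::vector<std::string>& dtypes)
    {
        std::string name = op;
        for (const auto& t : dtypes)
        {
            name += "_" + t;
        }
        return name;
    }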
-
Pruthvi authored
* Fix bn constructor:
  - assert if gamma or beta don't have rank 1
  - remove redundant checks
* Added guards to check that the input and delta shapes to the mkldnn bn fprop and bprop ops have rank 4
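A minimal sketch of the kind of rank guards described (names hypothetical): gamma and beta must be rank 1, and the mkldnn batchnorm fprop/bprop inputs rank 4.

    #include <cstddef>
    #include <stdexcept>
    #include <vector>

    using Shape = std::vector<std::size_t>;

    void check_bn_args(const Shape& gamma, const Shape& beta,
                       const Shape& input, const Shape& delta)
    {
        if (gamma.size() != 1 || beta.size() != 1)
        {
            throw std::invalid_argument("gamma and beta must have rank 1");
        }
        // mkldnn batchnorm fprop/bprop expect 4D (e.g. NCHW) tensors.
        if (input.size() != 4 || delta.size() != 4)
        {
            throw std::invalid_argument("input and delta must have rank 4");
        }
    }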
-
Fenglei authored
gpu dot bug fix for bprop
-
fenglei.tian authored
-
Chris Sullivan authored
* Updated namespace use in cpp files.
-
Fenglei authored
-
Pruthvi authored
* Added:
  - pattern matcher for bprop sigmoid
  - mkldnn emitter code for sigmoid bprop
  - fusion pass unit test for sigmoid bprop
  - style fix
* Added test case for bprop sigmoid
* Fixed sigmoid bprop test case failure
* Fixed bprop unit test values for sigmoid
* Style fix
* Fix typo
* Addressed PR comments: added layout assignment pass to ensure delta and input have the same layout for SigmoidBprop
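The math being fused here: for s = sigmoid(x), the derivative is s * (1 - s), so the backprop output is delta * s * (1 - s). A scalar reference version for intuition (not the fused mkldnn kernel):

    #include <cmath>

    float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

    float sigmoid_bprop(float x, float delta)
    {
        float s = sigmoid(x);
        return delta * s * (1.0f - s);
    }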
-
fenglei.tian authored
-
- 12 Mar, 2018 7 commits
-
Fenglei authored
-
Jayaram Bobba authored
Batchnorm bprop layouts and move the last few mkldnn ops to mkldnn_emitter
-
Jayaram Bobba authored
-
fenglei.tian authored
-
fenglei.tian authored
-
fenglei.tian authored
-
Christian Convey authored
-
- 11 Mar, 2018 6 commits
-
Robert Kimball authored
* Fix detailed timing flag
* More detailed info
-
Robert Kimball authored
-
Robert Kimball authored
Use op::Constant's data rather than emitting the data in the generated cpp code. This makes compile times for trained models roughly 100x faster. (#624)
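The reason this helps: a trained model's weights, emitted as C++ array literals, force the compiler to parse megabytes of numbers; handing the generated function a pointer to op::Constant's existing buffer avoids that entirely. A hypothetical illustration of the "after" shape (names are not the actual codegen API):

    #include <cstddef>
    #include <vector>

    // The compiled function receives the constants' existing buffers
    // (rather than source text containing the values), so the C++
    // compiler never parses the weight data at all.
    const float* bind_constant(const std::vector<const void*>& constants,
                               std::size_t index)
    {
        return static_cast<const float*>(constants[index]);
    }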
-
Jayaram Bobba authored
-
Jayaram Bobba authored
-
Jayaram Bobba authored
-
- 10 Mar, 2018 9 commits
-
Jayaram Bobba authored
-
Jayaram Bobba authored
Add mkldnn layouts to Maxpool and Maxpoolbackprop
-
Jayaram Bobba authored
-
Jayaram Bobba authored
-
Jayaram Bobba authored
-
Jayaram Bobba authored
-
fenglei.tian authored
-
fenglei.tian authored
-
fenglei.tian authored
-
- 09 Mar, 2018 1 commit
-
Chris Sullivan authored
* Refactored unary elementwise ops into a single interface that is adaptable to elementwise ops with an arbitrary number of inputs.
* Renamed EmitUnaryElementwise -> EmitElementwise. Implemented the first binary elementwise op (Power).
* Refactored some of the boilerplate code for emitting cuda kernels to nvrtc out of the emit functions and into the CudaFunctionPool static singleton. CodeWriter now saves cuda kernels to ./gpu_codegen.
* Added ops Divide, Subtract & Sign to the GPU transformer. Subtract and Sign both use custom device helper functions, which have math kernels defined for the op in gpu_cuda_kernel_ops.hpp and are built by a new get_device_helper function.
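A sketch of what a device helper for op::Sign might look like (function name hypothetical), following the mechanism described above of emitting CUDA source strings that are then compiled with nvrtc:

    #include <string>

    // Returns CUDA source for a small __device__ math helper that the
    // generated elementwise kernel body can call per element.
    std::string get_sign_helper()
    {
        return "__device__ float sign_helper(float x)\n"
               "{\n"
               "    return (x > 0.0f) - (x < 0.0f);\n"
               "}\n";
    }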
-