Commits · 490e4e634322d99209f912269a7eba912444fb30 · submodule / ngraph

13 Mar, 2018 1 commit

Pruthvi authored Mar 13, 2018

* - Added pattern matcher for bprop sigmoid
- mkldnn emitter code for sigmoid bprop
- Fusion pass unit test for sigmoid bprop
- style fix

* Added test case for bprop sigmoid

* fixed sigmoid bprop test case failure

* fixed bprop unit test values for sigmoid

* style fix

* fix typo

* Addressed PR comments
- added layout assignment pass to ensure delta and input have same layout for SigmoidBprop

490e4e63
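
For reference, the gradient that a sigmoid backprop op computes follows from the identity d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)). The following is a minimal plain-Python sketch of that math only; it is not the MKLDNN emitter code from this commit, and the function names `sigmoid` and `sigmoid_bprop` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_bprop(inputs, delta):
    """Elementwise gradient: delta * s * (1 - s), where s = sigmoid(input).

    `inputs` and `delta` are flat sequences of equal length (an assumption
    made for this sketch; a real backend would also agree on memory layout).
    """
    if len(inputs) != len(delta):
        raise ValueError("inputs and delta must have the same number of elements")
    result = []
    for x, d in zip(inputs, delta):
        s = sigmoid(x)
        result.append(d * s * (1.0 - s))
    return result
```

Because the gradient multiplies `delta` and the forward input element by element, both tensors have to be traversed in the same order, which is presumably why the commit assigns them the same layout.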

12 Mar, 2018 3 commits
- Merge pull request #623 from NervanaSystems/jbobba/batchnorm-inference · 34a8b27d
  Jayaram Bobba authored Mar 12, 2018
```
Batchnorm bprop layouts and move the last few mkldnn ops to mkldnn_emitter
```
  34a8b27d
- Merge branch 'master' into jbobba/batchnorm-inference · a95fe1ff
  Jayaram Bobba authored Mar 12, 2018
  
  a95fe1ff
- Fixes compile errors on gcc 7.3. (#628) · 41a883b1
  Christian Convey authored Mar 12, 2018
  
  41a883b1

11 Mar, 2018 6 commits
- Enhancements to nbench (#622) · 33140eff
  Robert Kimball authored Mar 11, 2018
```
* fix detailed timing flag

* more detailed info
```
  33140eff
- update skip macro to output only if test is disabled (#626) · c7d6d7f1
  Robert Kimball authored Mar 11, 2018
  
  c7d6d7f1
- use op::Constant's data rather than emitting the data in the generated cpp code.… · 36a1d96f
  Robert Kimball authored Mar 11, 2018
```
use op::Constant's data rather than emitting the data in the generated cpp code. This makes compile times for trained models something like 100x faster. (#624)
```
  36a1d96f
- Fix to matmul bias column broadcast and modified unit tests (#627) · 2f8b19a8
  Jayaram Bobba authored Mar 11, 2018
  
  2f8b19a8
- Style fix · 30135ca3
  Jayaram Bobba authored Mar 11, 2018
  
  30135ca3
- Remove mkldnn preamble from generated code. No longer needed with mkldnn emitter · 815d84a4
  Jayaram Bobba authored Mar 11, 2018
  
  815d84a4

10 Mar, 2018 6 commits
- Merge branch 'master' into jbobba/batchnorm-inference · cf770aa5
  Jayaram Bobba authored Mar 10, 2018
  
  cf770aa5
- Merge pull request #606 from NervanaSystems/jbobba/maxpool-layouts · 8520e846
  Jayaram Bobba authored Mar 10, 2018
```
Add mkldnn layouts to Maxpool and Maxpoolbackprop
```
  8520e846
- Move Relu backprop to MKLDNN emitter · 7ce15121
  Jayaram Bobba authored Mar 10, 2018
  
  7ce15121
- Merge remote-tracking branch 'origin/master' into jbobba/batchnorm-inference · a5e29489
  Jayaram Bobba authored Mar 10, 2018
  
  a5e29489
- Added batchnorm bprop layouts and moved batchnorm ops to mkldnn emitter · da3184ec
  Jayaram Bobba authored Mar 10, 2018
  
  da3184ec
- Merge branch 'master' into jbobba/maxpool-layouts · f521db20
  Jayaram Bobba authored Mar 10, 2018
  
  f521db20

09 Mar, 2018 10 commits

Adding support for GPU elementwise ops for arbitrarily many inputs (#618) · 89da71d3

Chris Sullivan authored Mar 09, 2018

* Refactored unary elementwise ops into a single interface
that is adaptable to elementwise ops with arbitrary number of inputs.

* Renamed EmitUnaryElementwise -> EmitElementwise.
Implemented first binary elementwise op (Power).

* Refactored some of the boilerplate code for emitting cuda kernels to nvrtc
out of the emit functions and into the CudaFunctionPool static singleton.
CodeWriter now saves cuda kernels to ./gpu_codegen.

* Added ops Divide, Subtract & Sign to the GPU transformer.
Subtract and Sign both use custom device helper functions which
have math kernels defined for the op in gpu_cuda_kernel_ops.hpp,
and which are built by a new get_device_helper function.

89da71d3

Merge pull request #540 from NervanaSystems/louisfeng/NGMX-296-conv_bias · 7ab47c2e
Louis Feng authored Mar 09, 2018
```
NGMX-296 Convolution + Bias with MKLDNN
```
7ab47c2e

Merge branch 'master' into louisfeng/NGMX-296-conv_bias · a5476da6
Louis Feng authored Mar 09, 2018

a5476da6

clang format · 362bb996
Louis Feng authored Mar 09, 2018

362bb996

Merge branch 'master' into louisfeng/NGMX-296-conv_bias · 1b3940eb
Louis Feng authored Mar 09, 2018
```
Also fixed conv+bias cpu layout bugs.
```
1b3940eb

mkldnn emitter - fixed memory leak. (#608) · 0acaea58
Louis Feng authored Mar 09, 2018
```
* fixed memory leak.

* clang format.
```
0acaea58

gpu emitter using template function (#610) · 95312b8e
Fenglei authored Mar 09, 2018
```
* update gpu_emitter use template

* add template
```
95312b8e

type_prop tests for batchnorm bprop (#601) · b3d2ff59
Nick Korovaiko authored Mar 09, 2018

b3d2ff59

Eliminate redundant copies due to op::Result (#612) · 4fc1a478

Nick Korovaiko authored Mar 09, 2018

* removing extra copies due to op::Result

* remove comment

* fix comment

* switch to a flag version

* add copyright header #pragma once

* add impl file, rename result_elimination.hpp to result_copy_elimination.hpp to match the opt name

* add cpp suffix to result_copy_elimination

* use in-class member init

4fc1a478

Pruthvi/sigmoid (#614) · 5885c09a

Pruthvi authored Mar 09, 2018

* - Added sigmoid fusion pass
- added mkldnn emitter code for sigmoid

* - corrected sigmoid expected values
- add layout assignment for sigmoid op

* - added asserts in cpu fusion for sigmoid
- style fix

* remove debug prints

* NGMX-371 #comment addressed PR comments - i) Added sigmoid unit test case with 3D input ii) support in cpu_emitter for sigmoid to handle all input shapes

* NGMX-371 #comment use shape_size() to calculate the 1d input size

5885c09a
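
The last bullet of this commit treats an input of any rank as a flat 1D buffer whose length is the product of its dimensions. A minimal sketch of that calculation, assuming a shape is given as a sequence of non-negative integers; the Python function below only mirrors the name `shape_size()` and is not ngraph's implementation:

```python
from functools import reduce
import operator


def shape_size(shape):
    """Number of elements in a tensor of the given shape.

    The product over an empty shape is 1 (a scalar holds one element),
    and any zero-length dimension yields 0 elements.
    """
    return reduce(operator.mul, shape, 1)
```

For example, a 3D input of shape (2, 3, 4) would be handled as a 1D buffer of `shape_size((2, 3, 4))` elements, so the same elementwise kernel serves every input rank.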

08 Mar, 2018 14 commits

Style fix · a06f4520
Jayaram Bobba authored Mar 08, 2018

a06f4520

Merge remote-tracking branch 'origin/master' into jbobba/maxpool-layouts · 518bba03
Jayaram Bobba authored Mar 08, 2018

518bba03

Removed extraneous debug statement · eed7b313
Jayaram Bobba authored Mar 08, 2018

eed7b313

Merge pull request #613 from NervanaSystems/jbobba/batchnorm-layouts · e46184a1
Jayaram Bobba authored Mar 08, 2018
```
Jbobba/batchnorm layouts
```
e46184a1

Merge remote-tracking branch 'origin/master' into jbobba/batchnorm-layouts · a94d46d4
Jayaram Bobba authored Mar 08, 2018

a94d46d4

Fix copy/paste error on batchnorm op assignment · d6000754
Jayaram Bobba authored Mar 08, 2018

d6000754

added conv+bias to cpu layout pass. · 124d48ba
Louis Feng authored Mar 08, 2018

124d48ba

Use size_t to index into Node inputs/outputs · 9af9031e
Jayaram Bobba authored Mar 08, 2018

9af9031e

Optimize Broadcast in MatMulBias (#604) · 9cca4073

Nick Korovaiko authored Mar 08, 2018

* remove broadcast from matmulbias

* fix comments

* working gemm-based broadcast

* fix clang warning

9cca4073
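
One common way to realize a "gemm-based broadcast" is to multiply a column of ones by the bias row, which yields a matrix whose every row equals the bias, and then accumulate the actual product on top of it. The sketch below illustrates that idea in plain Python under the assumption that the bias is broadcast across rows; it is not taken from the commit, and `matmul` and `matmul_bias` are hypothetical names.

```python
def matmul(a, b):
    """Naive GEMM on lists of rows: returns the matrix product a @ b."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def matmul_bias(a, b, bias):
    """Compute a @ b + bias, broadcasting `bias` to every row via a GEMM.

    ones (m x 1) times bias (1 x n) gives an m x n matrix of repeated bias
    rows, so no separate broadcast step is needed.
    """
    m = len(a)
    ones = [[1.0] for _ in range(m)]
    broadcast = matmul(ones, [list(bias)])
    product = matmul(a, b)
    return [
        [p + q for p, q in zip(p_row, b_row)]
        for p_row, b_row in zip(product, broadcast)
    ]
```

With a BLAS-style GEMM, the second multiplication would typically accumulate into the same output buffer (beta = 1) rather than allocating a separate matrix as this sketch does.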

Merge branch 'master' into jbobba/maxpool-layouts · 4203a832
Jayaram Bobba authored Mar 08, 2018

4203a832

Optimize MKLDNN filter conversions · bb06a619
Jayaram Bobba authored Mar 08, 2018

bb06a619

Add layout propagation to MKLDNN batchnorm and GetOutputElement · 0f9119c8
Jayaram Bobba authored Mar 08, 2018

0f9119c8

Abstraction for GPU unary elementwise ops (#587) · 529362b5

Chris Sullivan authored Mar 08, 2018

* straightforward gpu.cos implementation following previous patterns prior to refactor

* Generalized unary elementwise gpu op impl. New unary elementwise ops can
be added to the type annotations in gpu_cuda_kernel_ops.hpp. Next step
is to refactor the llvm interface in gpu_emitters.hpp for similar generality.

* Added gpu_emitter.hpp:EmitUnaryElementwise.

Function adds cuda kernel based on ngraph::op::op_type::description.
This can service all unary elementwise ops run on the gpu.

* The following elementwise unary ops now use the EmitUnaryElementwise emitter:
* GPU.abs
* GPU.acos
* GPU.asin
* GPU.atan
* GPU.ceiling
* GPU.cos
* GPU.cosh
* GPU.exp
* GPU.floor
* GPU.log
* GPU.not
* GPU.sign
* GPU.sin
* GPU.sinh
* GPU.tan
* GPU.tanh
Unary elementwise ops Sign and Not need extra consideration.

* tanh test changed to test::all_close for fp comparison (also done for tan in commit 65fa7c6de34c8277fe2a4801644f6bb64574f4ff).

* GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.

* code format update

* changed cuda builder interface names to unary/binary/arbitrary, added impl. note to gpu_cuda_kernel_ops, cleaned code format

* updated ngraph-cpp reference

* Fixing incorrect github conflict resolution.

* Added GPU emitter for op::Result.
For now it simply copies the output tensor.

All but 3 tests now pass. The remaining
failing tests are:
* GPU.dot_0_0
* GPU.dot_matrix_2x0_0x2
* GPU.dot_2x0_0

* Removed call to handle memory aliasing in gpu_external_function.

* fix gpu emitter bug that will return in the middle of function

* Merge pull request #609 from NervanaSystems/tfl/fix_return_bug

fix gpu emitter bug that will return in the middle of function

* GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.

529362b5
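
The commit above switches the tanh test to `test::all_close` because floating-point results from different backends rarely match bit for bit. A minimal sketch of such a tolerance check, assuming the usual combined relative/absolute criterion |a - b| <= atol + rtol * |b|; the default tolerances here are illustrative and are not claimed to be ngraph's:

```python
def all_close(a, b, rtol=1e-5, atol=1e-8):
    """True if every pair of elements agrees within the given tolerances.

    `a` holds computed values and `b` the expected values; both are flat
    sequences of equal length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))
```

An exact comparison such as `[0.1 + 0.2] == [0.3]` fails because of rounding, while `all_close([0.1 + 0.2], [0.3])` accepts the result.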

Merge pull request #599 from NervanaSystems/tfl/gpu_fix_constant_bug · a02aab01
Fenglei authored Mar 08, 2018
```
Fix constant bug on GPU
```
a02aab01