- 12 Mar, 2018 1 commit
-
fenglei.tian authored
-
- 10 Mar, 2018 3 commits
-
fenglei.tian authored
-
fenglei.tian authored
-
fenglei.tian authored
-
- 09 Mar, 2018 12 commits
-
Chris Sullivan authored
* Refactored unary elementwise ops into a single interface that is adaptable to elementwise ops with an arbitrary number of inputs.
* Renamed EmitUnaryElementwise -> EmitElementwise. Implemented the first binary elementwise op (Power).
* Refactored some of the boilerplate code for emitting CUDA kernels to NVRTC out of the emit functions and into the CudaFunctionPool static singleton. CodeWriter now saves CUDA kernels to ./gpu_codegen.
* Added ops Divide, Subtract & Sign to the GPU transformer. Subtract and Sign both use custom device helper functions, which have math kernels defined for the op in gpu_cuda_kernel_ops.hpp and which are built by a new get_device_helper function.
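A minimal, hypothetical sketch of the pattern this commit describes — a per-op trait naming either a CUDA builtin or a custom math kernel, plus a get_device_helper-style builder. CudaOpMap, its member names, and the helper signature are assumptions for illustration, not ngraph's exact definitions:

```cpp
#include <iostream>
#include <string>

namespace op
{
    struct Power;
    struct Sign;
}

// Per-op annotation: either a CUDA math builtin, or a custom helper whose
// body is given as a math kernel expression (hypothetical trait layout).
template <typename T>
struct CudaOpMap;

template <>
struct CudaOpMap<op::Power>
{
    static constexpr const char* op = "powf"; // CUDA builtin, no helper needed
    static constexpr const char* math_kernel = nullptr;
};

template <>
struct CudaOpMap<op::Sign>
{
    static constexpr const char* op = "sign"; // custom device helper
    static constexpr const char* math_kernel = "(x > 0) - (x < 0)";
};

// Wrap a math kernel in a __device__ function definition for inclusion in
// generated CUDA source (roughly what a get_device_helper would produce).
std::string get_device_helper(const std::string& name, const std::string& body)
{
    return "__device__ float " + name + "(float x)\n{\n    return " + body +
           ";\n}\n";
}

int main()
{
    std::cout << get_device_helper(CudaOpMap<op::Sign>::op,
                                   CudaOpMap<op::Sign>::math_kernel);
}
```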
-
Louis Feng authored
NGMX-296 Convolution + Bias with MKLDNN
-
Louis Feng authored
-
Louis Feng authored
-
Louis Feng authored
Also fixed conv+bias CPU layout bugs.
-
Louis Feng authored
* Fixed memory leak.
* clang format.
-
Fenglei authored
-
Fenglei authored
* Updated gpu_emitter to use templates.
* Added template.
-
fenglei.tian authored
-
Nick Korovaiko authored
-
Nick Korovaiko authored
* Removing extra copies due to op::Result.
* Remove comment.
* Fix comment.
* Switch to a flag version.
* Add copyright header and #pragma once.
* Add impl file; rename result_elimination.hpp to result_copy_elimination.hpp to match the opt name.
* Add .cpp suffix to result_copy_elimination.
* Use in-class member init.
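A minimal sketch of what such a copy-elimination pass could look like, assuming simplified stand-in node/graph types rather than ngraph's actual pass API:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Simplified stand-ins for graph nodes; ngraph's real types differ.
struct Node
{
    bool is_result = false;
    bool needs_copy = true; // the "flag version": Result copies by default
    Node* input = nullptr;  // value feeding this node
    size_t user_count = 0;  // number of nodes consuming this node's output
};

// If the value feeding a Result has no other consumers, the Result can
// alias it in place of performing an extra copy, so clear the flag.
void eliminate_result_copies(std::vector<std::unique_ptr<Node>>& graph)
{
    for (auto& n : graph)
    {
        if (n->is_result && n->input && n->input->user_count == 1)
        {
            n->needs_copy = false;
        }
    }
}
```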
-
Pruthvi authored
* Added sigmoid fusion pass; added MKLDNN emitter code for sigmoid.
* Corrected sigmoid expected values; added layout assignment for sigmoid op.
* Added asserts in cpu fusion for sigmoid; style fix.
* Removed debug prints.
* NGMX-371 #comment Addressed PR comments: i) added sigmoid unit test case with 3D input; ii) support in cpu_emitter for sigmoid to handle all input shapes.
* NGMX-371 #comment Use shape_size() to calculate the 1-D input size.
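The shape_size() idea in the last bullet — flattening any input rank to a single element count so one sigmoid loop handles all shapes — can be sketched as follows; this is a reimplementation for illustration, not ngraph's actual helper:

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

using Shape = std::vector<size_t>;

// Product of all dimensions; an empty shape (a scalar) yields 1.
size_t shape_size(const Shape& shape)
{
    return std::accumulate(shape.begin(), shape.end(), size_t{1},
                           std::multiplies<size_t>());
}

// Any-rank tensor (3D included) is treated as a flat array of n elements.
void sigmoid(const float* in, float* out, const Shape& shape)
{
    size_t n = shape_size(shape);
    for (size_t i = 0; i < n; ++i)
    {
        out[i] = 1.0f / (1.0f + std::exp(-in[i]));
    }
}
```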
-
- 08 Mar, 2018 23 commits
-
fenglei.tian authored
-
fenglei.tian authored
-
Jayaram Bobba authored
Jbobba/batchnorm layouts
-
Jayaram Bobba authored
-
Jayaram Bobba authored
-
Louis Feng authored
-
Jayaram Bobba authored
-
Nick Korovaiko authored
* Remove broadcast from matmulbias.
* Fix comments.
* Working gemm-based broadcast.
* Fix clang warning.
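One common way to fold a bias broadcast into GEMM machinery is a rank-1 update with a ones vector; the sketch below makes that explicit with plain loops standing in for the BLAS call. It illustrates the general technique, not necessarily this commit's exact formulation:

```cpp
#include <cstddef>
#include <vector>

// C (m x n) += ones(m x 1) * bias(1 x n): a rank-1 product that a GEMM (or
// a cblas_sger-style rank-1 update) can perform, replacing a Broadcast op.
void add_bias_via_rank1(std::vector<float>& C, const std::vector<float>& bias,
                        size_t m, size_t n)
{
    for (size_t i = 0; i < m; ++i)
    {
        for (size_t j = 0; j < n; ++j)
        {
            C[i * n + j] += 1.0f * bias[j]; // the "ones" factor made explicit
        }
    }
}
```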
-
fenglei.tian authored
-
Fenglei Tian authored
Merge branch 'tfl/gpu_emitter_template' of github.com:NervanaSystems/private-ngraph-cpp into tfl/gpu_emitter_template
-
Fenglei Tian authored
-
fenglei.tian authored
-
Fenglei Tian authored
-
Fenglei Tian authored
-
Jayaram Bobba authored
-
Jayaram Bobba authored
-
Chris Sullivan authored
* Straightforward gpu.cos implementation following previous patterns prior to refactor.
* Generalized unary elementwise GPU op impl. New unary elementwise ops can be added to the type annotations in gpu_cuda_kernel_ops.hpp. Next step is to refactor the LLVM interface in gpu_emitters.hpp for similar generality.
* Added gpu_emitter.hpp:EmitUnaryElementwise. The function adds a CUDA kernel based on ngraph::op::op_type::description. This can service all unary elementwise ops run on the GPU.
* The following elementwise unary ops now use the EmitUnaryElementwise emitter: GPU.abs, GPU.acos, GPU.asin, GPU.atan, GPU.ceiling, GPU.cos, GPU.cosh, GPU.exp, GPU.floor, GPU.log, GPU.not, GPU.sign, GPU.sin, GPU.sinh, GPU.tan, GPU.tanh. Unary elementwise ops Sign and Not need extra consideration.
* tanh test changed to test::all_close for fp comparison (also done for tan in commit 65fa7c6de34c8277fe2a4801644f6bb64574f4ff).
* GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.
* Code format update.
* Changed CUDA builder interface names to unary/binary/arbitrary; added impl. note to gpu_cuda_kernel_ops; cleaned code format.
* Updated ngraph-cpp reference.
* Fixed incorrect GitHub conflict resolution.
* Added GPU emitter for op::Result. For now it simply copies the output tensor. All but 3 tests now pass; the remaining failing tests are GPU.dot_0_0, GPU.dot_matrix_2x0_0x2, and GPU.dot_2x0_0.
* Removed call to handle memory aliasing in gpu_external_function.
* Fixed GPU emitter bug that would return in the middle of a function.
* Merge pull request #609 from NervanaSystems/tfl/fix_return_bug: fix GPU emitter bug that would return in the middle of a function.
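As a rough illustration of what a single emitter of this kind can generate and hand to NVRTC, here is a hypothetical string builder for a unary elementwise kernel; the function name and the exact codegen shape are assumptions, not ngraph's output:

```cpp
#include <string>

// Build NVRTC-compilable source for a unary elementwise kernel; the device
// op (e.g. "cosf" for GPU.cos) is substituted per op type.
std::string emit_unary_elementwise(const std::string& name,
                                   const std::string& device_op)
{
    return "extern \"C\" __global__ void " + name +
           "(float* in, float* out, size_t n)\n"
           "{\n"
           "    size_t tid = blockIdx.x * blockDim.x + threadIdx.x;\n"
           "    if (tid < n)\n"
           "    {\n"
           "        out[tid] = " + device_op + "(in[tid]);\n"
           "    }\n"
           "}\n";
}
// e.g. emit_unary_elementwise("cuda_cos", "cosf") yields a GPU.cos kernel.
```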
-
fenglei.tian authored
-
fenglei.tian authored
-
Fenglei Tian authored
-
Fenglei authored
Fix constant bug on GPU
-
Robert Kimball authored
-
Chris Sullivan authored
* Added GPU emitter for op::Result. For now it simply copies the output tensor. All but 3 tests now pass; the remaining failing tests are GPU.dot_0_0, GPU.dot_matrix_2x0_0x2, and GPU.dot_2x0_0.
* Removed call to handle memory aliasing in gpu_external_function.
* Fixed GPU emitter bug that would return in the middle of a function.
* Merge pull request #609 from NervanaSystems/tfl/fix_return_bug: fix GPU emitter bug that would return in the middle of a function.
* GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.
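The copy-only Result handling described above amounts to a device-to-device copy. A minimal sketch, assuming an illustrative function name — the real emitter generates the call in code rather than executing it directly:

```cpp
#include <cuda_runtime.h>

// Copy the Result's input tensor to its output buffer on-device. With the
// aliasing path removed (per the commit), Result always materializes a copy.
void emit_result_copy(void* out, const void* in, size_t nbytes)
{
    cudaMemcpy(out, in, nbytes, cudaMemcpyDeviceToDevice);
}
```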
-
- 07 Mar, 2018 1 commit
-
Chris Sullivan authored
-