Commits · 9cca40731c869acefc6e5635405bd13174ded55b · submodule / ngraph

08 Mar, 2018 5 commits

Optimize Broadcast in MatMulBias (#604) · 9cca4073

Nick Korovaiko authored Mar 08, 2018

* remove broadcast from matmulbias

* fix comments

* working gemm-based broadcast

* fix clang warning

9cca4073
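
The log does not describe how the "gemm-based broadcast" works. One common way to fold a bias broadcast into a matrix multiply is to write the broadcast itself as a rank-1 product, `ones(m x 1) * bias(1 x n)`, and then accumulate `A * B` on top of it with `beta = 1`. The sketch below illustrates that idea under assumed conditions (row-major storage, bias broadcast across rows). The helpers `gemm`, `broadcast_bias_via_gemm`, and `matmul_bias` are hypothetical names for illustration, not nGraph code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive row-major GEMM: C = A (m x k) * B (k x n) + beta * C.
// Hypothetical stand-in for a BLAS sgemm call.
void gemm(const std::vector<float>& a, const std::vector<float>& b,
          std::vector<float>& c, std::size_t m, std::size_t k, std::size_t n,
          float beta)
{
    assert(a.size() == m * k && b.size() == k * n && c.size() == m * n);
    for (std::size_t i = 0; i < m; ++i)
    {
        for (std::size_t j = 0; j < n; ++j)
        {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p)
            {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc + beta * c[i * n + j];
        }
    }
}

// Broadcast a length-n bias across m rows as ones(m x 1) * bias(1 x n).
std::vector<float> broadcast_bias_via_gemm(const std::vector<float>& bias,
                                           std::size_t m)
{
    const std::size_t n = bias.size();
    std::vector<float> ones(m, 1.0f);
    std::vector<float> out(m * n, 0.0f);
    gemm(ones, bias, out, m, 1, n, 0.0f);
    return out;
}

// A * B + broadcast(bias): seed C with the broadcast bias, then
// accumulate the product into it with beta = 1.
std::vector<float> matmul_bias(const std::vector<float>& a,
                               const std::vector<float>& b,
                               const std::vector<float>& bias,
                               std::size_t m, std::size_t k, std::size_t n)
{
    assert(bias.size() == n);
    std::vector<float> c = broadcast_bias_via_gemm(bias, m);
    gemm(a, b, c, m, k, n, 1.0f);
    return c;
}
```

Written this way, no separate broadcast kernel is needed: both steps go through the same GEMM routine.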

Abstraction for GPU unary elementwise ops (#587) · 529362b5

Chris Sullivan authored Mar 08, 2018

* straightforward gpu.cos implementation following previous patterns prior to refactor

* Generalized unary elementwise gpu op impl. New unary elementwise ops can
be added to the type annotations in gpu_cuda_kernel_ops.hpp. Next step
is to refactor the llvm interface in gpu_emitters.hpp for similar generality.

* Added gpu_emitter.hpp:EmitUnaryElementwise.

Function adds cuda kernel based on ngraph::op::op_type::description.
This can service all unary elementwise ops run on the gpu.

* The following elementwise unary ops now use the EmitUnaryElementwise emitter:
  * GPU.abs
  * GPU.acos
  * GPU.asin
  * GPU.atan
  * GPU.ceiling
  * GPU.cos
  * GPU.cosh
  * GPU.exp
  * GPU.floor
  * GPU.log
  * GPU.not
  * GPU.sign
  * GPU.sin
  * GPU.sinh
  * GPU.tan
  * GPU.tanh
Unary elementwise ops Sign and Not need extra consideration.
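
The generalization described above (one emitter serving every unary elementwise op, with new ops registered through a type annotation) can be sketched roughly as follows. This is an illustration under assumptions, not the code from `gpu_cuda_kernel_ops.hpp`: the tag types `Abs`, `Cos`, `Tanh`, the trait `CudaOpMap`, and the function `emit_unary_elementwise_kernel` are all hypothetical names. Only the CUDA math function names (`fabsf`, `cosf`, `tanhf`) are real.

```cpp
#include <cassert>
#include <string>

// Hypothetical op tag types standing in for the real op classes.
struct Abs {};
struct Cos {};
struct Tanh {};

// Trait mapping an op type to the CUDA device math function it lowers to.
// Supporting another unary op only needs one more specialization.
template <typename Op>
struct CudaOpMap;

template <>
struct CudaOpMap<Abs>
{
    static const char* name() { return "fabsf"; }
};

template <>
struct CudaOpMap<Cos>
{
    static const char* name() { return "cosf"; }
};

template <>
struct CudaOpMap<Tanh>
{
    static const char* name() { return "tanhf"; }
};

// Build CUDA C source for a kernel that applies Op to every float element.
// The returned string would be handed to a runtime compiler by the backend.
template <typename Op>
std::string emit_unary_elementwise_kernel(const std::string& kernel_name)
{
    assert(!kernel_name.empty());
    const std::string fn = CudaOpMap<Op>::name();
    return "extern \"C\" __global__ void " + kernel_name +
           "(const float* in, float* out, size_t n)\n"
           "{\n"
           "    size_t tid = blockIdx.x * blockDim.x + threadIdx.x;\n"
           "    if (tid < n)\n"
           "    {\n"
           "        out[tid] = " + fn + "(in[tid]);\n"
           "    }\n"
           "}\n";
}
```

Ops such as Sign and Not have no single matching math function, which is probably why the message singles them out; the sketch does not cover them.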

* tanh test changed to test::all_close for fp comparison (also done for tan in commit 65fa7c6de34c8277fe2a4801644f6bb64574f4ff).
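
Results from a GPU math library rarely match a host reference bit for bit, so a tolerance-based comparison is the usual replacement for exact equality in such tests. The sketch below shows one common form of the check, `|a - b| <= atol + rtol * |b|`. The log does not give the signature or default tolerances of `test::all_close`, so the function name, parameters, and defaults here are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true when every pair of elements agrees within
// atol + rtol * |b[i]|. Tolerance defaults are illustrative assumptions.
bool all_close(const std::vector<float>& a, const std::vector<float>& b,
               float rtol = 1e-5f, float atol = 1e-8f)
{
    assert(rtol >= 0.0f && atol >= 0.0f);
    if (a.size() != b.size())
    {
        return false;
    }
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        if (std::abs(a[i] - b[i]) > atol + rtol * std::abs(b[i]))
        {
            return false;
        }
    }
    return true;
}
```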

* GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.

* code format update

* changed cuda builder interface names to unary/binary/arbitrary, added impl. note to gpu_cuda_kernel_ops, cleaned code format

* updated ngraph-cpp reference

* Fixing incorrect github conflict resolution.

* Added GPU emitter for op::Result.
For now it simply copies the output tensor.

All but 3 tests now pass. The remaining
failing tests are:
  * GPU.dot_0_0
  * GPU.dot_matrix_2x0_0x2
  * GPU.dot_2x0_0

* Removed call to handle memory aliasing in gpu_external_function.

* fix gpu emitter bug that will return in the middle of function

* Merge pull request #609 from NervanaSystems/tfl/fix_return_bug

fix gpu emitter bug that will return in the middle of function

* GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.

529362b5

Merge pull request #599 from NervanaSystems/tfl/gpu_fix_constant_bug · a02aab01

Fenglei authored Mar 08, 2018

Fix constant bug on GPU

a02aab01

Merge branch 'master' into tfl/gpu_fix_constant_bug · 61fa9d55

Robert Kimball authored Mar 08, 2018

61fa9d55

GPU op::Result implementation (#611) · 905cafd2

Chris Sullivan authored Mar 08, 2018

* Added GPU emitter for op::Result.
For now it simply copies the output tensor.

All but 3 tests now pass. The remaining
failing tests are:
  * GPU.dot_0_0
  * GPU.dot_matrix_2x0_0x2
  * GPU.dot_2x0_0

* Removed call to handle memory aliasing in gpu_external_function.

* fix gpu emitter bug that will return in the middle of function

* Merge pull request #609 from NervanaSystems/tfl/fix_return_bug

fix gpu emitter bug that will return in the middle of function

* GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.

905cafd2

07 Mar, 2018 5 commits

Merge branch 'master' into tfl/gpu_fix_constant_bug · eee71968
Chris Sullivan authored Mar 07, 2018

eee71968

bn fprop mkldnn optimized implementation (#581) · 9db548c6

Pruthvi authored Mar 07, 2018

* - Added support for optimized bn mkldnn implementation in cpu emitter
- modified bn unit_test to support new implementation
- added layout assignment for bn op
- Style Fix

(cherry picked from commit 7747a40806d62c126059d5c873adcd2e61a0adb0)

* modified value initialization in cpu_fusion to be float explicit

(cherry picked from commit 03499d380073d0197ab8cbd154eb03f63b042a48)

* fix compilation issue

* Addressed PR comments
- added exception if gamma and beta layout is not equal to memory::format::x
- throw exception if bn Op is not mkldnn op

* fix compilation issue

* added support to handle multiple outputs in fprop bn fusion

* - Removed layout pass for bn
- fixed autodiff bug in bn
- added "Add" for the dispatcher in cpu-layout pass

* style fix

* Fix bprop batchnorm test with get_output_elements

* Style fix

9db548c6

Remove duplicate and unused declarations (#607) · f2e6b48b

Scott Cyphers authored Mar 07, 2018

f2e6b48b

More detail around use of j in preparing to build nGraph with make (#605) · eb35534e

L.S. Cook authored Mar 07, 2018

eb35534e

CPU: No-op elimination pass (#603) · 4b41aca2

Jai Menon authored Mar 07, 2018

4b41aca2

06 Mar, 2018 13 commits
- Zero-padded convolution fusion (#596) · ad58cb29
  Jai Menon authored Mar 06, 2018
```
* CPU: Padded Convolution fusion

* CPU: Non-reshaped fusion pattern for zero-padded convolutions

* CPU: Refactor consistency checks

* CPU: Rewrite hoisted reshape expression and add tests

* CPU: Merge leftovers
```
  ad58cb29
- resolved remaining reference to private-ngraph-cpp repo name in INSTALL file (#595) · 6f011bc2
  DawnStone authored Mar 06, 2018
  
  6f011bc2
- Generalize MatMulBias (2nd attempt) (#597) · 55d11bb4
  Nick Korovaiko authored Mar 06, 2018
```
* generalize matmulbias

fixes

disable logging

* unit-test failures
```
  55d11bb4
- op::Result ver3 (#594) · 5c7e9844
  Nick Korovaiko authored Mar 06, 2018
```
* the first stab at op::Result

format fixes

disabling logging

op::Result, 2nd attempt

purge stale code

disable logging

fix copyright header

* initial cleanup

* cleanup2

* remove dead code

* result.cpp, fix comments

* fix comment
```
  5c7e9844
- Merge branch 'master' into tfl/gpu_fix_constant_bug · 181be216
  Fenglei authored Mar 06, 2018
  
  181be216
- Patch json library to support older versions of gcc (#598) · 456db623
  Robert Kimball authored Mar 06, 2018
```
* patch working

* wip

* fix patcher

* remove debug message

* cleanup

* fix typo
```
  456db623
- indent · 843e83a2
  fenglei.tian authored Mar 06, 2018
  
  843e83a2
- Merge branch 'tfl/gpu_fix_constant_bug' of github.com:NervanaSystems/ngraph-cpp… · ee463b66
  fenglei.tian authored Mar 06, 2018
```
Merge branch 'tfl/gpu_fix_constant_bug' of github.com:NervanaSystems/ngraph-cpp into tfl/gpu_fix_constant_bug
```
  ee463b66
- clang format · 24b72581
  fenglei.tian authored Mar 06, 2018
  
  24b72581
- Merge branch 'master' into tfl/gpu_fix_constant_bug · 20e2a098
  Fenglei authored Mar 06, 2018
  
  20e2a098
- gpu broadcast (#576) · 41268068
  Fenglei authored Mar 06, 2018
```
* add gpu broadcast

* add broadcast kernel

* fix bug for cumemdopyDtD usage in gpu_external_function.cpp
```
  41268068
- fix constant bug · 8d0768c5
  fenglei.tian authored Mar 06, 2018
  
  8d0768c5
- Softmax added to ToC, edit install for better rst syntax (#593) · ae50019e
  L.S. Cook authored Mar 06, 2018
  
  ae50019e
05 Mar, 2018 9 commits
- limit default parallel processes for contrib/docker/Makefile (#582) · dd3de248
  DawnStone authored Mar 05, 2018
```
* limited parallel make processes to make -j 16 by default for contrib/docker/Makefile

* set the default to make -j 22 for parallel make in contrib/docker/Makefile
```
  dd3de248
- testing hard-coded path for libcuda.so (#590) · c7acee84
  DawnStone authored Mar 05, 2018
  
  c7acee84
- Repo name change and other miscellaneous edits (#588) · 11344652
  L.S. Cook authored Mar 05, 2018
  
  11344652
- expand when mkldnn relu can be used, add faster default kernels (#592) · ce8fef72
  Matthew Brookhart authored Mar 05, 2018
  
  ce8fef72
- rework autodiff of tanh for stability on LC (#591) · ca06e6c3
  Matthew Brookhart authored Mar 05, 2018
  
  ca06e6c3
- modified references to the repo to reflect ngraph-cpp (#589) · 22347363
  DawnStone authored Mar 05, 2018
  
  22347363
- Missing include (#586) · ef8b5399
  Scott Cyphers authored Mar 05, 2018
  
  ef8b5399
- AllReduce reduced the syntax page title a bit too much (#585) · acde10c6
  L.S. Cook authored Mar 05, 2018
  
  acde10c6
- Include cleanup (#583) · cec89708
  Robert Kimball authored Mar 05, 2018
```
* cleanup

* cleanup

* fix all headers to be standalone as far as includes go

* include cleanup

* cleanup includes

* cleanup

* include tester

* wip

* cleanup

* cleanup

* cleanup
```
  cec89708
04 Mar, 2018 1 commit
- Missing include (#584) · ca54c986
  Scott Cyphers authored Mar 04, 2018
  
  ca54c986
03 Mar, 2018 2 commits
- Indie patch 1 (#579) · 0e630291
  L.S. Cook authored Mar 03, 2018
```
* Delete index.xsd

* Remove files added before gitignore was updated.
```
  0e630291
- Edit index page for new howto section (#578) · 536342f1
  L.S. Cook authored Mar 03, 2018
  
  536342f1
02 Mar, 2018 5 commits
- add softmax op (#542) · 0c43f175
  adstraw authored Mar 02, 2018
```
add softmax op and documentation
```
  0c43f175
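
For reference, softmax maps a vector `x` to `exp(x_i) / sum_j exp(x_j)`. A minimal sketch over a flat 1-D vector follows; it subtracts the maximum before exponentiating, a standard trick to avoid overflow. The function name and the 1-D restriction are assumptions for illustration, and this is not the op added in the commit.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax over a non-empty 1-D vector.
std::vector<float> softmax(const std::vector<float>& x)
{
    assert(!x.empty());
    // Shifting by the maximum leaves the result unchanged but keeps
    // every exponent <= 0, so std::exp cannot overflow.
    const float max_x = *std::max_element(x.begin(), x.end());
    std::vector<float> y(x.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i)
    {
        y[i] = std::exp(x[i] - max_x);
        sum += y[i];
    }
    for (float& v : y)
    {
        v /= sum;
    }
    return y;
}
```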
- bug fix for cuda_memcpyDtD interface change (#580) · 498fbefd
  Fenglei authored Mar 02, 2018
  
  498fbefd
- Batchnorm Bprop v2 (#567) · e4b90a9c
  Nick Korovaiko authored Mar 02, 2018
```
* one output

multiple outputs

initial clean-up

* test clean-up

current version

test pass

* clean up

* fix format

* add dbeta,dgamma asserts

* revert some files

* 0644 on node.cpp

* 0644 on mkldnn_utils.cpp

* 0644 on more files

* add support for serialization + test case

* fix merge errors
```
  e4b90a9c
- Add aliased Constants to aliased_output test. (#555) · 355bff8f
  Sang Ik Lee authored Mar 02, 2018
```
* Add aliased Constants to aliased_output test.

* add support for const as outputs
```
  355bff8f
- Merge pull request #556 from NervanaSystems/cyphers/dochow · d991861d
  L.S. Cook authored Mar 02, 2018
```
Cyphers/dochow
```
  d991861d