Commits · 8fc9dd698fbcd43e7233bc5975eff661066028f4 · submodule / ngraph

14 Mar, 2018 3 commits

Added op::Relu and op::Not to GPU transformer and enabled corresponding tests (#641) · e7abc0f3
Chris Sullivan authored 6 years ago
```
* Added op::Relu and op::Not and enabled corresponding tests.

* Removed softmax for now.
```
e7abc0f3
gpu add onehot op (#638) · a86a9050
Fenglei authored 6 years ago
```
* add onehot op

* refactor broadcast and onehot op
```
a86a9050

Yet another serialization option (#619) · 28602f31

Robert Kimball authored 6 years ago

* Add cpio file read/write class and unit tests

add reserializer

Add unit test for serialize constants to cpio file. Fix bug in serializer if function has no parameters.

28602f31

13 Mar, 2018 2 commits

GPU elementwise emitters now respect input and output tensor types. (#633) · d3ea93e2

Chris Sullivan authored 6 years ago

* GPU elementwise emitters now respect input and output tensor types.
This enables the use of binary comparison ops and op::Convert.

* Removed comments.

* All kernels now have a type signature,
even if the I/O tensors are of equivalent type, so that
kernels for tensors of specific types are unique.

NGMX-391 #close

d3ea93e2
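
The idea of giving every kernel a type signature can be illustrated with a small sketch. The helper name `make_kernel_name` and the `cuda_` prefix are hypothetical and are not taken from the ngraph sources; the point is only that encoding the input and output element types in the name keeps kernels for different tensor types distinct.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not the ngraph API): builds a kernel name that
// encodes the op plus the element type of every input and of the output,
// so that e.g. a float->char comparison kernel and a float->float kernel
// never collide in a name-keyed kernel cache.
std::string make_kernel_name(const std::string& op,
                             const std::vector<std::string>& input_types,
                             const std::string& output_type)
{
    std::string name = "cuda_" + op;
    for (const std::string& t : input_types)
    {
        name += "_" + t;
    }
    name += "_" + output_type;
    return name;
}
```

With this scheme a binary comparison such as Greater on two float tensors with a char output gets a different name from the same op on double tensors, even though the op itself is identical.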

Pruthvi/sigmoid bprop (#630) · 490e4e63

Pruthvi authored 6 years ago

* - Added pattern matcher for bprop sigmoid
- mkldnn emitter code for sigmoid bprop
- Fusion pass unit test for sigmoid bprop
- style fix

* Added test case for bprop sigmoid

* fixed sigmoid bprop test case failure

* fixed bprop unit test values for sigmoid

* style fix

* fix typo

* Addressed PR comments
- added layout assignment pass to ensure delta and input have same layout for SigmoidBprop

490e4e63
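
For readers checking the sigmoid bprop unit test values, the underlying math is the standard sigmoid derivative scaled by the incoming delta. The function below is a plain reference sketch with a hypothetical name; it is not the MKLDNN emitter code from this commit and assumes `input` and `delta` have the same length.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Reference sketch (hypothetical name): gradient of sigmoid with respect to
// its input, scaled elementwise by the incoming delta.
// d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
std::vector<float> sigmoid_bprop_reference(const std::vector<float>& input,
                                           const std::vector<float>& delta)
{
    std::vector<float> out(input.size());
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        const float s = 1.0f / (1.0f + std::exp(-input[i]));
        out[i] = delta[i] * s * (1.0f - s);
    }
    return out;
}
```

At an input of 0 the sigmoid is 0.5, so a delta of 1 yields a gradient of 0.25, which is a convenient value to sanity-check against.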

11 Mar, 2018 3 commits
- Enhancements to nbench (#622) · 33140eff
  Robert Kimball authored 6 years ago
```
* fix detailed timing flag

* more detailed info
```
  33140eff
- update skip macro to output only if test is disabled (#626) · c7d6d7f1
  Robert Kimball authored 6 years ago
  
  c7d6d7f1
- Fix to matmul bias column broadcast and modified unit tests (#627) · 2f8b19a8
  Jayaram Bobba authored 6 years ago
  
  2f8b19a8
09 Mar, 2018 5 commits

Adding support for GPU elementwise ops for arbitrarily many inputs (#618) · 89da71d3

Chris Sullivan authored 6 years ago

* Refactored unary elementwise ops into a single interface
that is adaptable to elementwise ops with arbitrary number of inputs.

* Renamed EmitUnaryElementwise -> EmitElementwise.
Implemented first binary elementwise op (Power).

* Refactored some of the boilerplate code for emitting cuda kernels to nvrtc
out of the emit functions and into the CudaFunctionPool static singleton.
CodeWriter now saves cuda kernels to ./gpu_codegen.

* Added ops Divide, Subtract & Sign to the GPU transformer.
Subtract and Sign both use custom device helper functions which
have math kernels defined for the op in gpu_cuda_kernel_ops.hpp,
and which are built by a new get_device_helper function.

89da71d3
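
A single emitter that handles elementwise ops with any number of inputs can be sketched as a source generator that loops over the input list. Everything here is an assumption made for illustration: the function name `emit_elementwise_kernel`, its parameters, and the generated kernel layout are not copied from the ngraph GPU transformer.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch: generates CUDA source text for an elementwise kernel
// with one pointer parameter per input. `device_fn` is the name of the device
// function applied per element (e.g. "powf" for a binary Power op).
std::string emit_elementwise_kernel(const std::string& name,
                                    const std::string& device_fn,
                                    const std::vector<std::string>& input_types,
                                    const std::string& output_type)
{
    std::ostringstream src;
    src << "extern \"C\" __global__ void " << name << "(";
    for (std::size_t i = 0; i < input_types.size(); ++i)
    {
        src << input_types[i] << "* in" << i << ", ";
    }
    src << output_type << "* out, size_t n)\n{\n";
    src << "    size_t tid = blockIdx.x * blockDim.x + threadIdx.x;\n";
    src << "    if (tid < n)\n    {\n";
    src << "        out[tid] = " << device_fn << "(";
    for (std::size_t i = 0; i < input_types.size(); ++i)
    {
        src << (i ? ", " : "") << "in" << i << "[tid]";
    }
    src << ");\n    }\n}\n";
    return src.str();
}
```

The returned string could then be handed to a runtime compiler such as nvrtc; a unary op is simply the case where `input_types` has one entry.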

clang format · 362bb996
Louis Feng authored 6 years ago

362bb996
fix bug for 2d2d2 dot, enable some bprop dot tests · 9fd64b6f
fenglei.tian authored 6 years ago

9fd64b6f
type_prop tests for batchnorm bprop (#601) · b3d2ff59
Nick Korovaiko authored 6 years ago

b3d2ff59

Pruthvi/sigmoid (#614) · 5885c09a

Pruthvi authored 6 years ago

* - Added sigmoid fusion pass
- added mkldnn emitter code for sigmoid

* - corrected sigmoid expected values
- add layout assignment for sigmoid op

* - added asserts in cpu fusion for sigmoid
- style fix

* remove debug prints

* NGMX-371 #comment addressed PR comments: i) added sigmoid unit test case with 3D input ii) added support in cpu_emitter for sigmoid to handle all input shapes

* NGMX-371 #comment use shape_size() to calculate the 1d input size

5885c09a
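
Treating an input of any rank as a flat 1D buffer only requires the total element count, which is the product of the shape's dimensions. The function below is a standalone sketch with a hypothetical name, written to show that computation; it is not the ngraph `shape_size()` implementation itself.

```cpp
#include <cstddef>
#include <vector>

// Standalone sketch (hypothetical name): total number of elements in a
// tensor of the given shape. An empty shape (a scalar) has one element.
std::size_t shape_size_sketch(const std::vector<std::size_t>& shape)
{
    std::size_t n = 1;
    for (std::size_t d : shape)
    {
        n *= d;
    }
    return n;
}
```

A 3D input of shape {2, 3, 4} is therefore handled as a 1D buffer of 24 elements.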

08 Mar, 2018 5 commits

enable supported backward tests · dd5c77e0
fenglei.tian authored 6 years ago

dd5c77e0
add sign op, fix constant bug · dd5a6769
fenglei.tian authored 6 years ago

dd5a6769

Optimize Broadcast in MatMulBias (#604) · 9cca4073

Nick Korovaiko authored 6 years ago

* remove broadcast from matmulbias

* fix comments

* working gemm-based broadcast

* fix clang warning

9cca4073
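
One common way to express a bias broadcast as a GEMM is to multiply a column of ones by the bias row, then accumulate the main product on top. The sketch below uses a naive row-major GEMM and hypothetical function names; it shows the general technique and is not claimed to match the code in this commit.

```cpp
#include <cstddef>
#include <vector>

// Naive row-major GEMM: c = a (m x k) * b (k x n) + beta * c.
void gemm(const std::vector<float>& a, const std::vector<float>& b,
          std::vector<float>& c, std::size_t m, std::size_t k, std::size_t n,
          float beta)
{
    for (std::size_t i = 0; i < m; ++i)
    {
        for (std::size_t j = 0; j < n; ++j)
        {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p)
            {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc + beta * c[i * n + j];
        }
    }
}

// Hypothetical sketch: a * b + bias, where bias (length n) is broadcast to
// every row by the product ones (m x 1) * bias (1 x n) instead of an
// explicit Broadcast op.
std::vector<float> matmul_bias(const std::vector<float>& a,
                               const std::vector<float>& b,
                               const std::vector<float>& bias,
                               std::size_t m, std::size_t k, std::size_t n)
{
    std::vector<float> c(m * n, 0.0f);
    std::vector<float> ones(m, 1.0f);
    gemm(ones, bias, c, m, 1, n, 0.0f); // c = ones * bias (the broadcast)
    gemm(a, b, c, m, k, n, 1.0f);       // c = a * b + c
    return c;
}
```

The design trade-off is that the broadcast is folded into a second GEMM call with accumulation, so no separate broadcast tensor has to be materialized by another op.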

Abstraction for GPU unary elementwise ops (#587) · 529362b5

Chris Sullivan authored 6 years ago

* straightforward gpu.cos implementation following previous patterns prior to refactor

* Generalized unary elementwise gpu op impl. New unary elementwise ops can
be added to the type annotations in gpu_cuda_kernel_ops.hpp. Next step
is to refactor the llvm interface in gpu_emitters.hpp for similar generality.

* Added gpu_emitter.hpp:EmitUnaryElementwise.

Function adds cuda kernel based on ngraph::op::op_type::description.
This can service all unary elementwise ops run on the gpu.

* The following elementwise unary ops now use the EmitUnaryElementwise emitter:
* GPU.abs
* GPU.acos
* GPU.asin
* GPU.atan
* GPU.ceiling
* GPU.cos
* GPU.cosh
* GPU.exp
* GPU.floor
* GPU.log
* GPU.not
* GPU.sign
* GPU.sin
* GPU.sinh
* GPU.tan
* GPU.tanh
Unary elementwise ops Sign and Not need extra consideration.

* tanh test changed to test::all_close for fp comparison (also done for tan in commit 65fa7c6de34c8277fe2a4801644f6bb64574f4ff).

* GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.

* code format update

* changed cuda builder interface names to unary/binary/arbitrary, added impl. note to gpu_cuda_kernel_ops, cleaned code format

* updated ngraph-cpp reference

* Fixing incorrect github conflict resolution.

* Added GPU emitter for op::Result.
For now it simply copies the output tensor.

All but 3 tests now pass. The remaining
failing tests are:
* GPU.dot_0_0
* GPU.dot_matrix_2x0_0x2
* GPU.dot_2x0_0

* Removed call to handle memory aliasing in gpu_external_function.

* fix gpu emitter bug that will return in the middle of function

* Merge pull request #609 from NervanaSystems/tfl/fix_return_bug

fix gpu emitter bug that will return in the middle of function

529362b5

GPU op::Result implementation (#611) · 905cafd2

Chris Sullivan authored 6 years ago

* Added GPU emitter for op::Result.
For now it simply copies the output tensor.

All but 3 tests now pass. The remaining
failing tests are:
* GPU.dot_0_0
* GPU.dot_matrix_2x0_0x2
* GPU.dot_2x0_0

* Removed call to handle memory aliasing in gpu_external_function.

* fix gpu emitter bug that will return in the middle of function

* Merge pull request #609 from NervanaSystems/tfl/fix_return_bug

fix gpu emitter bug that will return in the middle of function

* GPU backend skips added for recent softmax test and updated aliased output test that uses op::Constant.

905cafd2

07 Mar, 2018 7 commits

bn fprop mkldnn optimized implementation (#581) · 9db548c6

Pruthvi authored 6 years ago

* - Added support for optimized bn mkldnn implementation in cpu emitter
- modified bn unit_test to support new implementation
- added layout assignment for bn op
- Style Fix

(cherry picked from commit 7747a40806d62c126059d5c873adcd2e61a0adb0)

* modified value initialization in cpu_fusion to be float explicit

(cherry picked from commit 03499d380073d0197ab8cbd154eb03f63b042a48)

* fix compilation issue

* Addressed PR comments
- added exception if gamma and beta layout is not equal to memory::format::x
- throw exception if bn Op is not mkldnn op

* fix compilation issue

* added support to handle multiple o/ps in fprop bn fusion

* - Removed layout pass for bn
- fixed autodiff bug in bn
- added "Add" for the dispatcher in cpu-layout pass

* style fix

* Fix bprop batchnorm test with get_output_elements

* Style fix

9db548c6

Remove duplicate and unused declarations (#607) · f2e6b48b
Scott Cyphers authored 6 years ago

f2e6b48b
clang format. · d37b30ad
Louis Feng authored 6 years ago

d37b30ad
clean up. · 338b9622
Louis Feng authored 6 years ago

338b9622
simplify convbias test. · 812a699a
Louis Feng authored 6 years ago

812a699a
refactor and clean up. · 8b7f042d
Louis Feng authored 6 years ago

8b7f042d
more tests. · 97c2ce20
Louis Feng authored 6 years ago

97c2ce20

06 Mar, 2018 5 commits

Zero-padded convolution fusion (#596) · ad58cb29

Jai Menon authored 6 years ago

* CPU: Padded Convolution fusion

* CPU: Non-reshaped fusion pattern for zero-padded convolutions

* CPU: Refactor consistency checks

* CPU: Rewrite hoisted reshape expression and add tests

* CPU: Merge leftovers

ad58cb29

Generalize MatMulBias (2nd attempt) (#597) · 55d11bb4
Nick Korovaiko authored 6 years ago
```
* generalize matmulbias

fixes

disable logging

* unit-test failures
```
55d11bb4

op::Result ver3 (#594) · 5c7e9844

Nick Korovaiko authored 6 years ago

* the first stab at op::Result

format fixes

disabling logging

op::Result, 2nd attempt

purge stale code

disable logging

fix copyright header

* initial cleanup

* cleanup2

* remove dead code

* result.cpp, fix comments

* fix comment

5c7e9844

test wip. · 81fe53cd
Louis Feng authored 6 years ago

81fe53cd

gpu broadcast (#576) · 41268068

Fenglei authored 6 years ago

* add gpu broadcast

* add broadcast kernel

* fix bug for cuMemcpyDtoD usage in gpu_external_function.cpp

41268068

05 Mar, 2018 1 commit

Include cleanup (#583) · cec89708

Robert Kimball authored 6 years ago

* cleanup

* cleanup

* fix all headers to be standalone as far as includes go

* include cleanup

* cleanup includes

* cleanup

* include tester

* wip

* cleanup

* cleanup

* cleanup

cec89708

02 Mar, 2018 6 commits

add softmax op (#542) · 0c43f175
adstraw authored 6 years ago
```
add softmax op and documentation
```
0c43f175

Batchnorm Bprop v2 (#567) · e4b90a9c

Nick Korovaiko authored 6 years ago

* one output

multiple outputs

initial clean-up

* test clean-up

current version

test pass

* clean up

* fix format

* add dbeta,dgamma asserts

* revert some files

* 0644 on node.cpp

* 0644 on mkldnn_utils.cpp

* 0644 on more files

* add support for serialization + test case

* fix merge errors

e4b90a9c

Add aliased Constants to aliased_output test. (#555) · 355bff8f
Sang Ik Lee authored 6 years ago
```
* Add aliased Constants to aliased_output test.

* add support for const as outputs
```
355bff8f

Correctly handle Function/Node names for codegen (#570) · 8e5c9404

Robert Kimball authored 6 years ago

* named uniquely

* rename get_name methods

* add unit test for function name

* add node name unit test

* make unique name const

8e5c9404

fix warnings (#571) · 5d973a6e
Robert Kimball authored 6 years ago

5d973a6e
add gpu broadcast · e1b2f54c
fenglei.tian authored 6 years ago

e1b2f54c

01 Mar, 2018 2 commits
- Memory Leak with External Function (#568) · 3da0e440
  Robert Kimball authored 6 years ago
```
* add unit test for resource deallocation

fix leak

* cleanup
```
  3da0e440
- fix bug and enable some reshape tests (#565) · f4ff1c3b
  Fenglei authored 6 years ago
```
* fix bug and enable some tests

* eliminate duplicated code, change some parameter names
```
  f4ff1c3b
28 Feb, 2018 1 commit
- make serializer header not depend on json.hpp (#562) · ebdca8d8
  Robert Kimball authored 6 years ago
```
This fixes the previously broken compilation of ngraph-mxnet which directly utilized serializer.hpp
```
  ebdca8d8