Commits · a855a3ad7a6deaf40947440ac83bfe060a2fb7d8 · submodule / ngraph · GitLab

22 Mar, 2018 7 commits

Make MatMulBias aware of addition commutativity (#713) · a855a3ad
Nick Korovaiko authored 6 years ago
```
* make matmulbias callback aware that addition is commutative
```
a855a3ad
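
The commutativity fix above can be illustrated with a minimal sketch. It assumes a toy expression tree, not the nGraph classes: `Node`, `make_node`, `MatMulBiasMatch`, and `match_matmul_bias` are hypothetical names introduced only for this example. The idea is that a matcher looking for `Add(Dot, bias)` should also accept `Add(bias, Dot)`, because addition is commutative.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature expression node; this is NOT the nGraph Node class.
struct Node
{
    std::string kind;                        // e.g. "Add", "Dot", "Broadcast", "Parameter"
    std::vector<std::shared_ptr<Node>> args; // operands, in order
};

std::shared_ptr<Node> make_node(std::string kind,
                                std::vector<std::shared_ptr<Node>> args = {})
{
    for (const auto& arg : args)
    {
        assert(arg != nullptr && "operands must be non-null");
    }
    auto node = std::make_shared<Node>();
    node->kind = std::move(kind);
    node->args = std::move(args);
    return node;
}

// Result of a successful match: the matrix product and the bias operand.
struct MatMulBiasMatch
{
    std::shared_ptr<Node> dot;
    std::shared_ptr<Node> bias;
};

// Matches Add(Dot, bias) as well as Add(bias, Dot).
// Returns true and fills `out` on success.
bool match_matmul_bias(const std::shared_ptr<Node>& root, MatMulBiasMatch& out)
{
    if (!root || root->kind != "Add" || root->args.size() != 2)
    {
        return false;
    }
    // Addition is commutative, so try the Dot in either operand position.
    for (std::size_t dot_index = 0; dot_index < 2; ++dot_index)
    {
        const std::shared_ptr<Node>& candidate = root->args[dot_index];
        const std::shared_ptr<Node>& other = root->args[1 - dot_index];
        if (candidate->kind == "Dot")
        {
            out.dot = candidate;
            out.bias = other;
            return true;
        }
    }
    return false;
}
```

A matcher that only checked the first operand would miss graphs where the bias happens to be the left-hand argument; looping over both positions keeps the callback independent of operand order.
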
Remove XLA compatibility figleaves (will be moved to ngraph-tensorflow-bridge) (#704) · d73f92c4
Adam Procter authored 6 years ago

d73f92c4
make sure deserializer doesn't add op::Result twice (#714) · 0cad670b
Nick Korovaiko authored 6 years ago
```
* make sure deserializer doesn't add op::Result twice
```
0cad670b

Pruthvi/bn inference (#670) · 5394ad2d

Pruthvi authored 6 years ago

* Added new ctor for bn which supports Inference
- added mkldnn emitter code for bn inference
* Added test case for bn inference
- added support for layout propagation for bn inference
* added sanity checks for gamma, beta, mean, variance shape in bn
* added serializer support for bn inference

5394ad2d

Dot op that can handle more than 2D on GPU (#645) · 6ebc3c8c
Fenglei authored 6 years ago
```
* general dot for gpu
```
6ebc3c8c

Add reduce sum to the GPU transformer (op::Sum) (#671) · bae77590

Chris Sullivan authored 6 years ago

* Current cuDNN implementations use only a single dimension for the ngraph tensor data (width).
  In this case the tensor format should be set to CUDNN_TENSOR_NCHW so that adjacent
  memory accesses are coalesced (stride=1 for width).

* Added some kernel emitter helpers that are reused often.
* Renamed EmitElementwise -> emit_elementwise to match emit<T>.
* op::Sum now handles trivial case of dim(input_tensor) = dim(output_tensor)
  by performing a memcpy as no axes are reduced.

* Added general case for Nd descriptors, which is used when the tensor
  has more than 4 dimensions. Currently a naive reduce is performed;
  in the future a coordinate transformation could be performed to
  improve the memory layout for the reduction.

* Switched to codegen::CodeWriter::block_begin/end.
  CodeWriter::block_begin/end appears to be used infrequently in emitters (in the cpu and gpu transformers)
  because a block comment is often desired. To support that, I added prefix/suffix default parameters
  to CodeWriter::block_begin/end.

bae77590
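
The prefix/suffix idea from the last bullet of that commit can be sketched as follows. This is a minimal illustration under stated assumptions, not the nGraph `codegen::CodeWriter` implementation: `MiniCodeWriter`, `line`, and `str` are hypothetical, and the exact placement of the prefix and suffix (directly after the opening and closing brace) is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-in for a code-emitting helper; NOT the nGraph CodeWriter API.
class MiniCodeWriter
{
public:
    // Opens a brace-delimited block. `prefix` is appended after "{" on the
    // same line, for example a comment naming the block. Defaults to nothing.
    void block_begin(const std::string& prefix = "")
    {
        line("{" + prefix);
        ++m_indent;
    }

    // Closes the innermost block. `suffix` is appended after "}".
    void block_end(const std::string& suffix = "")
    {
        assert(m_indent > 0 && "block_end without matching block_begin");
        --m_indent;
        line("}" + suffix);
    }

    // Writes one line at the current indentation (4 spaces per level).
    void line(const std::string& text)
    {
        m_out << std::string(m_indent * 4, ' ') << text << "\n";
    }

    std::string str() const { return m_out.str(); }

private:
    std::ostringstream m_out;
    std::size_t m_indent = 0;
};
```

Because both parameters default to an empty string, existing call sites that pass no arguments keep emitting plain braces, while emitters that want a block comment can pass it in one call.
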

Add op::ReluBackprop to GPU transformer (#712) · 72f4d661
Chris Sullivan authored 6 years ago
```
* Added backprop op for relu and enabled tests.
```
72f4d661

21 Mar, 2018 4 commits
- Eliminate unnecessary Convert ops when input element type is the same as output element type (#709) · ff6c525a
  Jayaram Bobba authored 6 years ago
  
  ff6c525a
- CallFrame order (#702) · 12876342
  Yixing Lao authored 6 years ago
```
Adjust CallFrame argument order to match Function
```
  12876342
- Directory rename (#701) · 6b0b64b4
  Robert Kimball authored 6 years ago
```
* rename directories to be consistent
* rename reference namespace to match directory
```
  6b0b64b4
- CPU: Eliminate trivial sum reductions (#703) · 47ca008a
  Jaikrishnan Menon authored 6 years ago
  
  47ca008a
20 Mar, 2018 6 commits

rename to nnp (#688) · bb831262

Sandeep authored 6 years ago

* topological-sort based node clustering

* cmake builds

* Argon manager renamed to NNP along with placement

* nnp dir cmake changes

* tests pass

* more renames

* some more renames

* resolve redefinition

* revert to ARGON_API

* more PR comments and remove nnp-fusion tests as redundant

* update path

* fix format

bb831262

Add batch_norm.hpp to ngraph.hpp (#693) · 1447b578
Adam Procter authored 6 years ago

1447b578

CPU nan/inf tensor validation (#553) · 2d66e349

Nick Korovaiko authored 6 years ago

* global tracing

* fix compiler errors

* nan/inf validation

* 0644 on mkldnn_utils.cpp

* address Bob's feedback

* 0755 -> 0644

* remove format changes to python dir

2d66e349
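
A NaN/Inf check over a tensor's data can be sketched roughly as below. The commit message does not describe how the validation is implemented, so everything here is an assumption for illustration: `Finding`, `ValidationResult`, and `validate_tensor` are hypothetical names, and the sketch simply scans a flat `float` buffer and reports the first non-finite element.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical result types for this sketch; not part of nGraph.
enum class Finding
{
    AllFinite,
    FoundNaN,
    FoundInf
};

struct ValidationResult
{
    Finding finding;
    std::size_t index; // position of the first offending element, or `count` if none
};

// Scans `count` floats starting at `data` and reports the first NaN or Inf.
ValidationResult validate_tensor(const float* data, std::size_t count)
{
    assert((data != nullptr || count == 0) && "null buffer with non-zero count");
    for (std::size_t i = 0; i < count; ++i)
    {
        if (std::isnan(data[i]))
        {
            return {Finding::FoundNaN, i};
        }
        if (std::isinf(data[i]))
        {
            return {Finding::FoundInf, i};
        }
    }
    return {Finding::AllFinite, count};
}
```

Reporting the index of the first bad element, rather than only a boolean, makes it easier to trace which op produced the value when the check runs after each kernel.
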

Add `--visualize` to nbench (#679) · f076fea9
Nick Korovaiko authored 6 years ago
```
* add visualize option to nbench

* check for dot, amend help msg
```
f076fea9
Fix a segfault while printing shapes for multi-output ops in VisualizeTree (#677) · 2e1823fe
Nick Korovaiko authored 6 years ago
```
* fix a segfault while printing shapes for multi-output ops
```
2e1823fe
update GraphRewrite API (#686) · fc9018dc
Nick Korovaiko authored 6 years ago

fc9018dc

19 Mar, 2018 3 commits
- add edge labels (#678) · db6419de
  Nick Korovaiko authored 6 years ago
  
  db6419de
- topological-sort based node clustering (#615) · 7177a0b4
  Yixing Lao authored 6 years ago
  
  7177a0b4
- apply correct copyright header (#685) · 32447416
  Robert Kimball authored 6 years ago
  
  32447416
18 Mar, 2018 1 commit

[v0.1.0] Multi-output fprop_cache tentative fix (#657) · 995671ae

Nick Korovaiko authored 6 years ago

Contains multiple fixes to GetOutputElement, BatchNorm, autodiff, fprop_cache to integrate multi-output batchnorm and fprop_cache

995671ae

17 Mar, 2018 1 commit
- Hack to help MKLDNN avoid ref convolution (#669) · 2c2de707
  Jayaram Bobba authored 6 years ago
  
  2c2de707
16 Mar, 2018 2 commits
- CPU: #include what we actually use (#659) · 8d52111c
  Jai Menon authored 6 years ago
  
  8d52111c
- Optimized Pad (#658) · e14c0565
  Jai Menon authored 6 years ago
```
* CPU: Eigen-based Pad kernel

* CPU: Create a global Eigen thread pool and use it for padding

* Formatting fixes
```
  e14c0565
15 Mar, 2018 5 commits
- set the EIGEN_MPL2_ONLY flag and add tests to make sure it is set for compile and codegen (#655) · aa3815c5
  Robert Kimball authored 6 years ago
  
  aa3815c5
- CPU: Remove commented out code (#651) · b6f2f7f9
  Jai Menon authored 6 years ago
  
  b6f2f7f9
- Move get output elimination pass prior to liveness analysis (#649) · 8fc9dd69
  Jayaram Bobba authored 6 years ago
  
  8fc9dd69
- fixed conv+bias pattern match causing mxnet tests to fail. (#647) · 19899f4d
  Louis Feng authored 6 years ago
  
  19899f4d
- add compile benchmark (#635) · 3ea55bb5
  Robert Kimball authored 6 years ago
```
* add compile benchmark

* add help when error
```
  3ea55bb5
14 Mar, 2018 5 commits

GetOutputElement Elimination (#644) · f10022cc

Nick Korovaiko authored 6 years ago

* rough draft but needs to use get_n to get the right input

* v2 fully working but hacky

* remove hacks; switch back build_users() to users()

* rollback hacks to node.cpp

* perms, remove prints, format

f10022cc

Added op::Relu and op::Not to GPU transformer and enabled corresponding tests (#641) · e7abc0f3
Chris Sullivan authored 6 years ago
```
* Added op::Relu and op::Not and enabled corresponding tests.

* Removed softmax for now.
```
e7abc0f3
gpu add onehot op (#638) · a86a9050
Fenglei authored 6 years ago
```
* add onehot op

* refactor broadcast and onehot op
```
a86a9050

Add missing cudaFree to emitted pool_base_ptr memory buffer (#640) · 9d89ffb9

Chris Sullivan authored 6 years ago

* Added corresponding cudaFree to the cudaMalloc for the cuda pool_base_ptr memory buffer.

* Check for temporary buffer allocation prior to freeing. Add null check on cudaFree.

9d89ffb9

Yet another serialization option (#619) · 28602f31

Robert Kimball authored 6 years ago

* Add cpio file read/write class and unit tests

* Add reserializer

* Add unit test for serializing constants to a cpio file. Fix bug in serializer if function has no parameters.

28602f31

13 Mar, 2018 6 commits
- Optimize batchnorm backprop by propagating input layouts instead of delta layouts · 1bc38b12
  Jayaram Bobba authored 6 years ago
  
  1bc38b12
- make stuff const (#625) · 6a1b2ee2
  Robert Kimball authored 6 years ago
  
  6a1b2ee2
- Cleanup the output descriptor of node arguments (#639) · 64a105db
  Jayaram Bobba authored 6 years ago
  
  64a105db
- GPU elementwise emitters now respect input and output tensor types. (#633) · d3ea93e2
  Chris Sullivan authored 6 years ago
```
* GPU elementwise emitters now respect input and output tensor types.
This enables the use of binary comparison ops and op::Convert.

* Removed comments.

* All kernels now have type signature
even if the i/o tensors are equivalent type so that
kernels for specific type tensors are unique.

NGMX-391 #close 
```
  d3ea93e2
- Fix bn constructor (#631) · 429eae9a
  Pruthvi authored 6 years ago
```
* Fix bn constructor
    - assert if gamma or beta don't have rank 1
    - remove redundant checks

* Added guards to check that the input and delta shapes passed to the mkldnn bn fprop and bprop ops have a rank of 4
```
  429eae9a
- Updated gpu cpp files with consistent use of namespaces (cosmetic) (#629) · b5467550
  Chris Sullivan authored 6 years ago
```
* Updated namespace use in cpp files.
```
  b5467550