- 05 Oct, 2018 9 commits
-
Robert Kimball authored
* address Klocwork issue
* move class init
* more Klocwork fixes
* comment on where the magic number is from
* address review comments
-
Chris Sullivan authored
* Add op::Sigmoid to nvgpu.
* Bring rnn fusion and concat passes over into GPU from IA. This is a temporary move until generalization and gpu specification can occur.
* Add LSTM fusion and cudnn inference kernel. Next need recurrent fusion and layer fusion.
* Formatting.
* Removed unnecessary extra output from LSTM op (rnn with seq. length = 1, so y = hy).
* Add RNN fusion of LSTM cells within a recurrent layer.
* Formatting.
* Add fusion across RNN layers.
* Formatting.
* Add algebraic simplification.
* Added rnn fusion tests.
* Updated conditional on LSTM fusion to better distinguish bound nodes as ht vs. xt.
* Formatting.
* Removed print statements.
* Formatting.
* Committing missing file.
* Remove concat inputs pass and mkldnn references.
* Fix cmake paths.
* Conflict resolution with merge from master.
* Remove explicit lstm op support; bare LSTM ops are converted to RNN ops for emission.
* Formatting.
* Use NGRAPH_ASSERT. Formatting of Intel copyright.
* Add check on the feature size (shape) of the recurrent (hidden) input and cell state, to ensure they are the same size.
* Fix wrong rnn header.
* Formatting.
* Add back lstm op to dispatch table.
* Added RNN test which shows the cudnn rnn kernel is not producing correct results.
* With the update to algebraic simplification to simplify concat-reshape-slice, the check modified in this commit needed to be relaxed.
* Bug fix in parameter tensor packing.
* Alias third output element of RNN for cell state (bug fix).
* Resolve numerical correctness issue with negative values in RNN (bug fix). Add minimal test to evaluate LSTM and compare with values calculated by hand.
* Add tensor parameter sizes to kernel hash as they are kernel-specific.
* Add 2-layer lstm fusion test against by-hand solution.
* Export param concatenation to graph for the cudnn kernel at both the single rnn layer and multi-layer level.
* Formatting.
* Finishing touches after merge: add support for macro-expanded dispatch via op_tbl.
* Simplify macro support for gpu ops.
* Add CUDNN_VERSION >= 7200 defguards for RNN fusion. Need to decide how to notify the user of increased performance with >= 7200.
* Revert lstm_analytic test to explicitly copy data to tensor params.
* Removed namespace arg from NGRAPH_GPU_OP.
* Refactored macros to a different header so op_tbl only contains the op list.
* Defguard on cudnn_descriptor<cudnnRNNDataDescriptor_t>.
* doubles -> floats
* Reorganize pass asserts; prepare to replace with non-throwing pass failures.
* Remove Lstm op and replace it with Rnn.
* Format.
* Utilize RETURN_IF_FALSE in rnn pass to avoid any RT asserts. Note that falling back to the raw (no passes) graph for 2rnn_3lstm json from mxnet models results in a double free inside of the memory layout pass. Appears to be a bug in Reshape pass-through.
* Removed print statements. Add check on input data and recurrent data.
* Don't reuse memory for non-destructive ops.
* Add back Rnn test.
* Formatting.
* Clean up comments.
* Update test per review comments.
-
Adam Procter authored
* Add some asserts to make sure we don't overshoot certain iterators in the reference kernels
* Add missing assertion.hpp include
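The iterator guards described above can be sketched as follows. This is an illustrative pattern, not the actual nGraph reference-kernel code; `checked_advance` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before advancing an iterator by n, assert the step stays within the
// container, so an out-of-bounds walk fails loudly in debug builds
// instead of reading past the end.
template <typename Iter>
Iter checked_advance(Iter it, Iter end, std::ptrdiff_t n)
{
    assert(n >= 0 && n <= end - it && "iterator would overshoot the end");
    return it + n;
}
```

In release builds the `assert` compiles away, so the guard costs nothing on the hot path.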
-
dmyershov authored
IntelGPU backend: Broadcast bug fix: `output_shape.at(0) == 1` does not mean the tensor is a scalar (#1754)
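The distinction behind this fix can be shown in a minimal sketch (the `Shape` alias and `is_scalar` helper here are illustrative, not the backend's actual code): a scalar is a rank-0 tensor, whereas a shape whose first dimension is 1 may still have further dimensions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// A true scalar has rank 0 (an empty shape). Testing only
// shape.at(0) == 1 wrongly classifies tensors like {1, 3} as scalars.
bool is_scalar(const Shape& shape)
{
    return shape.empty();
}
```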
-
Chris Sullivan authored
* global stats fix
* Formatting.
-
Robert Kimball authored
* address Klocwork number overflow issue
* one more issue
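Overflow findings of the kind static analyzers such as Klocwork report are typically fixed by checking before the arithmetic rather than after. A minimal sketch of that pattern (the `checked_mul` helper is hypothetical, not from the nGraph codebase):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Multiply two unsigned 64-bit values, reporting failure instead of
// silently wrapping around: the check is done before the multiply,
// since testing the product afterwards is already too late.
bool checked_mul(uint64_t a, uint64_t b, uint64_t& out)
{
    if (b != 0 && a > std::numeric_limits<uint64_t>::max() / b)
    {
        return false; // would overflow
    }
    out = a * b;
    return true;
}
```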
-
Robert Kimball authored
-
Robert Kimball authored
-
Adam Procter authored
* Adapt Tensor class to have partial shapes
* Add PartialShapes to Input, Output, Function, Node classes
* Terminological cleanup
-
- 04 Oct, 2018 8 commits
-
Nishant Patel authored
* Add conv+bias
* Add test case for QuantizedConv2DWithBiasAndRelu and address feedback
-
Robert Kimball authored
-
Fenglei authored
* add a test that fails on gpu but passes on cpu
* fixed bug
* get datatype size
* add description for test
* update comment
* update comments and name
-
Nick Korovaiko authored
* show types in visualize_tree
* fix a warning
* address Bob's feedback
-
Robert Kimball authored
-
Pruthvi authored
-
Fenglei authored
-
Nick Korovaiko authored
-
- 03 Oct, 2018 3 commits
-
L.S. Cook authored
* add doctools JS from the Sphinx basic theme repo
* fixes from RTD theme PR 672 regarding the Sphinx build
-
shssf authored
* IntelGPU backend: Operation Reduce implemented
* PR1736. Style fixed
-
Ayan Moitra authored
* cublas emitter
* clang-format fixes
* Initial comment incorporation from Chris
* Chris's if-else change comment incorporation
* Incorporating Bob's comments, phase 1
* Remove unnecessary headers in cublas emitter hpp & cpp (per Bob's comments)
* clang-format on previous commit
* Incorporate Fenglei's refactoring comment
* Incorporating comments
* Incorporate Chris's final comment
* All comments resolved
* Resolve Geoff's comments
* Change cache_primitive to register_primitive
-
- 02 Oct, 2018 9 commits
-
shssf authored
-
Robert Kimball authored
-
Adam Procter authored
Partial Shapes, Part 1: Classes for partially known shapes, possibly unknown dimensions
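The idea of partially known shapes can be sketched with a minimal dimension class: a dimension is either a concrete value or dynamic (unknown until runtime). This `Dimension` is an illustrative mock of the concept, not nGraph's actual class.

```cpp
#include <cassert>
#include <cstdint>

// A dimension that is either statically known or dynamic (unknown).
class Dimension
{
public:
    Dimension() : m_dynamic(true), m_value(0) {}                   // unknown
    Dimension(int64_t value) : m_dynamic(false), m_value(value) {} // known
    bool is_static() const { return !m_dynamic; }
    int64_t value() const { return m_value; }
    // Two dimensions are compatible if either is dynamic, or both are
    // static and agree on the value.
    bool compatible(const Dimension& other) const
    {
        return m_dynamic || other.m_dynamic || m_value == other.m_value;
    }

private:
    bool m_dynamic;
    int64_t m_value;
};
```

A partial shape would then be a rank (possibly unknown) plus a vector of such dimensions, with shape compatibility checked element-wise.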
-
Adam Procter authored
-
Adam Procter authored
Pretty sure at this point that I was reading the docs correctly.
-
Adam Procter authored
-
Adam Procter authored
-
Pruthvi authored
* WIP input/weights rnn optimization
* concat + slicing + replacing new node works
* WIP unit test case of fusing rnn inputs
* Added unit test case for fusing rnn input weights; registered CPURnnMatFusion_v1/v2 in codegen and DEX
* fixed redeclaration of a variable
* Refactored rnn input transformation passes into a single pass
* Refactored CPURnnMatFusion callback functions
* change random generator range to include negative values in unit test
* address PR comments
* don't fuse if the shapes of the data slices don't match
-
Adam Procter authored
-
- 01 Oct, 2018 7 commits
-
Robert Kimball authored
* rename GPU_TensorView to GPUTensor and GPUTensorViewWrapper to GPUTensorWrapper
* undo bad search/replace
* revert change
-
Adam Procter authored
-
Robert Kimball authored
* cleanup
* cleanup header includes
* cleanup TensorMemoryReservation pass
* more header cleanup
* style
* Remove obsolete comments
-
Fenglei authored
-
Adam Procter authored
-
Scott Cyphers authored
* Sigmoid and tanh doc, edit for abs
* Add equation for sigmoid.
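For reference, the sigmoid and tanh referred to above are the standard definitions:

```latex
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = 2\sigma(2x) - 1
```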
-
Adam Procter authored
* Add CODEOWNERS file (will have no effect until enabled in GitHub settings)
* Review comments, and fix a username
* Tabs -> spaces
* Review comments
* /maint/ to @cconvey
* /maint/ back to @diyessi by default
-
- 30 Sep, 2018 1 commit
-
Robert Kimball authored
-
- 29 Sep, 2018 2 commits
-
Robert Kimball authored
* rename files
* rename runtime TensorView to Tensor
* rename HostTensorView to HostTensor
-
Robert Kimball authored
* Address deferred comments from PR 1676
* use dynamic pointer cast for added error checking
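The error-checking benefit of `std::dynamic_pointer_cast` can be shown with a small sketch (the `Node`/`Add` hierarchy here is a toy stand-in, not nGraph's classes): a failed downcast yields `nullptr` that can be tested, whereas `static_pointer_cast` would silently produce an invalid pointer.

```cpp
#include <cassert>
#include <memory>

struct Node { virtual ~Node() = default; };
struct Add : Node {};
struct Multiply : Node {};

// dynamic_pointer_cast returns an empty shared_ptr when the pointee is
// not actually an Add, so the mismatch is detectable at runtime.
bool is_add(const std::shared_ptr<Node>& n)
{
    return std::dynamic_pointer_cast<Add>(n) != nullptr;
}
```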
-
- 28 Sep, 2018 1 commit
-
Nick Korovaiko authored
* set_output_size fix
* add assert
* don't run get_loop_kernels twice
-