  1. 12 Oct, 2018 (5 commits)
  2. 11 Oct, 2018 (3 commits)
  3. 10 Oct, 2018 (3 commits)
    • add back missing part (#1785) · a41c1baa
      Fenglei authored
    • nvgpu one hot update (#1773) · 6cd35432
      Fenglei authored
      * update onehot
      
      * clang
      
      * fix bugs
      
      * format
      
      * add output_datatype_size to hash
      
      * typo
      
      * hash
    • Reshape Sinking (#1701) · f642bc4c
      Nick Korovaiko authored
      * reshape sinking working on mnist_conv
      
      * forgot to add reshape_sinking files
      
      * refactoring of binary case
      
      * Quantize/Dequantize case, fix add case, add assert
      
      * address Bob and Scott's feedback
      
      * debug
      
      * fix a bug where reshapes are removed too early
  4. 09 Oct, 2018 (4 commits)
  5. 08 Oct, 2018 (5 commits)
  6. 06 Oct, 2018 (2 commits)
  7. 05 Oct, 2018 (13 commits)
    • Support LRN for NVGPU Backend (#1740) · fe06f325
      gcwenger authored
      * LRN WIP
      
      * Explicit lambda captures.
      
      * Switched to Ayan's new caching routine.
      
      * Remove commented out lrn from manifest.
      
      * Fixed clang 3.9 error.
      
      * Corrected lrn hash. Only call cudnnSetLRNDescriptor once.
      
      * Simplified lrn hash. Removed redundant parameters. No longer passing CUDNN_LRN_CROSS_CHANNEL_DIM1 as a parameter because it is the only choice for cudnnLRNCrossChannelForward (see the sketch below).
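      The last bullet relies on a cuDNN detail: cudnnLRNCrossChannelForward takes a cudnnLRNMode_t argument, but CUDNN_LRN_CROSS_CHANNEL_DIM1 is the only value that enum defines, so the mode contributes nothing to the kernel hash. A minimal sketch of the call sequence, assuming the handle and tensor descriptors already exist (illustrative only, not nGraph's emitter code):

        #include <cudnn.h>

        // One-shot LRN forward; a backend would configure the descriptor once and
        // reuse it across launches ("Only call cudnnSetLRNDescriptor once" above).
        void lrn_forward_sketch(cudnnHandle_t handle,
                                cudnnTensorDescriptor_t xDesc, const void* x,
                                cudnnTensorDescriptor_t yDesc, void* y)
        {
            cudnnLRNDescriptor_t lrnDesc;
            cudnnCreateLRNDescriptor(&lrnDesc);
            // Example LRN parameters: window 5, alpha 1e-4, beta 0.75, bias 2.0.
            cudnnSetLRNDescriptor(lrnDesc, 5, 1e-4, 0.75, 2.0);

            const float one = 1.0f, zero = 0.0f;
            // CUDNN_LRN_CROSS_CHANNEL_DIM1 is the sole cudnnLRNMode_t value, so it
            // is hard-coded rather than treated as a tunable kernel parameter.
            cudnnLRNCrossChannelForward(handle, lrnDesc, CUDNN_LRN_CROSS_CHANNEL_DIM1,
                                        &one, xDesc, x, &zero, yDesc, y);

            cudnnDestroyLRNDescriptor(lrnDesc);
        }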
    • CPU: Make DEX mode the default (#1755) · c8858ef2
      Jaikrishnan Menon authored
    • Cyphers/doc1 (#1758) · 0e6c9c26
      Scott Cyphers authored
      * More op doc, fix formatting
      
      * sqrt, tan
      
      * Formatting.
    • address klocwork issue (#1748) · 0920ed1c
      Robert Kimball authored
    • Changes to make Klocwork a little happier (#1739) · 15da6cfe
      Robert Kimball authored
      * address klocwork issue
      
      * move class init
      
      * more klocwork
      
      * more klocwork
      
      * more klocwork
      
      * comment on where the magic number is from
      
      * address review comments
      
      * address review comments
    • RNN fusion (inference) (#1459) · 4df5ea8b
      Chris Sullivan authored
      * Add op::Sigmoid to nvgpu.
      
      * Bring rnn fusion and concat passes over into GPU from IA. This is a temporary move until generalization and gpu specification can occur.
      
      * Add LSTM fusion and cudnn inference kernel. Next need recurrent fusion and layer fusion.
      
      * Formatting
      
      * Removed unnecessary extra output from LSTM op (rnn with seq. length = 1, so y = hy).
      
      * Add RNN fusion of LSTM cells within a recurrent layer.
      
      * Formatting.
      
      * Add fusion across RNN layers.
      
      * Formatting.
      
      * Add algebraic simplification.
      
      * Added rnn fusion tests.
      
      * Updated conditional on LSTM fusion to better distinguish bound nodes as ht vs xt.
      
      * Formatting.
      
      * Removed print statements.
      
      * Formatting.
      
      * Committing missing file.
      
      * Remove concat inputs pass and mkldnn references.
      
      * fix cmake paths
      
      * conflict resolution with merge from master.
      
      * remove explicit lstm op support. bare LSTM ops are converted to RNN ops for emission.
      
      * Formatting.
      
      * Use NGRAPH_ASSERT. Formatting of intel copyright.
      
      * Add check on the feature size (shape) of the recurrent (hidden) input and cell state, to ensure they are the same size.
      
      * fix wrong rnn header
      
      * Formatting.
      
      * Add back lstm op to dispatch table.
      
      * Added RNN test which shows cudnn rnn kernel is not producing correct results.
      
      * With the update to AlgSimpl. to simplify concat-reshape-slice, the check modified in this commit needed to be relaxed.
      
      * Bug fix in parameter tensor packing.
      
      * Alias third output element of RNN for cell state (bug fix).
      
      * Resolve numerical correctness issue with negative values in RNN (bug fix).
      Add minimal test to evaluate LSTM and compare with values calculated by hand.
      
      * Add tensor parameter sizes to kernel hash as
      they are kernel-specific.
      
      * Add 2 layer lstm fusion test against by-hand solution.
      
      * Export param concatenation to the graph for the cudnn kernel at both the single rnn layer and multi-layer levels.
      
      * Formatting.
      
      * Finishing touches after merge: add support for macro-expanded dispatch via op_tbl.
      
      * Simplify macro support for gpu ops.
      
      * Add CUDNN_VERSION >= 7200 defguards for RNN fusion (a sketch of the guard pattern follows this entry).
      Need to decide how to notify the user of the increased performance with >= 7200.
      
      * Revert lstm_analytic test to explicitly copy data to tensor params.
      
      * Removed namespace arg from NGRAPH_GPU_OP.
      
      * Refactored macros to different header so op_tbl only contains op list.
      
      * Defguard on cudnn_descriptor<cudnnRNNDataDescriptor_t>.
      
      * doubles -> floats
      
      * Reorg. pass asserts, prepare to replace with non-throwing pass failures.
      
      * Remove Lstm op and replace it with Rnn.
      
      * Format
      
      * Utilize RETURN_IF_FALSE in rnn pass to avoid any RT asserts.
      Note that falling back to the raw (no passes) graph for the 2rnn_3lstm json from mxnet models
      results in a double free inside the memory layout pass. This appears to be a bug
      in the Reshape pass-through.
      
      * Removed print statements. Add check on input data and recurrent data.
      
      * Don't reuse memory for non-destructive ops.
      
      * Add back Rnn test.
      
      * Formatting.
      
      * Clean up comments.
      
      * Update test per review comments.
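      Two of the bullets above mention CUDN N_VERSION >= 7200 defguards and cudnnRNNDataDescriptor_t. A hedged sketch of that guard pattern (the constant name below is illustrative, not an nGraph symbol):

        #include <cudnn.h>

        // cudnn.h defines CUDNN_VERSION as major*1000 + minor*100 + patch, so 7200
        // corresponds to cuDNN 7.2. Per the commit message, the fused RNN path
        // (which uses cudnnRNNDataDescriptor_t) is guarded behind this check.
        #if CUDNN_VERSION >= 7200
        constexpr bool rnn_fusion_compiled_in = true;
        #else
        constexpr bool rnn_fusion_compiled_in = false;
        #endif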
    • Add asserts to reference to make sure we don't overshoot iterators (#1757) · f04503b6
      Adam Procter authored
      * Add some asserts to make sure we don't overshoot certain iterators in the reference kernels
      
      * Add missing assertion.hpp include
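      A generic illustration of the kind of bounds check this commit describes; a plain assert is used here, the commit itself pulls in assertion.hpp, and the copy_window kernel is made up:

        #include <cassert>
        #include <cstddef>
        #include <vector>

        // Hypothetical reference-style kernel: copy 'count' elements starting at 'start'.
        void copy_window(const std::vector<float>& in, std::vector<float>& out,
                         std::size_t start, std::size_t count)
        {
            // Guard against walking past the end of either buffer before dereferencing.
            assert(start + count <= in.size());
            assert(count <= out.size());
            for (std::size_t i = 0; i < count; ++i)
            {
                out[i] = in[start + i];
            }
        }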
    • IntelGPU backend: Broadcast bug fix: (output_shape.at(0) == 1) doesn't mean that it is scalar (#1754) · d9dfaeb8
      dmyershov authored
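      The point of the fix in isolation (illustrative helper, not IntelGPU backend code): a scalar is a rank-0 shape, so a test on output_shape.at(0) == 1 also matches non-scalar shapes such as {1} or {1, 3}.

        #include <cstddef>
        #include <vector>

        using ShapeSketch = std::vector<std::size_t>;

        inline bool is_scalar(const ShapeSketch& output_shape)
        {
            // Correct test: a scalar has rank 0 (no dimensions at all),
            // not merely a first dimension equal to 1.
            return output_shape.empty();
        }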
    • Properly support global stats in BN (#1753) · adb38ab4
      Chris Sullivan authored
      * global stats fix
      
      * Formatting.
    • address klocwork number overflow issue (#1751) · 3d21f6ed
      Robert Kimball authored
      * address klocwork number overflow issue
      
      * one more issue
    • address klocwork issues (#1750) · be0a9f03
      Robert Kimball authored
    • address klocwork issue (#1747) · 9f26b7e9
      Robert Kimball authored
    • Partial Shapes, Part 2: Adapt Tensor class to have partial shapes (#1718) · a0be5231
      Adam Procter authored
      * Adapt Tensor class to have partial shapes
      
      * Add PartialShapes to Input, Output, Function, Node classes
      
      * Terminological cleanup
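      A minimal illustration of the partial-shape idea this series introduces; a sketch only, nGraph's actual partial-shape classes are richer. Each dimension is either statically known or left dynamic until runtime:

        #include <cstdint>
        #include <vector>

        // One dimension of a partial shape: either a fixed extent or dynamic/unknown.
        struct DimSketch
        {
            bool is_static;
            std::int64_t extent; // meaningful only when is_static is true
        };

        using PartialShapeSketch = std::vector<DimSketch>;

        // Example: rank-3 shape {?, 3, 224} whose batch dimension is not yet known.
        inline PartialShapeSketch example_shape()
        {
            return {{false, 0}, {true, 3}, {true, 224}};
        }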
  8. 04 Oct, 2018 (5 commits)